This documentation provides information on how to deploy and operate a
Mirantis OpenStack for Kubernetes (MOSK) environment.
The documentation is intended to help operators understand the core
concepts of the product and provides sufficient information to deploy
and operate the solution.
The information provided in this documentation set is constantly being
improved and amended based on the feedback and requests from the
consumers of MOSK.
The following table lists the guides included in the documentation set you
are reading:
This documentation is intended for engineers who have basic knowledge of
Linux, virtualization and containerization technologies, the Kubernetes API and
CLI, Helm and Helm charts, Mirantis Kubernetes Engine (MKE), and OpenStack.
GUI elements that include any part of interactive user interface and
menu navigation
Superscript
Some extra, brief information
Note
The Note block
Messages of a generic meaning that may be useful for the user
Caution
The Caution block
Information that prevents a user from making mistakes and experiencing
undesirable consequences when following the procedures
Warning
The Warning block
Messages that include details that can be easily missed, but should not
be ignored by the user and are valuable before proceeding
See also
The See also block
List of references that may be helpful for understanding some related
tools, concepts, and so on
Learn more
The Learn more block
Used in the Release Notes to wrap a list of internal references to
the reference architecture, deployment and operation procedures specific
to a newly implemented product feature
Mirantis OpenStack for Kubernetes (MOSK) combines the power of
Mirantis Container Cloud for delivering and managing Kubernetes clusters, with
the industry standard OpenStack APIs, enabling you to build your own cloud
infrastructure.
The advantages of running all of the OpenStack components as a Kubernetes
application are manifold and include the following:
Zero downtime, non-disruptive updates
Fully automated Day-2 operations
Full-stack management from bare metal through the operating system and
all the necessary components
The list of the most common use cases includes:
Software-defined data center
The traditional data center requires multiple requests and interactions
to deploy new services. By abstracting the data center functionality
behind a standardized set of APIs, services can be deployed faster and
more efficiently. MOSK enables you to define all your
data center resources behind the industry-standard OpenStack APIs, allowing you
to automate the deployment of applications or simply request resources
through the UI to quickly and efficiently provision virtual machines,
storage, networking, and other resources.
Virtual Network Functions (VNFs)
VNFs require high performance systems that can be accessed on demand in
a standardized way, with assurances that they will have access to the
necessary resources and performance guarantees when needed.
MOSK provides extensive support for VNF workloads, enabling
easy access to functionality
such as Intel EPA (NUMA, CPU pinning, Huge Pages) as well as the consumption
of specialized network interface cards to support SR-IOV and DPDK.
The centralized management model of MOSK and Mirantis
Container Cloud also enables the easy management of multiple
MOSK deployments with full lifecycle management.
Legacy workload migration
With the industry moving toward cloud-native technologies, many older or
legacy applications cannot be moved easily, and often it does not
make financial sense to transform them into cloud-native
applications. MOSK provides a stable cloud platform that
can cost-effectively host legacy applications whilst still providing the
expected levels of control, customization, and uptime.
Mirantis OpenStack for Kubernetes (MOSK) is a virtualization
platform that provides an infrastructure for cloud-ready applications,
in combination with reliability and full control over the data.
MOSK combines OpenStack, an open-source cloud
infrastructure software, with application management techniques used
in the Kubernetes ecosystem that include container isolation, state
enforcement, declarative definition of deployments, and others.
MOSK integrates with Mirantis Container Cloud to rely
on its capabilities for bare-metal infrastructure provisioning, Kubernetes
cluster management, and continuous delivery of the stack components.
MOSK simplifies the work of a cloud operator by
automating all major cloud life cycle management routines including
cluster updates and upgrades.
A Mirantis OpenStack for Kubernetes (MOSK) deployment profile is a
thoroughly tested and officially supported reference architecture that is
guaranteed to work at a specific scale and is tailored to the demands of
a specific business case, such as generic IaaS cloud, Network Function
Virtualization infrastructure, Edge Computing, and others.
A deployment profile is defined as a combination of:
Services and features the cloud offers to its users.
Non-functional characteristics that users and operators should expect when
running the profile on top of a reference hardware configuration. Including,
but not limited to:
Performance characteristics, such as an average network throughput between
VMs in the same virtual network.
Reliability characteristics, such as the cloud API error response rate when
recovering a failed controller node.
Scalability characteristics, such as the total number of virtual routers
tenants can run simultaneously.
Hardware requirements - the specification of physical servers and
networking equipment required to run the profile in production.
Deployment parameters that a cloud operator can tweak within a
certain range without being afraid of breaking the cloud or losing support.
In addition, the following items may be included in a definition:
Compliance-driven technical requirements, such as TLS encryption of all
external API endpoints.
Foundation-level software components, such as Tungsten Fabric or
Open vSwitch as a backend for the networking service.
Note
Mirantis reserves the right to revise the technical implementation of any
profile at will while preserving its definition - the functional
and non-functional characteristics that operators and users are known
to rely on.
MOSK supports multiple deployment profiles
to address a wide variety of business tasks. The table below includes the
profiles for the most common use cases.
Note
Some components of a MOSK cluster are mandatory and
are installed
during the managed cluster deployment by Container Cloud regardless of the
deployment profile in use. StackLight is one of the cluster components that
are enabled by default. See Container Cloud Operations Guide
for details.
Provides the core set of services an IaaS vendor would need,
including some extra functionality. The profile is designed to
support 50-70 compute nodes and a reasonable number of
storage nodes.
The core set of services provided by the profile includes:
Compute (Nova)
Images (Glance)
Networking (Neutron with Open vSwitch as a backend)
The HelmBundle Operator is the realization of the Kubernetes Operator
pattern that provides a Kubernetes custom resource of the HelmBundle
kind and code running inside a pod in Kubernetes. This code handles changes,
such as creation, update, and deletion, in the Kubernetes resources of this
kind by deploying, updating, and deleting groups of Helm releases from
specified Helm charts with specified values.
The OpenStack platform manages virtual infrastructure resources, including
virtual servers, storage devices, networks, and networking services, such as
load balancers, as well as provides management functions to the tenant users.
Various OpenStack services are running as pods in Kubernetes and are
represented as appropriate native Kubernetes resources, such as
Deployments, StatefulSets, and DaemonSets.
For a simple, resilient, and flexible deployment of OpenStack and related
services on top of a Kubernetes cluster, MOSK uses
OpenStack-Helm that provides a required collection of the Helm charts.
Also, MOSK uses OpenStack Controller (Rockoon) as the
realization of the Kubernetes Operator pattern. Rockoon provides a custom
Kubernetes resource of the OpenStackDeployment kind and code running
inside a pod in Kubernetes. This code handles changes such as creation,
update, and deletion in the Kubernetes resources of this kind by
deploying, updating, and deleting groups of the Helm releases.
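For illustration, the spec section of an OpenStackDeployment object that Rockoon consumes has, in broad strokes, the following shape. This is a sketch only: the values are illustrative and the full schema is described in the reference for your MOSK version.

spec:
  openstack_version: antelope          # illustrative OpenStack release
  region_name: RegionOne               # see the Identity service section for details
  features:
    neutron:
      tunnel_interface: bond0          # illustrative NIC name
    nova:
      live_migration_interface: bond0
      vcpu_type: host-model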
Ceph is a distributed storage platform that provides storage resources,
such as objects and virtual block devices, to virtual and physical
infrastructure.
MOSK uses Rook as the implementation of the
Kubernetes Operator pattern that manages resources of the CephCluster
kind to deploy and
manage Ceph services as pods on top of Kubernetes to provide Ceph-based
storage to the consumers, which include OpenStack services, such as Volume
and Image services, and underlying Kubernetes through Ceph CSI (Container
Storage Interface).
The Ceph Controller is the implementation of the Kubernetes Operator
pattern, that manages resources of the MiraCeph kind to simplify
management of the Rook-based Ceph clusters.
The StackLight component is responsible for collection, analysis, and
visualization of critical monitoring data from physical and virtual
infrastructure, as well as alerting and error notifications through
a configured communication system, such as email. StackLight includes
the following key sub-components:
This section provides hardware requirements for the Mirantis Container
Cloud management cluster with a managed Mirantis OpenStack for Kubernetes
(MOSK) cluster.
For installing MOSK, the Mirantis Container Cloud management
cluster and managed cluster must be deployed with the bare metal provider.
Important
A MOSK cluster is to be used for a
deployment of an OpenStack cluster and its components. Deployment of
third-party workloads on a MOSK cluster is neither
allowed nor supported.
Note
One of the industry best practices is to verify every new update or
configuration change in a non-customer-facing environment before
applying it to production. Therefore, Mirantis recommends
having a staging cloud, deployed and maintained along with the production
clouds. The recommendation is especially applicable to the environments
that:
Receive updates often and use continuous delivery. For example,
any non-isolated deployment of Mirantis Container Cloud.
Have significant deviations from the reference architecture or
third party extensions installed.
Are managed under the Mirantis OpsCare program.
Run business-critical workloads where even the slightest application
downtime is unacceptable.
A typical staging cloud is a complete copy of the production environment
including the hardware and software configurations, but with a bare minimum
of compute and storage capacity.
The table below describes the node types the MOSK reference
architecture includes.
The Container Cloud management cluster architecture on bare metal
requires three physical servers for manager nodes. On these hosts,
we deploy a Kubernetes cluster with services that provide Container
Cloud control plane functions.
OpenStack control plane node and StackLight node
Host the OpenStack control plane services such as the database, messaging, API,
schedulers, conductors, and L3 and L2 agents, as well as the StackLight
components.
Note
MOSK enables the cloud operator to
collocate the OpenStack control plane with the managed cluster master
nodes on OpenStack deployments of a small size. This capability
is available as technical preview. Use such a configuration for testing
and evaluation purposes only.
Tenant gateway node
Optional. Hosts the OpenStack gateway services, including the L2, L3, and DHCP
agents. The tenant gateway nodes can be combined with the OpenStack control
plane nodes. The strict requirement is a dedicated physical network
(bond) for tenant network traffic.
Tungsten Fabric control plane node
Required only if Tungsten Fabric is enabled as a backend for the
OpenStack networking. These nodes host the TF control plane services
such as Cassandra database, messaging, API, control, and configuration
services.
Tungsten Fabric analytics node
Required only if Tungsten Fabric is enabled as a backend for the
OpenStack networking. These nodes host the TF analytics services
such as Cassandra, ZooKeeper, and collector.
Compute node
Hosts the OpenStack Compute services such as QEMU, L2 agents, and
others.
Infrastructure nodes
Run the underlying Kubernetes cluster management services.
The MOSK reference configuration requires a minimum of
three infrastructure nodes.
The table below specifies the hardware resources the MOSK
reference architecture recommends for each node type.
The exact hardware specifications and number of the control plane
and gateway nodes depend on a cloud configuration and scaling needs.
For example, for the clouds with more than 12,000 Neutron ports, Mirantis
recommends increasing the number of gateway nodes.
TF control plane and analytics nodes can be combined on the same hardware
hosts with a respective addition of RAM, CPU, and disk space. However,
Mirantis does not recommend such a configuration for production environments
because it increases the risk of cluster downtime if one of the nodes
unexpectedly fails.
A Ceph cluster with 3 Ceph nodes does not provide hardware fault
tolerance and is not eligible for recovery operations,
such as a disk or an entire node replacement. Therefore, a minimum of
5 Ceph nodes is recommended for production use.
A Ceph cluster uses the replication factor that equals 3.
If the number of Ceph OSDs is less than 3, a Ceph cluster moves
to the degraded state with the write operations restriction until
the number of alive Ceph OSDs equals the replication factor again.
If you would like to evaluate the MOSK
capabilities and do not have much hardware at your disposal,
you can deploy it in a virtual environment, for example, on
top of another OpenStack cloud using the sample Heat templates.
Note that the tooling is provided for reference only and is not
part of the product itself. Mirantis does not guarantee its
interoperability with any MOSK version.
The management cluster requires a minimum of two storage devices per node.
Each device is used for a different type of storage:
One storage device for boot partitions and the root file system.
An SSD is recommended. A RAID device is not supported.
One storage device per server is reserved for local persistent
volumes. These volumes are served by the Local Storage Static Provisioner
(local-volume-provisioner) and used by many services of Mirantis
Container Cloud.
The seed node is only necessary to deploy the management cluster.
When the bootstrap is complete, the bootstrap node can be
discarded and added back to the MOSK cluster as a node of
any type.
The minimum reference system requirements for a baremetal-based bootstrap
seed node are as follows:
Basic Ubuntu 18.04 server with the following configuration:
Kernel of version 4.15.0-76.86 or later
8 GB of RAM
4 CPU
10 GB of free disk space for the bootstrap cluster cache
No DHCP or TFTP servers on any NIC networks
Routable access to the IPMI network of the hardware servers
Internet access for downloading all required artifacts
If you use a firewall or proxy, make sure that the bootstrap and management
clusters have access to the following IP ranges and domain names:
IP ranges:
Microsoft Azure
(only IP addresses for MicrosoftContainerRegistry)
Amazon AWS
(only IP addresses for "service":"CLOUDFRONT")
MOSK uses Kubernetes labels to place components onto hosts.
For the default locations of components, see MOSK cluster hardware requirements. Additionally,
MOSK supports component collocation. This is mostly useful
for OpenStack compute and Ceph nodes. For component collocation, consider
the following recommendations:
When calculating hardware requirements for nodes, consider the requirements
for all collocated components.
When performing maintenance on a node with collocated components, execute the
maintenance plan for all of them.
When combining other services with the OpenStack compute host, verify that
the reserved_host_* settings are increased according to the needs of the
collocated components by using node-specific overrides for the Compute
service, as sketched below.
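For example, a node-specific override of the reserved host resources might be shaped as follows. This is a sketch only: the exact location of the override in the OpenStackDeployment custom resource and the node label are assumptions for illustration, while reserved_host_memory_mb and reserved_host_cpus are standard Nova options.

spec:
  nodes:
    openstack-compute-node::enabled:      # label of the collocated nodes, illustrative
      services:
        compute:
          nova:
            values:
              conf:
                nova:
                  DEFAULT:
                    reserved_host_memory_mb: 16384
                    reserved_host_cpus: 4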
MetalLB exposes external IP addresses of cluster services to access
applications in a Kubernetes cluster.
DNS
The Kubernetes Ingress NGINX controller is used to expose OpenStack
services outside of a Kubernetes deployment. Access to the Ingress
services is allowed only by their FQDNs. Therefore, DNS is a mandatory
infrastructure service for an OpenStack on Kubernetes deployment.
To keep the operating system on a bare metal host up to date with the latest
security updates, the operating system requires periodic software
package upgrades that may or may not require a host reboot.
Mirantis Container Cloud uses life cycle management tools to update
the operating system packages on the bare metal hosts.
In a management cluster, software package upgrade and host restart are
applied automatically when a new Container Cloud version
with available kernel or software packages upgrade is released.
In a managed cluster, package upgrade and host restart are applied as part of
usual cluster update, when applicable. To start planning the maintenance
window and proceed with the managed cluster update, see Cluster update.
Operating system upgrade and host restart are applied to cluster
nodes one by one. If Ceph is installed in the cluster, the Container
Cloud orchestration securely pauses the Ceph OSDs on the node before
restart. This allows avoiding degradation of the storage service.
Each section below is dedicated to a particular service provided by
MOSK. They contain configuration details and usage
samples of supported capabilities provided through the custom resources.
Mirantis OpenStack for Kubernetes (MOSK) provides instances management
capability through the Compute service (OpenStack Nova). The Compute service
interacts with other OpenStack components of an OpenStack environment to
provide life-cycle management of the virtual machine instances.
The Compute service (OpenStack Nova) enables you to spawn instances that can
collectively consume more resources than what is physically available on a
compute node through resource oversubscription, also known as overcommit
or allocation ratio.
Resources available for oversubscription on a compute node include the number
of CPUs, amount of RAM, and amount of available disk space. When making a
scheduling decision, the scheduler of the Compute service takes into account
the actual amount of resources multiplied by the allocation ratio. Thereby,
the service allocates resources based on the assumption that not all instances
will be using their full allocation of resources at the same time.
Oversubscription enables you to increase the density of workloads and compute
resource utilization and, thus, achieve better Return on Investment (ROI) on
compute hardware. In addition, oversubscription can also help avoid the need
to create too many fine-grained flavors, which is commonly known as
flavor explosion.
There are two ways to control the oversubscription values for compute
nodes:
The legacy approach entails utilizing the
{cpu,disk,ram}_allocation_ratio configuration options offered by the
Compute service. A drawback of this method is that restarting the Compute
service is mandatory to apply the new configuration. This introduces the
risk of possible interruptions of cloud user operations, for example,
instance build failures.
The modern and recommended approach, adopted in MOSK
23.1, involves using the initial_{cpu,disk,ram}_allocation_ratio
configuration options, which are employed exclusively during the initial
provisioning of a compute node. This may occur during the initial deployment
of the cluster or when new compute nodes are added subsequently. Any further
alterations can be performed dynamically using the OpenStack Placement
service API without necessitating the restart of the service.
There is no definitive method for selecting optimal oversubscription values.
As a cloud operator, you should continuously monitor your workloads, ideally
have a comprehensive understanding of their nature, and experimentally
determine the maximum values that do not impact performance. This approach
ensures maximum workload density and cloud resource utilization.
To configure the initial compute resource oversubscription in
MOSK, specify the spec:features:nova:allocation_ratios
parameter in the OpenStackDeployment custom resource as explained in the
table below.
Changing the resource oversubscription configuration through the
OpenStackDeployment resource after cloud deployment will only
affect the newly added compute nodes and will not change
oversubscription for already existing compute nodes.
To change oversubscription for already existing compute nodes, use the
placement service API as described in Change oversubscription settings for existing compute nodes.
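For example, a configuration that sets cluster-wide defaults and a less intense oversubscription for a group of nodes might look as follows. This is a sketch: the ratio values are illustrative, and the key names under allocation_ratios mirror the {cpu,disk,ram} options named above, so verify them against the reference for your MOSK version.

spec:
  features:
    nova:
      allocation_ratios:
        cpu: 8.0
        ram: 1.0
        disk: 1.6
  nodes:
    compute-type::hi-perf:
      features:
        nova:
          allocation_ratios:
            cpu: 2.0       # less intense CPU oversubscription
            disk: 1.0      # no oversubscription on disk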
In the example configuration above, the compute nodes labeled with the
compute-type=hi-perf label will use less intense oversubscription
on CPU and no oversubscription on disk.
When using oversubscription, it is important to conduct thorough cloud
management and monitoring to avoid system overloading and performance
degradation. If many or all instances on a compute node start using all
allocated resources at once and, thereby, overconsume physical resources,
failure scenarios depend on the resource being exhausted.
CPU
Workloads get slower as they actively compete for physical CPU
usage. A useful indicator is the steal time as reported inside the
workload, which is the percentage of time the operating system in the
workload is waiting for an actual physical CPU core to become available
to run instructions.
To verify the steal time in the Linux-based workload, use the
top command:
top -bn1 | head | grep st$ | awk -F',' '{print $NF}'
Generally, steal times of >10 for 20-30 minutes are considered
alarming.
RAM
The operating system on the compute node starts to aggressively use physical
swap space, which significantly slows the workloads down. Sometimes, when
the swap is also exhausted, the operating system of a compute node can
outright OOM-kill the most offending processes, which can cause major
disruptions to workloads or the compute node itself.
Warning
While it may seem like a good idea to make the most of
available resources, oversubscribing RAM can lead to various issues and
is generally not recommended due to potential performance degradation,
reduced stability, and security risks for the workloads.
Mirantis strongly advises against oversubscribing RAM, by any amount.
Disk space
Depends on the physical layout of storage. Virtual root and ephemeral
storage devices that are hosted on the compute node itself are put into
read-only mode, negatively affecting workloads. Additionally,
the file system used by the operating system on the compute node may
become read-only too, blocking the compute node operability.
There are workload types that are not suitable for running in an oversubscribed
environment, especially those with high performance, latency-sensitive, or
real-time requirements. Such workloads are better suited for compute nodes
with dedicated CPUs, ensuring that only processes of a single instance run
on each CPU core.
MOSK provides the capability to configure virtual CPU types
for OpenStack instances through the OpenStackDeployment custom resource.
This feature enables cloud operators to tailor performance and resource
allocation within their OpenStack environment to meet specific workload
demands effectively.
Parameter
spec:features:nova:vcpu_type
Usage
Configures the type of virtual CPU that Nova will use when creating
instances.
The list of supported CPU models includes host-model (default),
host-passthrough, and custom models.
The host-model CPU model (default) mimics the host CPU and provides for
decent performance, good security, and moderate compatibility with live
migrations.
With this mode, libvirt finds an available predefined CPU model that best
matches the host CPU, and then explicitly adds the missing CPU feature
flags to closely match the host CPU features. To mitigate known security
flaws, libvirt automatically adds critical CPU flags, supported by
installed libvirt, QEMU, kernel, and CPU microcode versions.
This is a safe choice if your OpenStack compute node CPUs are of the same
generation. If your OpenStack compute node CPUs are sufficiently different,
for example, span multiple CPU generations, Mirantis strongly recommends
setting explicit CPU models supported by all of your OpenStack compute node
CPUs or organizing your OpenStack compute nodes into host aggregates and
availability zones that have largely identical CPUs.
Note
The host-model model does not guarantee two-way live migrations
between nodes.
When migrating instances, the libvirt domain XML is first copied as is to
the destination OpenStack compute node. Once the instance is hard rebooted
or shut down and started again, the domain XML is re-generated. If the
versions of libvirt, kernel, CPU microcode, or BIOS firmware differ from
what they were on the source compute node where the instance was originally
started, libvirt may pick up additional CPU feature flags, making it
impossible to live-migrate the instance back to the original compute node.
The host-passthrough CPU model provides maximum performance, especially
when nested virtualization is required or if live migration support is not
a concern for workloads. Live migration requires exactly the same CPU
on all OpenStack compute nodes, including the CPU microcode and kernel
versions. Therefore, for live migrations support, organize your compute
nodes into host aggregates and availability zones. For workload migration
between non-identical OpenStack compute nodes, contact Mirantis support.
For example, to set the host-passthrough CPU model for all OpenStack
compute nodes:
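For example, a minimal snippet of the OpenStackDeployment custom resource:

spec:
  features:
    nova:
      vcpu_type: host-passthrough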
MOSK enables you to specify a comma-separated list of
exact QEMU CPU models to create and emulate. Specify the common and less
advanced CPU models first. All explicit CPU models provided must be compatible
with the OpenStack compute node CPUs.
To specify an exact CPU model, review the available CPU models and their
features. List and inspect the /usr/share/libvirt/cpu_map/*.xml files in
the libvirt containers of the pods of the libvirt DaemonSet, or multiple
DaemonSets if you are using node-specific settings.
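For illustration, a configuration pinning explicit CPU models might look like the following sketch. The model names are illustrative; list the common and less advanced models first and make sure all of them are supported by every compute node CPU.

spec:
  features:
    nova:
      vcpu_type: Nehalem,SandyBridge    # illustrative comma-separated list of QEMU models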
OpenStack supports the following types of instance migrations:
Cold migration (also referred to simply as migration)
The process involves shutting down the instance, copying its definition and
disk, if necessary, to another host, and then starting the instance again
on the new host.
This method disrupts the workload running inside the instance but allows
for more reliability and works for most types of instances and consumed
resources.
Live migration
The process involves copying the instance definition, memory, and disk,
if necessary, to another host while the instance continues running,
without shutting it down. The instance then momentarily switches to run
on the new host.
While generally less disruptive to workloads, this method is less reliable
and imposes more restrictions on the instance and target host properties
to succeed.
As a cloud operator, you can configure live migration through the
OpenStackDeployment custom resource. The following table provides
the details on available configuration.
Parameter
Usage
features:nova:live_migration_interface
Specifies the name of the NIC device on the actual host that will be
used by Nova for the live migration of instances.
Mirantis recommends setting up your Kubernetes hosts in such a way
that networking is configured identically on all of them,
and names of the interfaces serving the same purpose or plugged into
the same network are consistent across all physical nodes.
Also, set the option to vhost0 in the following cases:
The Neutron service uses Tungsten Fabric.
Nova migrates instances through the interface specified by
the Neutron tunnel_interface parameter.
features:nova:libvirt:tls
Available since MOSK 23.2.
If set to true, enables the live migration over TLS:
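A minimal sketch of enabling this option, assuming the flag is a boolean at the path given above:

spec:
  features:
    nova:
      libvirt:
        tls: true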
Allowing non-administrative users to migrate instances
Available since MOSK 24.3
MOSK provides the following distinct sets of policies that
govern access to cold and live migrations:
os_compute_api:os-migrate-server:migrate and
os_compute_api:os-migrate-server:migrate_live define the ability to
initiate migrations without specifying the target host. In this case,
the OpenStack Compute scheduler selects the best suited target host
automatically.
os_compute_api:os-migrate-server:migrate:host
and os_compute_api:os-migrate-server:migrate_live:host define the ability
to initiate migration together with specifying the target host.
Depending on the API microversion used to start the migration, the host is
either validated by the scheduler (recommended) or forced regardless of
other considerations. The latter option is not recommended as it may lead
to inconsistencies in the internal state of the Compute service.
Since MOSK 24.3, the default policies for migrations without
the target host specification are set to rule:project_member_or_admin.
This means that migration is available to both cloud administrators and
project users with the member role.
The migration to a specific host requires administrative privileges.
If the default policy does not suit your deployment, you can require
administrative access for all instance migrations by setting these policy
values to rule:context_is_admin, or any other value appropriate for your
use case.
If you use the default policies and want to revert to the old defaults, ensure
that the following snippet is present in your OpenStackDeployment custom
resource:
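A sketch of such a snippet, assuming policy overrides are defined under features:policies; verify the exact location against the reference for your MOSK version:

spec:
  features:
    policies:
      nova:
        "os_compute_api:os-migrate-server:migrate": "rule:context_is_admin"
        "os_compute_api:os-migrate-server:migrate_live": "rule:context_is_admin"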
Defines the type of storage for Nova to use on the compute hosts for
the images that back the instances.
The list of supported options includes:
local (deprecated)
The option is deprecated and replaced by qcow2.
qcow2
The local storage is used. The backend disk image format is
qcow2. The pros include faster operation and failure domain
independence from the external storage. The cons include local
space consumption and less performant and robust live migration
with block migration.
raw (available since MOSK 24.2)
The local storage is used. The backend disk image format is raw.
Raw images are simple binary dumps of disk data, including empty
space, resulting in larger file sizes. They provide superior
performance because they do not incur overhead from features such as
compression or copy-on-write, which are present in the qcow2 disk
images.
ceph
Instance images are stored in a Ceph pool shared across all
Nova hypervisors. The pros include faster image start as well as faster and
more robust live migration. The cons include considerably slower
IO performance and a direct dependency of workload operations on Ceph
cluster availability and performance.
lvm (TechPreview)
Instance images and ephemeral images are stored on a local Logical
Volume. If specified, features:nova:images:lvm:volume_group must
be set to an available LVM Volume Group, by default, nova-vol.
For details, see Enable LVM ephemeral storage.
MOSK provides a number of different methods to interact
with OpenStack virtual machines including VNC (default) and SPICE remote
consoles. This section outlines how you can configure these different
console services through the OpenStackDeployment custom resource.
The noVNC client provides remote control or remote desktop access to guest
virtual machines through the Virtual Network Computing (VNC) system.
The MOSK Compute service users can access their
instances using the noVNC clients through the noVNC proxy server.
The VNC remote console is enabled by default in MOSK.
To disable VNC remote console through the OpenStackDeployment custom
resource, set spec:features:nova:console:novnc to false:
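For example (a sketch; depending on the MOSK version, the flag may be a nested enabled field as for the SPICE console below):

spec:
  features:
    nova:
      console:
        novnc: false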
MOSK uses TLS to secure public-facing VNC access
on networks between a noVNC client and noVNC proxy server.
The features:nova:console:novnc:tls:enabled parameter ensures that the data
transferred between the instance and the noVNC proxy server is encrypted.
Both servers use the VeNCrypt authentication scheme for the data
encryption.
To enable the encrypted data transfer for noVNC, use the following
structure in the OpenStackDeployment custom resource:
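For example, using the parameter named above:

spec:
  features:
    nova:
      console:
        novnc:
          tls:
            enabled: true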
The VNC protocol has its limitations, such as the lack of support for multiple
monitors, bi-directional audio, reliable cut-and-paste, video streaming,
and others. The SPICE protocol aims to overcome these limitations and
deliver a robust remote desktop support.
The SPICE remote console is disabled by default in MOSK.
To enable SPICE remote console through the OpenStackDeployment custom
resource, set spec:features:nova:console:spice:enabled to true:
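For example:

spec:
  features:
    nova:
      console:
        spice:
          enabled: true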
MOSK provides GPU virtualization capabilities to its users
through the NVIDIA vGPU and Multi-Instance GPU (MIG) technologies.
GPU virtualization is a capability offered by modern datacenter-grade GPUs,
enabling the partitioning of a single physical GPU into
smaller virtual devices that can then be attached to individual
virtual machines.
In contrast to the Peripheral Component Interconnect (PCI) passthrough
feature, leveraging GPU virtualization enables
concurrent utilization of the same physical GPU device by multiple virtual
machines. This enhances hardware utilization and fosters a more elastic
consumption of expensive hardware resources.
When using GPU virtualization, the physical device and its drivers manage
computing resource partitioning and isolation.
The use case for GPU virtualization aligns with any application necessitating
or benefiting from accelerated parallel floating-point calculations, such as
graphics-intensive desktop workloads, for example, 3D modeling and rendering,
as well as computationally intensive tasks, for example, artificial
intelligence, specifically, machine learning training and classification.
At its core, GPU virtualization builds on the single-root input/output
virtualization (SR-IOV) framework, which is already widely used by
datacenter-grade network adapters, and on the mediated devices framework
of the Linux kernel.
Typically, using GPU virtualization requires the installation of specific
physical GPU drivers on the host system. For detailed instructions on obtaining
and installing the required drivers, refer to official documentation from the
vendor of your GPU.
You can automate the configuration of drivers by adding a custom post-install
script to the BareMetalHostProfile object of your
MOSK cluster. See Configure GPU virtualization for details.
Certain NVIDIA GPUs, for example, the Ampere GPU architecture and later,
support GPU virtualization in two modes: time-sliced (vGPU) or
Multi-Instance GPU (MIG).
Older architectures support only the time-sliced mode.
The distinction between these modes lies in resource isolation, dedicated
performance levels, and partitioning flexibility.
Typically, there is no fixed rule dictating which mode should be used, as it
depends on the intended workloads for the virtual GPUs and the level of
experience and assurances the cloud operator aims to offer users. Below,
there is a brief overview of the differences between these two modes.
In time-sliced vGPU mode, each virtual GPU is allocated dedicated slices of
the physical GPU memory while sharing the physical GPU engines. Only one vGPU
operates at a time, with full access to all physical GPU engines. The resource
scheduler within the physical GPU regulates the timing of each vGPU execution,
ensuring fair allocation of resources.
Therefore, this setup may encounter issues with noisy neighbors, where the
performance of one vGPU is affected by resource contention from others.
However, when not all available vGPU slots are occupied, the active ones
can fully utilize the power of their physical GPU.
Advantages:
Potential ability to fully utilize the compute power of physical GPU,
even if not all possible vGPUs have yet been created on that physical GPU.
Easier configuration.
Disadvantages:
Only a single vGPU type (size of the vGPU) can be created on any given
physical GPU. The cloud operator must decide beforehand what type of vGPU
each physical GPU will be providing.
Less strict resource isolation. Noisy neighbors and unpredictable level of
performance for every single guest vGPU.
In Multi-Instance GPUs (MIG) mode, each virtual GPU is allocated
dedicated physical GPU engines, exclusively utilized by that specific virtual
GPU. Virtual GPUs run in parallel, each on its own engines according to their
type.
Advantages:
Ability to partition a single physical GPU into various types of virtual
GPUs. This approach provides cloud operators with enhanced flexibility in
determining the available vGPU types for cloud users. However, the cloud
operator has to decide beforehand what types of virtual GPU each physical
GPU will be providing and partition each GPU accordingly.
Better resource isolation and guaranteed resource access with predictable
performance levels for every virtual GPU.
Disadvantages:
Under-utilization of physical GPU when not all possible virtual GPU slots are
occupied.
Comparatively complicated configuration, especially in heterogeneous hardware
environments.
Some of these restrictions may be lifted in future releases of
MOSK.
Cloud users will face the following limitations when working with
GPU virtualization in MOSK:
Inability to create several instances with virtual GPUs in one request
if there is no physical GPU available that can fit all of them at once.
For NVIDIA MIG, this effectively means that you cannot create
several instances with virtual GPUs in one request.
Inability to create an instance with several virtual GPUs.
Inability to attach virtual GPU to or detach virtual GPU
from a running instance.
Inability to live-migrate instances with virtual GPU attached.
Cloud operators will face the following limitations when configuring GPU
virtualization in MOSK:
Partitioning of physical GPUs into virtual GPUs is static and not on-demand.
You need to decide beforehand what types of virtual GPUs
each physical GPU will get partitioned into. Changing the partitioning
requires removing all instances using virtual GPUs from the compute node
before initiating the repartitioning process.
Repartitioning may require additional manual steps to eliminate orphan
resource providers in the placement service, and thus, avoid resource
over-reporting and instance scheduling problems.
Configuration of multiple virtual GPU types per node may be very verbose
since configuration depends on particular PCI addresses of physical GPUs
on each node.
Management of compute node reboots is an important Day 2 operation. Before
shutting down a host, guest instances must either be migrated to other
compute nodes or gracefully powered off. This ensures the integrity of
disk filesystems and prevents damage to running applications.
MOSK provides the capability to automatically power off
the instances during the compute node shutdown or reboot through the ACPI
power event.
Graceful instance shutdown is managed using the systemd inhibit
tool. When the nova-compute service starts, it creates inhibitor locks,
which you can list on the host with the systemd-inhibit --list command.
The process runs in the nova-compute-inhibit-lock container within
the nova-compute pod. It intercepts the systemd power event and starts a
graceful guest shutdown. When all guest instances are powered off,
the inhibit lock is released.
To initiate a proper shutdown, use the
systemctl poweroff or systemctl reboot command.
Mirantis OpenStack for Kubernetes (MOSK) Networking service
(OpenStack Neutron) provides cloud applications with
Connectivity-as-a-Service enabling instances to communicate with each
other and the outside world.
The API provided by the service abstracts all the nuances of implementing
a virtual network infrastructure on top of your own physical network
infrastructure. The service allows cloud users to create advanced virtual
network topologies that may include load balancing, virtual private
networking, traffic filtering, and other services.
MOSK Networking service supports Open vSwitch and
Tungsten Fabric SDN technologies as backends.
MOSK offers various networking backends. Selecting
the appropriate backend option for the Networking service is essential
for building a robust and efficient cloud networking infrastructure.
Whether you choose Open vSwitch (OVS), Open Virtual Network (OVN),
or Tungsten Fabric, understanding their features, capabilities, and
suitability for your specific use case is crucial for achieving optimal
performance and scalability in your OpenStack environment.
Open vSwitch is a production-quality, multilayer virtual switch licensed
under the open source Apache 2.0 license. It is designed to enable massive
network automation through programmatic extension, while supporting standard
management interfaces and protocols.
Open vSwitch is suitable for general-purpose networking requirements in
OpenStack deployments. It provides flexibility and scalability for various
network topologies.
Key characteristics of Open vSwitch:
Depends on RabbitMQ and RPC communication
Uses keepalived to set up HA routers
Uses namespace and Veth routing to provide its capabilities
Locates metadata in router or DHCP namespaces
Centralizes the DHCP service, which is running in a separate namespace
Available since MOSK 25.1 as GA (Caracal). Available since MOSK 24.2 as
TechPreview (Antelope).
Open Virtual Network (OVN) is a solution built on Open vSwitch that provides
native virtual networking support for Open vSwitch environments. It provides
enhanced scalability and performance compared to traditional Open vSwitch
deployments.
Key characteristics of Open Virtual Network:
Uses the OVSDB protocol for communication
Is distributed by design
Handles all traffic with OpenFlow
Runs metadata on all nodes
Provides DHCP through local Open vSwitch instances
Caution
There are numerous limitations related to VLAN/Flat tenant
networks in Open Virtual Network with distributed floating IPs for
bare metal SR-IOV and Octavia VIP ports. For more information about
Open Virtual Network limitations, see relevant upstream documentation.
Tungsten Fabric is an open-source SDN based on Juniper Contrail. Its design
allows for simplified creation and management of virtual networks in cloud
environments. Tungsten Fabric supports advanced networking scenarios, such as
BGP integration and scalability.
Key characteristics of Tungsten Fabric:
Uses highly scalable protocols, such as BGP/MPLS, to set up tunnels
MOSK offers the Networking service as a part of its
core setup. You can configure the service through the
spec:features:neutron section of the OpenStackDeployment custom
resource.
Defines the name of the NIC device on the actual host that will be
used for Neutron.
Mirantis recommends setting up your Kubernetes hosts in such a way
that networking is configured identically on all of them,
and names of the interfaces serving the same purpose or plugged into
the same network are consistent across all physical nodes.
If enabled, must contain the data structure defining the floating IP
network that will be created for Neutron to provide external access to
your Nova instances.
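For illustration, the spec:features:neutron section might be shaped as follows. This is a sketch only: the tunnel_interface and floating_network names follow the parameters referenced in this guide, while the nested keys and values are illustrative assumptions.

spec:
  features:
    neutron:
      tunnel_interface: bond0          # NIC on the host used for tenant traffic
      floating_network:
        enabled: true
        physnet: physnet1              # illustrative physical network name
        subnet:
          range: 10.11.12.0/24
          gateway: 10.11.12.1
          pool_start: 10.11.12.100
          pool_end: 10.11.12.200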
The BGP dynamic routing extension to the Networking service (OpenStack Neutron)
is particularly useful for the MOSK clouds where private
networks managed by cloud users need to be transparently integrated into the
networking of the data center.
For example, the BGP dynamic routing is a common requirement for IPv6-enabled
environments, where clients need to seamlessly access cloud workloads using
dedicated IP addresses with no address translation involved in between the
cloud and the external network.
BGP dynamic routing changes the way self-service (private) network prefixes
are communicated to BGP-compatible physical network devices, such as routers,
present in the data center. It eliminates the traditional reliance on static
routes or ICMP-based advertising by enabling the direct passing of private
network prefix information to router devices.
Note
To effectively use the BGP dynamic routing feature, Mirantis
recommends acquiring a good understanding of OpenStack address scopes
and how they work.
The components of the OpenStack BGP dynamic routing are:
Service plugin
An extension to the Networking service (OpenStack Neutron) that implements
the logic for orchestration of BGP-related entities and provides the cloud
user-facing API. A cloud administrator creates and configures a BGP speaker
using the CLI or API and manually schedules it to one or more hosts running
the agent.
Agent
Manages BGP peering sessions. In MOSK, the BGP agent
runs on nodes labeled with openstack-gateway=enabled.
Prefix advertisement depends on the binding of external networks to a BGP
speaker and the address scope of external and internal IP address ranges or
subnets.
BGP dynamic routing advertises prefixes for self-service networks and host
routes for floating IP addresses.
To successfully advertise a self-service network, you need to fulfill
the following conditions:
External and self-service networks reside in the same address scope.
The router contains an interface on the self-service subnet and a gateway
on the external network.
The BGP speaker associates with the external network that provides
a gateway on the router.
The BGP speaker has the advertise_tenant_networks attribute set
to True.
To successfully advertise a floating IP address, you need to fulfill
the following conditions:
The router with the floating IP address binding contains a gateway on
an external network with the BGP speaker association.
The BGP speaker has the advertise_floating_ip_host_routes attribute
set to true.
The diagram below is an example of the BGP dynamic routing in the non-DVR mode
with self-service networks and the following advertisements:
B>*192.168.0.0/25[200/0] through 10.11.12.1
B>*192.168.0.128/25[200/0] through 10.11.12.2
B>*10.11.12.234/32[200/0] through 10.11.12.1
Operation in the Distributed Virtual Router (DVR) mode
For both floating IP and IPv4 fixed IP addresses, the BGP speaker advertises
the gateway of the floating IP agent on the corresponding compute node as
the next-hop IP address. When using IPv6 fixed IP addresses, the BGP speaker
advertises the DVR SNAT node as the next-hop IP address.
The diagram below is an example of the BGP dynamic routing in the DVR mode
with self-service networks and the following advertisements:
DVR incompatibility with ARP announcements and VRRP
Due to the known issue
#1774459 in the upstream
implementation, Mirantis does not recommend using Distributed Virtual Routing
(DVR) routers in the same networks as load balancers or other applications
that utilize the Virtual Router Redundancy Protocol (VRRP) such as Keepalived.
The issue prevents the DVR functionality from working correctly with network
protocols that rely on the Address Resolution Protocol (ARP) announcements
such as VRRP.
The issue occurs when updating permanent ARP entries for
allowed_address_pair IP addresses in DVR routers because DVR performs
the ARP table update through the control plane and does not allow any
ARP entry to leave the node to prevent the router IP/MAC from
contaminating the network.
This results in various network failover mechanisms not functioning in virtual
networks that have a distributed virtual router plugged in. For instance, the
default backend for MOSK Load Balancing service,
represented by OpenStack Octavia with the OpenStack Amphora backend when
deployed in the HA mode in a DVR-connected network, is not able to redirect
the traffic from a failed active service instance to a standby one without
interruption.
In MOSK, Cinder backup is enabled and uses the Ceph backend
for Cinder by default. The backup configuration is stored
in the spec:features:cinder:backup structure in the
OpenStackDeployment custom resource. If necessary, you can disable
the backup feature in Cinder as follows:
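A sketch of the corresponding snippet, assuming an enabled flag under the backup structure named above:

spec:
  features:
    cinder:
      backup:
        enabled: false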
Using this structure, you can also configure another backup driver supported
by MOSK for Cinder as described below. At any given time,
only one backend can be enabled.
MOSK supports NFS Unix authentication exclusively.
To use an NFS driver with MOSK, ensure you have
a preconfigured NFS server with an NFS share accessible to a Unix
Cinder user. This user must be the owner of the exported NFS folder,
and the folder must have the permission value set to 775.
All Cinder services run with the same user by default.
To obtain the Unix user ID:
You can specify the backup_share parameter in the following formats:
hostname:path, ipv4addr:path, or [ipv6addr]:path.
For example: 1.2.3.4:/cinder_backup.
The Block Storage service (OpenStack Cinder) supports volume encryption using a
key stored in the Key Manager service (OpenStack Barbican). Such configuration
uses Linux Unified Key Setup (LUKS) to create an encrypted volume type and
attach it to the Compute service (OpenStack Nova) instances.
Nova retrieves the symmetric key from Barbican and stores it
on the OpenStack compute node as a libvirt key to encrypt the volume
locally or on the backend and only after that transfers it to Cinder.
Note
To create an encrypted volume under a non-admin user, the
creator role must be assigned to the user.
When planning your cloud, consider that encryption may impact CPU performance.
The MOSK Block Storage service (OpenStack Cinder) uses Ceph
as the default backend for Cinder Volume. Also, MOSK enables
its clients to define their own volume backends using the
OpenStackDeployment custom resource. This section provides all the details
required to properly configure a custom Cinder Volume backend as a StatefulSet
or a DaemonSet.
When disabling the Ceph backend for Cinder Volume, you must explicitly specify
the new default_volume_type parameter. Refer to the sections below to learn
how you can configure it.
Considerations for configuring a custom Cinder Volume backend
Before you start deploying your custom Cinder Volume backend, decide on key
backend parameters and understand how they affect other services:
Note
Make sure to navigate to the documentation for the specific
OpenStack version used to deploy your environment when referring
to the official OpenStack documentation.
Considerations for configuring a custom Cinder Volume backend
Configuration option
Details
StatefulSet or DaemonSet
If the Cinder volume backend you prefer must run on all nodes with a
specific label and scale automatically as nodes are added or removed,
use a DaemonSet.
This type of backend typically requires that its data remains on
the same node where its pod is running. A common example of such a
backend is the LVM backend.
Otherwise, Mirantis recommends using a StatefulSet,
which offers more flexibility than a DaemonSet.
Support for Active/Active High Availability
If the driver does not support Active/Active High Availability, ensure
that only a single copy of the backend runs and that the cluster
parameter is left empty in the cinder.conf file for this backend.
When deploying the backend using a StatefulSet, set
pod.replicas.volume to 1 for this backend configuration.
Additionally, enable hostNetwork to ensure that the service
endpoint’s IP address remains stable when the backend pod restarts.
Support for Multi-Attach
If the driver supports Multi-Attach, it allows multiple connections to
the same volume. This capability is important for certain services,
such as Glance. If the driver does not support Multi-Attach, the backend
cannot be used for services that require this functionality.
Support for iSCSI and access to the /run directory
Some drivers require access to the /run directory on the host system
for storing their PID or lock files. Additionally, they may need access
to iSCSI and multipath services on the host. To enable this capability,
set the conf:enable_iscsi parameter to true. In some cases,
you might also need to run the backend container as privileged.
Privileged access for the container
For security reasons, Mirantis recommends running the Cinder Volume
backend container with the minimum required privileges. However,
if the drivers require privileged access, you can enable it for the
StatefulSet by setting the parameter
pod:security_context:cinder_volume:container:cinder_volume:privileged.
Access to the host network namespace
If the driver requires access to the host network namespace, or if you
need to ensure that the Cinder Volume backend’s IP address remains
unchanged after pod recreation or restart, set hostNetwork
to true using the following parameters:
For a DaemonSet, use pod:useHostNetwork:volume_daemonset. This
parameter is set to true by default.
For a StatefulSet, use pod:useHostNetwork:volume. Mirantis
recommends avoiding using StatefulSets with hostNetwork as it
may cause issues. StatefulSet pods are not tied to a specific
node, and multiple pods can run on the same node.
Access to the host IPC namespace
If the driver requires access to the host’s IPC namespace, set hostIPC
to true using the following parameters:
For a DaemonSet, use pod:useHostIPC:volume_daemonset. This
parameter is set to true by default.
For a StatefulSet, use pod:useHostIPC:volume.
Access to host PID namespace
If the driver requires access to the host’s PID namespace, set hostPID
to true using the following parameters:
For a DaemonSet, use pod:useHostPID:volume_daemonset.
MOSK enables its clients to define volume backends as
a StatefulSet.
To configure a custom StatefulSet backend for the MOSK
Block Storage service (OpenStack Cinder), use the
spec:features:cinder:volume:backends structure in the
OpenStackDeployment custom resource:
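The general shape of this structure is sketched below; the backend name is illustrative, and the keys that can be overridden under values are listed after the sketch:

spec:
  features:
    cinder:
      volume:
        backends:
          my-backend:                  # illustrative backend name
            enabled: true
            create_volume_type: true
            type: statefulset
            values: {}                 # overrides for the Cinder chart values.yaml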
The enabled and create_volume_type parameters are optional. With
create_volume_type set to true (default), the new backend
will be added to the Cinder bootstrap job. Once this job is completed,
the volume type for the custom backend will be created in OpenStack.
The supported value for type is statefulset.
The list of keys you can override in the values.yaml file of the Cinder
chart includes conf, images, labels, and pod.
When you define the custom backend for the Block Storage service,
MOSK deploys individual pods for it. These pods have
separate Secrets for configuration files and ConfigMaps for scripts.
Example of configuration of a custom StatefulSet backend for Cinder:
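A sketch matching the description below, assuming the OpenStack-Helm Cinder chart value layout for the backend definition, node selection, and replica count; the exact chart keys may differ between releases:

spec:
  features:
    cinder:
      volume:
        backends:
          nfs-backend:                                 # illustrative backend name
            enabled: true
            create_volume_type: true
            type: statefulset
            values:
              conf:
                backends:
                  nfs-backend:
                    volume_driver: cinder.volume.drivers.nfs.NfsDriver
                    nfs_shares_config: /etc/cinder/nfs_shares
                    volume_backend_name: nfs-backend
              labels:
                volume:
                  node_selector_key: kubernetes.io/hostname
                  node_selector_value: service-node
              pod:
                replicas:
                  volume: 1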
The configuration example deploys a StatefulSet for the Cinder
volume backend that uses the NFS driver, running a single replica on node
labeled kubernetes.io/hostname:service-node. Privilege escalation for
the Cinder volume pod is driver-specific.
MOSK enables its clients to define volume backends as
a DaemonSet, LVM in particular.
To configure a custom DaemonSet backend for the MOSK
Block Storage service (OpenStack Cinder), use the spec:nodes structure
in the OpenStackDeployment custom resource:
Example of configuration of a custom DaemonSet backend for Cinder:
The configuration example deploys a DaemonSet for the Cinder volume backend
that uses the LVM driver and runs on nodes with the
openstack-compute-node=enabled label:
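A sketch of such a definition; the node-specific override syntax under spec:nodes, the backend name, and the key layout are assumptions for illustration:

spec:
  nodes:
    openstack-compute-node::enabled:            # nodes selected by this label
      features:
        cinder:
          volume:
            backends:
              lvm-backend:                      # illustrative backend name
                lvm:
                  volume_group: cinder-vol      # LVM group expected on the nodes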
Caution
For data storage, this backend uses the LVM cinder-vol
group that must be present on nodes before the new backend is applied.
For the procedure on how to deploy an LVM backend, refer to
Enable LVM block storage.
MOSK provides the cinder-service-cleaner CronJob
by default. This CronJob periodically checks whether all Cinder services in
OpenStack are up to date and removes any stale ones.
This CronJob is tested only with backends supported by MOSK.
If cinder-service-cleaner does not work properly with your custom Cinder
volume backend, you can disable it at the service level in the
OpenStackDeployment custom resource:
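One possible shape of such an override, assuming chart-value overrides under spec:services; both the path and the manifest key are assumptions, so consult the reference for your MOSK version:

spec:
  services:
    block-storage:
      cinder:
        values:
          manifests:
            cron_service_cleaner: false         # hypothetical key name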
Mirantis OpenStack for Kubernetes (MOSK) provides authentication,
service discovery, and distributed multi-tenant authorization through the
OpenStack Identity service, aka Keystone.
MOSK integrates with Mirantis Container Cloud Identity and
Access Management (IAM) subsystem to allow centralized management of users and
their permissions across multiple clouds.
The core component of Container Cloud IAM is Keycloak, the open-source identity
and access management software. Its primary function is to perform secure
authentication of cloud users against its built-in or various external
identity databases, such as LDAP directories, OpenID Connect or SAML
compatible identity providers.
By default, every MOSK cluster is integrated with the
Keycloak running in the Container Cloud management cluster. The integration
automatically provisions the necessary configuration on the
MOSK and Container Cloud IAM sides, such as the os
client object in Keycloak. However, for the federated users to get proper
permissions after logging in, the cloud operator needs to define the role
mapping rules specific to each MOSK environment.
MOSK enables you to connect external identity
provider to Keystone directly through the following structure in the
OpenStackDeployment custom resource:
The oidc_auth_type parameter specifies the Apache module to use:
oauth20 or oauth2. The oauth20 functionality is deprecated
and superseded by the newer oauth2 module. You can configure two or more
identity providers only with the oauth2 module.
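A sketch of such a structure, assuming the spec:features:keystone:federation:openid nesting and illustrative provider keys; only oidc_auth_type is taken verbatim from the description above:

spec:
  features:
    keystone:
      federation:
        openid:
          enabled: true
          oidc_auth_type: oauth2
          providers:
            my-idp:                                    # illustrative provider name
              issuer: https://idp.example.com/realms/cloud
              metadata:
                client:
                  client_id: os
                  client_secret: <CLIENT-SECRET>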
A region in MOSK represents a complete OpenStack cluster
that has a dedicated control plane and set of API endpoints. It is not uncommon
for operators of large clouds to offer their users several OpenStack regions,
which differ by their geographical location or purpose. In order to easily
navigate in a multi-region environment, cloud users need a way to distinguish
clusters by their names.
The region_name parameter of an OpenStackDeployment custom resource
specifies the name of the region that will be configured in all the OpenStack
services comprising the MOSK cluster upon the initial
deployment.
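For example, assuming region_name is set at the top level of spec:

spec:
  region_name: dc-eu-west-1   # illustrative name derived from the data center location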
Important
Once the cluster is up and running, the cloud operator cannot
set or change the name of the region. Therefore, Mirantis recommends
selecting a meaningful name for the new region before the deployment starts.
For example, the region name can be based on the name of the data center the
cluster is located in.
Application credentials is a mechanism in the MOSK
Identity service that enables application automation tools, such as shell
scripts, Terraform modules, Python programs, and others, to securely
perform various actions in the cloud API in order to deploy and manage
application components.
Application credentials is a modern alternative to the legacy approach where
every application owner had to request several technical user accounts
to ensure their tools could authenticate in the cloud.
For the details on how to create and authenticate with application credentials,
refer to Manage application credentials.
Application credentials must be explicitly enabled for federated users
By default, cloud users logging in to the cloud through the Mirantis Container
Cloud IAM or any external identity provider cannot use the application
credentials mechanism.
An application credential is heavily tied to the account of the cloud user
owning it. An application automation tool that is a consumer of the credential
acts on behalf of the human user who created the credential. Each action that
the application automation tool performs gets authorized against the
permissions, including roles and groups, the user currently has.
The source of truth about a federated user's permissions is the identity
provider. This information gets temporarily transferred to the cloud's
Identity service inside a token once the user authenticates. By default,
if such a user creates an application credential and passes it to the
automation tool, there is no data to validate the tool's actions on
the user's behalf.
However, a cloud operator can configure the authorization_ttl parameter
for an identity provider object to enable caching of its users' authorization
data. The parameter defines for how long, in minutes, the information about
user permissions is preserved in the database after the user successfully
logs in to the cloud.
Warning
Authorization data caching has security implications. If a
federated user account is revoked or the user's permissions change in the
identity provider, the cloud Identity service will still allow performing
actions on the user's behalf until the cached data expires or the user
re-authenticates in the cloud.
To set authorization_ttl to, for example, 60 minutes for the keycloak
identity provider in Keystone:
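A possible way to do this is through the OpenStack CLI, assuming your client version supports the --authorization-ttl option for identity providers:

openstack identity provider set keycloak --authorization-ttl 60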
Defines the domain-specific configuration and is useful for integration
with LDAP. The following example of an OsDpl resource with LDAP integration
creates a separate domain.with.ldap domain and configures it to use LDAP as
the identity driver:
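A sketch of such a configuration, assuming the spec:features:keystone:domain_specific_configuration nesting; the LDAP connection values are placeholders:

spec:
  features:
    keystone:
      domain_specific_configuration:
        enabled: true
        domains:
          domain.with.ldap:
            enabled: true
            config:
              identity:
                driver: ldap
              ldap:
                url: ldap://ldap.example.com
                user: uid=openstack,ou=people,dc=example,dc=com
                password: <LDAP-PASSWORD>
                suffix: dc=example,dc=com
                user_tree_dn: ou=people,dc=example,dc=com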
Mirantis OpenStack for Kubernetes (MOSK) provides the image management
capability through the OpenStack Image service, aka Glance.
The Image service enables you to discover, register, and retrieve virtual
machine images. Using the Glance API, you can query virtual machine image
metadata and retrieve actual images.
MOSK deployment profiles include the Image service in the
core set of services. You can configure the Image service through the
spec:features definition in the OpenStackDeployment custom resource.
MOSK can automatically verify the cryptographic signatures
associated with images to ensure the integrity of their data. A signed image
has a few additional properties set in its metadata that include
img_signature, img_signature_hash_method, img_signature_key_type,
and img_signature_certificate_uuid. You can find more information about
these properties and their values in the upstream OpenStack documentation.
MOSK performs image signature verification during the
following operations:
A cloud user or a service creates an image in the store and starts
to upload its data. If the signature metadata properties are set
on the image, its content gets verified against the signature.
The Image service accepts non-signed image uploads.
A cloud user spawns a new instance from an image. The Compute service
ensures that the data it downloads from the image storage matches
the image signature. If the signature is missing or does not match the
data, the operation fails. Limitations apply, see
Known limitations.
A cloud user boots an instance from a volume, or creates a new volume from
an image. If the image is signed, the Block Storage service compares the
downloaded image data against the signature. If there is a mismatch, the
operation fails.
The service will accept a non-signed image as a source for a volume.
Limitations apply, see Known limitations.
Every MOSK cloud is pre-provisioned with a baseline set of
images containing the most popular operating systems, such as Ubuntu, Fedora,
and CirrOS.
In addition, a few services in MOSK rely on the creation
of service instances to provide their functions, namely the Load Balancer
service and the Bare Metal service, and require corresponding images to exist
in the image store.
When image signature verification is enabled during the cloud deployment,
all these images get automatically signed with a pre-generated self-signed
certificate. Enabling the feature in an already existing cloud requires manual
signing of all of the images stored in it. Consult the OpenStack documentation
for an example of the image signing procedure.
The image signature verification is supported for LVM and local backends for
ephemeral storage.
The functionality is not compatible with Ceph-backed ephemeral storage
combined with RAW formatted images. The Ceph copy-on-write mechanism enables
the user to create instance virtual disks without downloading the image to
a compute node, the data is handled completely on the side of a Ceph cluster.
This enables you to spin up instances almost momentarily but makes it
impossible to verify the image data before creating an instance from it.
The Image service does not enforce the presence of a signature in
the metadata when the user creates a new image. The service will accept the
non-signed image uploads.
The Image service does not verify the correctness of an image signature
upon update of the image metadata.
MOSK does not validate if the certificate used to sign an
image is trusted,
it only ensures the correctness of the signature itself. Cloud users are
allowed to use self-signed certificates.
The Compute service does not verify image signature for Ceph backend when
the RAW image format is used as described in
Supported storage backends.
The Compute service does not verify image signature if the image is already
cached on the target compute node.
The Instance HA service may experience issues when auto-evacuating instances
created from signed images if it does not have access to the corresponding
secrets in the Key Manager service.
The Block Storage service does not perform image signature verification
when a Ceph backend is used and the images are in the RAW format.
The Block Storage service does not enforce the presence of a signature on
the images.
Instead of Swift, such configuration uses an S3 client to upload server-side
encrypted objects. Using server-side encryption, the data is sent over a secure
HTTPS connection in an unencrypted form and the Ceph Object Gateway stores that
data in the Ceph cluster in an encrypted form.
Defines the list of custom OpenStack Dashboard themes.
The content of the archive file with a theme depends on the level of
customization and can include static files, Django templates, and other
artifacts. For details, refer to the official OpenStack documentation:
Customizing Horizon Themes.
spec:
  features:
    horizon:
      themes:
        - name: theme_name
          description: The brand new theme
          url: https://<path to .tgz file with the contents of custom theme>
          sha256summ: <SHA256 checksum of the archive above>
MOSK enables a cloud operator to configure
Message of the Day (MOTD) for the MOSK Dashboard
(OpenStack Horizon). These short messages inform users about
current infrastructure issues, upcoming maintenance, and other events,
helping them plan their work with minimal service disruption.
Cloud operators can configure messages to appear before or after users
log in to Horizon, or both. Messages can also be visually distinguished
based on severity and support minimal HTML formatting, including links.
To define the MOTD, populate the following structure in the
OpenStackDeployment custom resource:
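A sketch of what such a structure could look like; the key names under motd are hypothetical and given only to illustrate the severity, placement, and HTML formatting capabilities described above:

spec:
  features:
    horizon:
      motd:
        - name: planned-maintenance       # hypothetical keys, for illustration only
          level: warning
          placement: pre-login
          message: >-
            Scheduled maintenance on Saturday 02:00-04:00 UTC.
            See <a href="https://status.example.com">the status page</a> for details.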
The Bare Metal service (Ironic) is an extra OpenStack service that can be
deployed by the OpenStack Controller (Rockoon). This section provides the
baremetal-specific configuration options of the OpenStackDeployment
resource.
To provision a user image onto a bare metal server, Ironic boots a node with
a ramdisk image. Depending on the node’s deploy interface and hardware, the
ramdisk may require different drivers (agents). MOSK
provides tinyIPA-based ramdisk images and uses the direct deploy interface
with the ipmitool power interface.
Since the bare metal node hardware may require additional drivers,
you may need to build a deploy ramdisk for your particular hardware. For more
information, see Ironic Python Agent Builder.
Be sure to create a ramdisk image with the version of Ironic Python Agent
appropriate for your OpenStack release.
Ironic supports the flat and multitenancy networking modes.
The flat networking mode assumes that all bare metal nodes are
pre-connected to a single network that cannot be changed during the
virtual machine provisioning. This network, with bridged interfaces
for Ironic, should be spread across all nodes, including compute nodes,
to allow plugging regular virtual machines into the Ironic network.
In its turn, the interface defined as provisioning_interface should
be spread across gateway nodes. The cloud operator can perform
all this underlying configuration through the L2 templates.
Example of the OsDpl resource illustrating the configuration for the flat
network mode:
spec:
  features:
    services:
      - baremetal
    neutron:
      external_networks:
        - bridge: ironic-pxe
          interface: <baremetal-interface>
          network_types:
            - flat
          physnet: ironic
          vlan_ranges: null
    ironic:
      # The name of neutron network used for provisioning/cleaning.
      baremetal_network_name: ironic-provisioning
      networks:
        # Neutron baremetal network definition.
        baremetal:
          physnet: ironic
          name: ironic-provisioning
          network_type: flat
          external: true
          shared: true
          subnets:
            - name: baremetal-subnet
              range: 10.13.0.0/24
              pool_start: 10.13.0.100
              pool_end: 10.13.0.254
              gateway: 10.13.0.11
      # The name of interface where provision services like tftp and ironic-conductor
      # are bound.
      provisioning_interface: br-baremetal
The multitenancy network mode uses the neutron Ironic network
interface to share physical connection information with Neutron. This
information is handled by Neutron ML2 drivers when plugging a Neutron port
to a specific network. MOSK supports the
networking-generic-switch Neutron ML2 driver out of the box.
Example of the OsDpl resource illustrating the configuration for the
multitenancy network mode:
spec:
  features:
    services:
      - baremetal
    neutron:
      tunnel_interface: ens3
      external_networks:
        - physnet: physnet1
          interface: <physnet1-interface>
          bridge: br-ex
          network_types:
            - flat
          vlan_ranges: null
          mtu: null
        - physnet: ironic
          interface: <physnet-ironic-interface>
          bridge: ironic-pxe
          network_types:
            - vlan
          vlan_ranges: 1000:1099
    ironic:
      # The name of interface where provision services like tftp and ironic-conductor
      # are bound.
      provisioning_interface: <baremetal-interface>
      baremetal_network_name: ironic-provisioning
      networks:
        baremetal:
          physnet: ironic
          name: ironic-provisioning
          network_type: vlan
          segmentation_id: 1000
          external: true
          shared: false
          subnets:
            - name: baremetal-subnet
              range: 10.13.0.0/24
              pool_start: 10.13.0.100
              pool_end: 10.13.0.254
              gateway: 10.13.0.11
The supported backend for Designate is PowerDNS. If required, you can specify
an external IP address and the protocol (UDP, TCP, or TCP + UDP) for the
Kubernetes LoadBalancer service that exposes PowerDNS.
To configure LoadBalancer for PowerDNS, use the spec:features:designate
definition in the OpenStackDeployment custom resource.
The list of supported options includes the following (see the configuration sketch after the list):
external_ip - Optional. An IP address for the LoadBalancer service. If
not defined, LoadBalancer allocates the IP address.
protocol - A protocol for the Designate backend in Kubernetes. Can only
be udp, tcp, or tcp+udp.
type - The type of the backend for Designate. Can only be powerdns.
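A configuration sketch with the options above, assuming they are nested under a backend key of spec:features:designate; the IP address is illustrative:

spec:
  features:
    designate:
      backend:
        type: powerdns
        protocol: udp
        external_ip: 10.172.1.101   # illustrative address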
Due to an issue in the dnspython library, full zone transfer (AXFR)
requests do not work, which makes it impossible to set up a secondary DNS zone.
The issue affects OpenStack Victoria and is fixed in the Yoga release.
MOSK Key Manager service (OpenStack Barbican) provides
secure storage, provisioning, and management of cloud application secret data,
such as Symmetric Keys, Asymmetric Keys, Certificates, and raw binary data.
Instance High Availability service (OpenStack Masakari) enables cloud users
to ensure that their instances get automatically evacuated from a failed
hypervisor.
The service consists of the following components:
API receives requests from users and events from monitors, and sends
them to the engine
Engine executes the recovery workflow
Monitors detect failures and notify the API. MOSK uses
monitors of the following types:
Instance monitor performs liveness checks of instance processes
Introspective instance monitor enhances instance high availability
within OpenStack environments by monitoring and identifying system-level
failures through the QEMU Guest Agent
Host monitor performs liveness checks of a compute host and runs as part of
the Node controller from the OpenStack Controller (Rockoon)
Note
The Processes monitor is not present in MOSK
because HA for the compute processes is handled by Kubernetes.
This section describes how to enable various components of the
Instance High Availability service for your MOSK
deployment:
The Instance HA service is not included into the core set of services and needs
to be explicitly enabled in the OpenStackDeployment custom resource.
Parameter
features:services:instance-ha
Usage
Enables Masakari, the OpenStack service that ensures high availability
of instances running on a host. To enable the service, add
instance-ha to the service list:
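For example, mirroring the spec:features:services list format used elsewhere in this document:

spec:
  features:
    services:
      - instance-ha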
The introspective instance monitor in the Instance High Availability
service enhances the reliability of the cloud environment by monitoring
virtual machines for failure events, including operating system crashes,
kernel panics, and unresponsive states. Upon detecting such events in real
time, the monitor initiates automated recovery actions, such as rebooting
the affected instance. This allows for reduced downtime and maintains high
availability of an OpenStack environment.
As a cloud operator, you can enable and configure the instance introspection
through the spec:features:masakari:monitors:introspective definition in the
OpenStackDeployment custom resource. The list of supported options includes the following (see the configuration sketch after the list):
enabled (boolean)
Enables or disables the introspection monitor. Default: false.
guest_monitoring_interval (integer)
Defines the time interval (in seconds) for monitoring the status of the
guest virtual machine. Default: 10.
guest_monitoring_timeout (integer)
Sets the timeout (in seconds) for detecting a non-responsive guest VM before
marking it as failed. Default: 2.
guest_monitoring_failure_threshold (integer)
Defines the number of consecutive failures required before a notification is
sent or recovery action is initiated. Default: 3.
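For example, the following sketch enables the monitor and sets the options above to their documented defaults:

spec:
  features:
    masakari:
      monitors:
        introspective:
          enabled: true
          guest_monitoring_interval: 10
          guest_monitoring_timeout: 2
          guest_monitoring_failure_threshold: 3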
The introspective instance monitor relies on the QEMU Guest Agent being
installed within the guest virtual machine. This agent enables communication
between the host and guest operating systems, ensuring precise monitoring
of the virtual machine health. Without the QEMU Guest Agent, the introspection
monitor cannot accurately assess the state of the virtual machine, which may
prevent the initiation of necessary recovery actions. To start monitoring,
refer to Configure the introspective instance monitor.
MOSK Shared Filesystems service (OpenStack Manila) provides
Shared Filesystems as a service. The Shared Filesystems service enables you to
create and manage shared filesystems in your multi-project cloud environments.
Note
MOSK does not support the Shared Filesystems
service for the clusters with Tungsten Fabric as a networking backend.
The Shared FileSystems service (OpenStack Manila) consists of manila-api,
manila-scheduler, and manila-share services. All these services
communicate with each other through the AMQP protocol and store their data
in the MySQL database:
manila-api
Provides a stable RESTful API, authenticates and routes requests
throughout the Shared Filesystem service
manila-scheduler
Responsible for scheduling and routing requests to the appropriate
manila-share service by determining which backend should
serve as the destination for a share creation request
manila-share
Responsible for managing Shared Filesystems service devices, specifically
the backend ones
The diagram below illustrates how the Shared FileSystems service components
communicate with each other.
MOSK ensures support for different kinds of equipment and
shared filesystems by means of special drivers that are part of the
manila-share service. These drivers also determine the ability to restrict
access to data stored on a shared filesystem, the list of operations with
Manila volumes, and the types of connections to the client network.
Driver Handles Share Servers (DHSS) is one of the main parameters that
define the Manila workflow including the way the Manila driver makes clients
access shared filesystems. Some drivers support only one DHSS mode,
for example, the LVM share driver. Others support both modes, for example,
the Generic driver. If the DHSS is set to False in the driver
configuration, the driver does not prepare the share server that provides
access to the share filesystems and the server and network setup should be
performed by the administrator. In this case, the Shared Filesystems service
only manages the server in its own configuration.
If the driver configuration includes DHSS=True, the driver creates a
service virtual machine that provides access to shared filesystems.
Also, when DHSS=True, the Shared Filesystems service performs a network
setup to provide client’s access to the created service virtual machine.
For working with the service virtual machine, the Shared Filesystems service
requires a separate service network that must be included in the driver’s
configuration as well.
The following are descriptions of drivers supported by the
MOSK Shared Filesystems service.
The generic driver is an example for the DHSS=True case.
There are two network topologies for connecting the client's network to the
service virtual machine, which depend on the
connect_share_server_to_tenant_network parameter.
If the connect_share_server_to_tenant_network parameter is set to
False, which is the default, the client must create a shared network connected
to a public router. IP addresses from this network will be granted access to
the created shared filesystem. The Shared Filesystems service creates a subnet
in its service network to which the network port of the new service virtual
machine and the network port of the client's router are connected. When a
new shared filesystem is created, the client's machine is granted access to it
through the router.
If the connect_share_server_to_tenant_network parameter is set to True,
the Shared Filesystems service creates the service virtual machines with two
network interfaces. One of them is connected to the service network while the
other one is connected to the client’s network.
The CephFS driver is a DHSS=False driver. The CephFS driver can be
configured to use the Ceph protocol to provide shares. However,
MOSK does not support the NFS Ganesha protocol.
The main advantages of using a direct connection to CephFS through the Ceph
protocol over using the NFS protocol include:
Simplified setup
No third-party services are required between the client and CephFS, whereas
an NFS layer can introduce an additional point of failure.
No additional load balancing
Making NFS highly available requires setting up additional load balancers,
which is unnecessary with direct CephFS access.
Enhanced access control
CephFS shares can be restricted using cephx authentication, whereas NFS
only allows access restrictions based on IP addresses.
For the CephFS driver to function, the manila-share service must have
access to the Storage Access network. To mount created shares, the client must
have access to the Storage Access network, the share URL, and credentials.
The URLs and credentials for created shares are exposed to clients through the
Manila API.
Note
Due to the existing limitation for Ceph clusters, Ceph Monitor
services are only accessible on the MOSK LCM network.
Therefore, both the manila-share service and clients require access
to the MOSK LCM network. By default, manila-share
already has access to this network. However, to enable access for external
clients, for example, client VMs, routing must be configured between the
client VM and the MOSK LCM network.
The risks of direct connection of client VMs to the Storage Access Network
include:
A malicious host on the same network may attempt to attack or scan other
clients or the Ceph cluster
A malicious host may intercept and manipulate communication, acting on
behalf of a valid client or Ceph cluster (a man-in-the-middle attack)
The following measures can help reduce these risks:
Ensure that port security is enabled on client VM ports connected to Ceph
networks, which is enabled by default on OpenStack networks
Ensure that the Ceph cluster and client use the msgr2 protocol with CRC and
secure modes enabled, which are enabled by default for
MOSK deployments
Configure OpenStack security groups for client VM ports to allow traffic only
from trusted hosts
The Shared Filesystems service is not included into the core set of services
and needs to be explicitly enabled in the OpenStackDeployment custom
resource.
To install the OpenStack Manila services, add the shared-file-system
keyword to the spec:features:services list:
spec:
  features:
    services:
      - shared-file-system
The above configuration installs the Shared Filesystems service with
the generic driver configured.
Enabling CephFS driver for Shared Filesystems service
Available since MOSK 25.1 TechPreview
Caution
MOSK does not support enabling both the generic
driver and CephFS driver in the same environment. If the CephFS driver is
enabled in an environment where the generic driver was previously enabled,
the CephFS driver will replace the generic one.
The CephFS driver is not enabled by default in the Shared Filesystems service.
To enable the CephFS driver:
In a cloud environment where resources are shared across all workloads,
those resources often become a point of contention.
For example, it is not uncommon for an oversubscribed compute node to
experience the noisy neighbor problem, when one of the instances starts
consuming a lot more resources than usual, negatively affecting the
performance of other instances running on the same node.
In such cases, an intervention is required from the cloud operators
to manually re-distribute workloads in the cluster to achieve more equal
utilization of resources.
The Dynamic Resource Balancer (DRB) service continuously measures resource
usage on hypervisors and redistributes workloads to achieve an optimum
target, thereby eliminating the need for manual interventions from cloud
operators.
The DRB service is implemented as a Kubernetes operator, controlled by
the custom resource of kind:DRBConfig. Unless at least one resource
of this kind is present, the service does not perform any operations. Cloud
operators who want to enable the DRB service for their MOSK
clouds need to create the resource with proper configuration.
The DRB controller consists of the following components interacting with each
other:
collector
Collects the statistics of resource consumption in the cluster
scheduler
Based on the data from the collector, makes decisions whether
cloud resources need to be relocated to achieve the optimum
actuator
Executes the resource relocation decisions made by scheduler
Out of the box, these service components implement a very simple logic, which,
however, can be individually enhanced according to the needs of a specific
cloud environment by utilizing their pluggable architecture. The plugins
need to be written in the Python programming language and injected as modules
into the DRB service by building a custom drb-controller container image.
Default plugins as well as custom plugins are configured through the
corresponding sections of DRBConfig custom resources.
Also, it is possible to limit the scope of DRB decisions and actions
to only a subset of hosts. This way, you can model the node grouping
schema that is configured in OpenStack, for example, compute node
aggregates and availability zones, to avoid DRB service attempting resource
placement changes that cannot be fulfilled by MOSK Compute
service (OpenStack Nova).
The spec section of the configuration consists of the following main parts (see the example DRBConfig sketch after the list):
collector
Specifies and configures the collector plugin to collect the metrics on
which decisions are based. At a minimum, the name of the plugin must
be provided.
scheduler
Specifies and configures the scheduler plugin that will make decisions
based on the collected metrics. At a minimum, the name of the plugin
must be provided.
actuator
Specifies and configures the actuator plugin that will move resources
around. At a minimum, the name of the plugin must be provided.
reconcileInterval
Defines time in seconds between reconciliation cycles. Should be large
enough for the metrics to settle after resources are moved around.
For the default stacklight collector plugin, this value must equal
at least 300.
hosts
Specifies the list of cluster hosts to which this given instance of
DRBConfig applies. This means that only metrics from these hosts
will be used for making decisions, only resources belonging to these
hosts will be considered for re-distribution, and only these hosts
will be considered as possible targets for re-distribution.
You can create multiple DRBConfig resources that watch over
non-overlapping sets of hosts.
The default for this setting is an empty list, which implies all hosts.
migrateAny
A boolean flag that the scheduler plugin can consider when making
decisions, allowing cloud operators and users to opt certain workloads
in or out of redistribution.
For the default vm-optimize scheduler plugin:
migrateAny:true (default) - any instance can be migrated, except for
instances tagged with lcm.mirantis.com:no-drb, explicitly opting out
of the DRB functionality
migrateAny:false - only instances tagged with
lcm.mirantis.com:drb are migrated by the DRB service,
explicitly opting in to the DRB functionality
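A sketch of a DRBConfig resource that puts these parts together using the default stacklight collector and vm-optimize scheduler plugins described below; the apiVersion, namespace, options nesting, and actuator plugin name are assumptions rather than authoritative values:

apiVersion: lcm.mirantis.com/v1alpha1   # hypothetical API group and version
kind: DRBConfig
metadata:
  name: drb-compute
  namespace: osh-system                 # hypothetical namespace
spec:
  reconcileInterval: 300
  migrateAny: true
  hosts: []                             # empty list implies all hosts
  collector:
    name: stacklight
  scheduler:
    name: vm-optimize
    options:                            # assumed nesting for plugin options
      load_threshold: 80
      min_improvement: 0
  actuator:
    name: <actuator-plugin-name>        # the live-migration actuator described below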
Collects node_load5, machine_cpu_cores, and
libvirt_domain_info_cpu_time_seconds:rate5m metrics
from the StackLight service running in the MOSK cluster.
Does not have options available.
Requires the reconcileInterval set to at least 300 (5 minutes),
as both the collected node and instance CPU usage metrics are effectively
averaged over a 5-minute sliding window.
Attempts to minimize the standard deviation of node load. The node load is
normalized per CPU core, so heterogeneous compute hosts can be compared.
Available options:
load_threshold
The value in percent of the compute host load after which the host will be
considered overloaded and attempts will be made to migrate instances
from it. Defaults to 80.
min_improvement
Minimal improvement of the optimization metric in percent.
While making decisions, the scheduler attempts to predict the resulting load
distribution to determine if moving resources is beneficial. If the total
improvement after all necessary decisions is calculated to be less than
min_improvement, no decisions will be executed.
Defaults to 0, meaning that any potential improvement is acted upon.
Setting this to a higher value should allow avoiding instance migrations
that provide negligible improvements.
Warning
The current version of this plugin takes into account only basic
resource classes when making scheduling decisions. These include only RAM,
disk, and vCPU count from the instance flavor. It does not take into account
any other information including specific image or aggregate metadata, custom
resource classes, PCI devices, NUMA, hugepages, and so on. Moving around
instances that consume such resources will more likely fail as the current
implementation of the scheduler plugin cannot reliably predict if such
instances fit onto the selected target host.
Live migrates instances to specific hosts. Assumes any migration is possible.
Refer to the hosts and migrateAny options above to learn how to control
which instances are migrated to which locations.
Available options:
max_parallel_migrations
Defines the number of instances to migrate in parallel.
Defaults to 10.
This value applies to all decisions being processed, so it may involve
instances from different hosts. Meanwhile, the nova-compute service may
have its own limits on how many live migrations a given host can handle
in parallel.
migration_polling_interval
Defines the interval in seconds for checking the instance status while
the latter is being migrated
Defaults to 5.
migration_timeout
Defines the interval in seconds after which an unfinished migration is
considered failed.
Only logs the decisions that were scheduled for execution.
Useful for debugging and dry-runs.
Note
The list of the services and their supported features included in
this section is not exhaustive and is constantly amended based on the
complexity of the architecture and the use of a particular service.
OpenStack and auxiliary services are running as containers in the kind:Pod
Kubernetes resources. All long-running services are governed by one of
the ReplicationController-enabled Kubernetes resources, which include
either kind:Deployment, kind:StatefulSet, or kind:DaemonSet.
The placement of the services is mostly governed by the Kubernetes node labels.
The labels affecting the OpenStack services include:
openstack-control-plane=enabled - the node hosting most of the OpenStack
control plane services.
openstack-compute-node=enabled - the node serving as a hypervisor for
Nova. The virtual machines with tenant workloads are created there.
openvswitch=enabled - the node hosting Neutron L2 agents and Open vSwitch
pods that manage the L2 connectivity of the OpenStack networks.
openstack-gateway=enabled - the node hosting Neutron L3, Metadata and
DHCP agents, Octavia Health Manager, Worker and Housekeeping components.
Note
OpenStack is an infrastructure management platform. Mirantis OpenStack
for Kubernetes (MOSK) uses Kubernetes mostly for
orchestration and dependency isolation. As a result, multiple OpenStack
services are running as privileged containers with host PIDs and Host
Networking enabled. You must ensure that at least the user with the
credentials used by Helm/Tiller (administrator) is capable of creating
such Pods.
While the underlying Kubernetes cluster is configured to use Ceph CSI
for providing persistent storage for container workloads, for some
types of workloads such networked storage is suboptimal due to latency.
This is why the separate local-volume-provisioner CSI is
deployed and configured as an additional storage class.
Local Volume Provisioner is deployed as kind:DaemonSet.
Database
A single WSREP (Galera) cluster of MariaDB is deployed as the SQL
database to be used by all OpenStack services. It uses the storage class
provided by Local Volume Provisioner to store the actual database files.
The service is deployed as kind:StatefulSet of a given size, which
is no less than 3, on any openstack-control-plane node. For details,
see OpenStack database architecture.
Messaging
RabbitMQ is used as a messaging bus between the components of the
OpenStack services.
A separate instance of RabbitMQ is deployed for each OpenStack service
that needs a messaging bus for intercommunication between its
components.
An additional, separate RabbitMQ instance is deployed to serve as
a notification messages bus for OpenStack services to post their own
and listen to notifications from other services.
StackLight also uses this message bus to collect notifications for
monitoring purposes.
Each RabbitMQ instance is a single node and is deployed as
kind:StatefulSet.
Caching
A single multi-instance Memcached service is deployed to be used
by all OpenStack services that need caching, which are mostly HTTP API
services.
Coordination
A separate instance of etcd is deployed to be used by Cinder,
which requires Distributed Lock Management for coordination between its
components.
Ingress
Is deployed as kind:DaemonSet.
Image pre-caching
A special kind:DaemonSet is deployed and updated each time the
kind:OpenStackDeployment resource is created or updated.
Its purpose is to pre-cache container images on Kubernetes nodes, and
thus, to minimize possible downtime when updating container images.
This is especially useful for containers used in kind:DaemonSet
resources, as during the image update Kubernetes starts to pull the
new image only after the container with the old image is shut down.
keystoneclient - a separate kind:Deployment with a pod that
has the OpenStack CLI client as well as relevant plugins installed,
and OpenStack admin credentials mounted. It can be used by
administrators to manually interact with OpenStack APIs from within the
cluster.
Image (Glance)
Supported backend is RBD (Ceph is required).
Volume (Cinder)
Supported backend is RBD (Ceph is required).
Network (Neutron)
Supported backends are Open vSwitch, Open Virtual Network, and Tungsten
Fabric.
Placement
Compute (Nova)
Supported hypervisor is Qemu/KVM through libvirt library.
Dashboard (Horizon)
DNS (Designate)
Supported backend is PowerDNS.
Load Balancer (Octavia)
Ceph Object Gateway (SWIFT)
Provides the object storage and a Ceph Object Gateway Swift API that is
compatible with the OpenStack Swift API. You can manually enable the
service in the OpenStackDeployment CR as described in
Deploy an OpenStack cluster.
Instance HA (Masakari)
An OpenStack service that ensures high availability of instances running
on a host. You can manually enable Masakari in the
OpenStackDeployment CR as described in Deploy an OpenStack cluster.
Orchestration (Heat)
Key Manager (Barbican)
The supported backends include:
The built-in Simple Crypto, which is used by default
Vault
Vault by HashiCorp is a third-party system and is not
installed by MOSK. Hence,
the Vault storage backend should be
available elsewhere on the user environment and accessible from
the MOSK deployment.
If the Vault backend is used, you can configure Vault in the
OpenStackDeployment CR as described in
Deploy an OpenStack cluster.
Tempest
Runs tests against a deployed OpenStack cloud. You can manually enable
Tempest in the OpenStackDeployment CR as described in
Deploy an OpenStack cluster.
Shared Filesystems (OpenStack Manila)
Provides Shared Filesystems as a service that enables you to create and
manage shared filesystems in multi-project cloud environments.
For details, refer to Shared Filesystems service.
A complete setup of a MariaDB Galera cluster for OpenStack is illustrated
in the following image:
MariaDB server pods run a Galera multi-master cluster. Client
requests are forwarded by the Kubernetes mariadb service to the
mariadb-server pod that has the primary label. Other pods from
the mariadb-server StatefulSet have the backup label. Labels are
managed by the mariadb-controller pod.
The MariaDB Controller periodically checks the readiness of the
mariadb-server pods and sets the primary label to a pod if the following
requirements are met:
The primary label has not already been set on the pod.
The pod is in the ready state.
The pod is not being terminated.
The pod name has the lowest integer suffix among other ready pods in
the StatefulSet. For example, between mariadb-server-1 and
mariadb-server-2, the pod with the mariadb-server-1 name is
preferred.
Otherwise, the MariaDB Controller sets the backup label. This means that
all SQL requests are passed only to one node while the other two nodes are in
the backup state and replicate the state from the primary node.
MariaDB clients connect to the mariadb service.
The OpenStack Controller (Rockoon) runs in a set of containers in a pod in
Kubernetes. Rockoon is deployed as a Deployment with 1 replica only.
The failover is provided by Kubernetes that automatically restarts the
failed containers in a pod.
However, given the recommendation to use a separate Kubernetes cluster
for each OpenStack deployment, the controller, in its envisioned mode of
operation and deployment, will only manage a single OpenStackDeployment
resource, making proper HA much less of an issue.
Rockoon is written in Python using Kopf, as a Python framework to build
Kubernetes operators, and Pykube, as a Kubernetes API client.
Using Kubernetes API, the controller subscribes to changes to resources of
kind:OpenStackDeployment, and then reacts to these changes by creating,
updating, or deleting appropriate resources in Kubernetes.
The basic child resources managed by the controller are Helm releases.
They are rendered from templates taking into account
an appropriate values set from the main and features fields in the
OpenStackDeployment resource.
Then, the common fields are merged to resulting data structures.
Lastly, the services fields are merged providing the final and precise override
for any value in any Helm release to be deployed or upgraded.
The constructed values are then used by Rockoon during a Helm release
installation.
The core container that handles changes in the osdpl object.
helmbundle
The container that watches the helmbundle objects
and reports their statuses to the osdpl object in
status:children. See OpenStackDeploymentStatus custom resource for details.
health
The container that watches all Kubernetes native
resources, such as Deployments, Daemonsets, Statefulsets,
and reports their statuses to the osdpl object in
status:health. See OpenStackDeploymentStatus custom resource for details.
secrets
The container that provides data exchange between different
components such as Ceph.
The CustomResourceDefinition resource in Kubernetes uses the
OpenAPI Specification version 2 to specify the schema of the resource
defined. The Kubernetes API outright rejects the resources that do not
pass this schema validation.
The language of the schema, however, is not expressive enough to define a
specific validation logic that may be needed for a given resource. For this
purpose, Kubernetes enables the extension of its API with
Dynamic Admission Control.
For the OpenStackDeployment (OsDpl) CR the ValidatingAdmissionWebhook
is a natural choice. It is deployed as part of OpenStack Controller (Rockoon)
by default and performs specific extended validations when an OsDpl CR is
created or updated.
The inexhaustive list of additional validations includes:
Deny the OpenStack version downgrade
Deny the OpenStack version skip-level upgrade
Deny the OpenStack master version deployment
Deny upgrade to the OpenStack master version
Deny upgrade if any part of an OsDpl CR specification
changes along with the OpenStack version
Under specific circumstances, it may be viable to disable the Admission
Controller, for example, when you attempt to deploy or upgrade to the master
version of OpenStack.
Warning
Mirantis does not support MOSK deployments
performed without the OpenStackDeployment Admission Controller enabled.
Disabling of the OpenStackDeployment Admission Controller is only
allowed in staging non-production environments.
To disable the Admission Controller, ensure that the following structures and
values are present in the rockoon HelmBundle resource:
The OpenStack Exporter collects metrics from the OpenStack services and exposes
them to Prometheus for integration with StackLight. The Exporter interacts with
the REST APIs of various OpenStack services to gather data about the
infrastructure state and performance for visualization, alerting, and analysis
within the monitoring system.
To retrieve metrics from the OpenStack Exporter:
Locate the Exporter pod. The OpenStack Exporter runs in the osh-system
namespace:
kubectl -n osh-system get pods | grep exporter
Query the metrics by executing the curl request inside the exporter
container:
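For example, a sketch of such a request; the metrics port and the container layout of the exporter pod are assumptions to verify in your deployment:

kubectl -n osh-system exec -it <exporter-pod-name> -- curl -s http://localhost:<metrics-port>/metrics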
MOSK provides configuration capabilities through a
number of custom resources. This section provides a detailed
overview of these custom resources and their possible configuration.
The OpenStackDeployment custom resource enables you to securely store
sensitive fields in Kubernetes secrets. To do that, verify that the
reference secret is present in the same namespace as the
OpenStackDeployment object and the
openstack.lcm.mirantis.com/osdpl_secret label is set to true.
The list of fields that can be hidden from OpenStackDeployment is limited
and defined by the OpenStackDeployment schema.
For example, to hide spec:features:ssl:public_endpoints:api_cert, use the
following structure:
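A sketch of such a structure, assuming a secret reference in the value_from:secret_key_ref form; the secret name and key are illustrative:

spec:
  features:
    ssl:
      public_endpoints:
        api_cert:
          value_from:
            secret_key_ref:
              name: osdpl-ssl-certs   # illustrative secret name
              key: api_cert           # illustrative key within the secret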
Main elements of OpenStackDeployment custom resource
Element
Sub-element
Description
apiVersion
n/a
Specifies the version of the Kubernetes API that is used to create
this object
kind
n/a
Specifies the kind of the object
metadata
name
Specifies the name of metadata. Should be set in compliance with the
Kubernetes resource naming limitations
namespace
Specifies the metadata namespace. While technically it is possible to
deploy OpenStack on top of Kubernetes in other than openstack
namespace, such configuration is not included in the
MOSK system integration test plans. Therefore,
Mirantis does not recommend such scenario.
Warning
Both OpenStack and Kubernetes platforms provide resources
to applications. When OpenStack is running on top of Kubernetes,
Kubernetes is completely unaware of OpenStack-native workloads,
such as virtual machines, for example.
For better results and stability, Mirantis recommends using a
dedicated Kubernetes cluster for OpenStack, so that OpenStack and
auxiliary services, Ceph, and StackLight are the only Kubernetes
applications running in the cluster.
spec
openstack_version
Specifies the OpenStack release to deploy
preset
String that specifies the name of the preset, a predefined
configuration for the OpenStack cluster. A preset includes:
A set of enabled services that includes virtualization, bare
metal management, secret management, and others
Major features provided by the services, such as VXLAN encapsulation
of the tenant traffic
Integration of services
Every supported deployment profile incorporates an OpenStack preset.
Refer to Deployment profiles for the list of possible values.
size
String that specifies the size category for the OpenStack cluster.
The size category defines the internal configuration of the cluster,
such as the number of replicas for service workers, timeouts, and so on.
The list of supported sizes includes:
tiny - for approximately 10 OpenStack compute nodes
small - for approximately 50 OpenStack compute nodes
medium - for approximately 100 OpenStack compute nodes
public_domain_name
Specifies the public DNS name for OpenStack services. This is a base
DNS name that must be accessible and resolvable by API clients of your
OpenStack cloud. It will be present in the OpenStack endpoints as
presented by the OpenStack Identity service catalog.
The TLS certificates used by the OpenStack services (see below) must
also be issued to this DNS name.
persistent_volume_storage_class
Specifies the Kubernetes storage class name used for services to create
persistent volumes. For example, backups of MariaDB. If not specified,
the storage class marked as default will be used.
features
Contains the top-level collections of settings for the OpenStack
deployment that potentially target several OpenStack services. The
section where the customizations should take place.
The features:services element contains a list of extra OpenStack
services to deploy. Extra OpenStack services are services that are not
included in the preset.
The list of services available for configuration includes Cinder, Nova,
Designate, Keystone, Glance, Neutron, Heat, Octavia, Barbican, Placement,
Ironic, Aodh, Gnocchi, and Masakari.
Mirantis is not responsible for cloud operability in case
of default policies modifications but provides API to pass the required
configuration to the core OpenStack services.
Enables a tested set of policies that limits the global admin role to
only the user with the admin role in the admin project or the user with the
service role. The latter should be used only for service users utilized
for communication between OpenStack services.
A low-level section that defines values that will be passed to all
OpenStack (spec:common:openstack) or auxiliary
(spec:common:infra) services Helm charts.
A section of the lowest level, enables the definition of
specific values to pass to specific Helm charts on a one-by-one basis:
Warning
Mirantis does not recommend changing the default settings for
spec:artifacts, spec:common, and spec:services elements.
Customizations can compromise the OpenStack deployment update and upgrade
processes.
However, you may need to edit the spec:services section to limit
hardware resources in case of a hyperconverged architecture as described in
Limit HW resources for hyperconverged OpenStack compute nodes.
Specifies the standard logging levels for OpenStack services that
include the following, at increasing severity: TRACE, DEBUG,
INFO, AUDIT, WARNING, ERROR, and CRITICAL.
Depending on the use case, you may need to configure the same application
components differently on different hosts. MOSK enables
you to easily perform the required configuration through node-specific
overrides at the OpenStack Controller side.
The limitation of using the node-specific overrides is that they override
only the configuration settings, while other components, such as startup
scripts, should be reconfigured separately.
Caution
The overrides have been implemented in a similar way to the
OpenStack node and node label specific DaemonSet configurations.
However, the OpenStack Controller node-specific settings conflict
with the upstream OpenStack node and node label specific DaemonSet
configurations. Therefore, Mirantis does not recommend configuring node and
node label overrides.
The list of allowed node labels is located in the Cluster object status
providerStatus.releaseRef.current.allowedNodeLabels field.
If the value field is not defined in allowedNodeLabels, a label can
have any value.
Before or after a machine deployment, add the required label from the allowed
node labels list with the corresponding value to
spec.providerSpec.value.nodeLabels in machine.yaml. For example:
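A sketch of such an entry, assuming the key/value list format for nodeLabels and using the openstack-compute-node label mentioned later in this section:

spec:
  providerSpec:
    value:
      nodeLabels:
        - key: openstack-compute-node
          value: enabled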
The addition of a node label that is not available in the list of allowed node
labels is restricted.
The node-specific settings are activated through the spec:nodes
section of the OsDpl CR. The spec:nodes section contains the following
subsections:
features - implements overrides for a limited subset of fields and is
constructed similarly to spec:features
services - similarly to spec:services, enables you to override
settings in general for the components running as DaemonSets.
Example configuration:
spec:
  nodes:
    <NODE-LABEL>::<NODE-LABEL-VALUE>:
      features:
        # Detailed information about features might be found at
        # openstack_controller/admission/validators/nodes/schema.yaml
      services:
        <service>:
          <chart>:
            <chart_daemonset_name>:
              values:
                # Any value from specific helm chart
The resource of kind OpenStackDeploymentStatus is a custom resource that
describes the status of an OpenStack deployment. To obtain detailed information
about the schema of an OpenStackDeploymentStatus custom resource:
OPENSTACKVERSION displays the actual OpenStack version of the
deployment
CONTROLLERVERSION indicates the version of the OpenStack Controller
(Rockoon) responsible for the deployment
STATE reflects the current status of life-cycle management. The list
of possible values includes:
APPLYING indicates that some Kubernetes objects for applications
are in the process of being applied
APPLIED indicates that all Kubernetes objects for applications
have been applied to the latest state
LCMPROGRESS reflects the current progress of STATE in the format
X/Y, where X denotes the number of applications with Kubernetes objects
applied and in the actual state, and Y represents the total number of
applications managed by the OpenStack Controller (Rockoon)
HEALTH provides an overview of the current health status of the
OpenStack deployment in the format X/Y, where X represents the number
of applications with notReady pods, and Y is the total number of
applications managed by the OpenStack Controller (Rockoon)
MOSKRELEASE displays the current product release of the OpenStack
deployment
The services subsection provides detailed information about the LCM
operations performed on a specific service. This is a dictionary where keys
are service names, for example, baremetal or compute, and values are
dictionaries with the following items.
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, all openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality, but users should apply the new naming to
pods and other related artifacts accordingly.
The OpenStack Controller (Rockoon) enables you to modify its configuration at
runtime without restarting. MOSK stores the controller
configuration in the rockoon-config ConfigMap in the osh-system
namespace of your cluster.
To retrieve the Rockoon configuration ConfigMap, run:
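For example, using the ConfigMap name and namespace mentioned above:

kubectl -n osh-system get configmap rockoon-config -o yaml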
The number of seconds to wait for all application components
to become ready.
wait_application_ready_delay
10
The number of seconds before going to the sleep mode between attempts
to verify if the application is ready.
node_not_ready_flapping_timeout
120
The amount of time to wait for the flapping node.
[helmbundle]
manifest_enable_timeout
600
The number of seconds to wait until the values set in the manifest
are propagated to the dependent objects.
manifest_enable_delay
10
The number of seconds between attempts to verify if the values
were applied.
manifest_disable_timeout
600
The number of seconds to wait until the values are removed from
the manifest and propagated to the child objects.
manifest_disable_delay
10
The number of seconds between attempts to verify if the values were
removed from the release.
manifest_purge_timeout
600
The number of seconds to wait until the Kubernetes object is removed.
manifest_purge_delay
10
The number of seconds between attempts to verify if the Kubernetes
object is removed.
manifest_apply_delay
10
The number of seconds to pause for the Helm bundle changes.
[maintenance]
instance_migrate_concurrency
1
The number of instances to migrate concurrently.
nwl_parallel_max_compute
30
The maximum number of compute nodes allowed for a parallel update.
nwl_parallel_max_gateway
1
The maximum number of gateway nodes allowed for a parallel update.
respect_nova_az
true
Respect Nova availability zone (AZ). The true value allows
the parallel update only for the compute nodes in the same AZ.
ndr_skip_instance_check
false
The flag to skip the instance verification on a host before proceeding
with the node removal. The false value blocks the node removal
until at least one instance exists on the host.
ndr_skip_volume_check
false
The flag to skip the volume verification on a host before proceeding
with the node removal. The false value blocks the node removal
until at least one volume exists on the host. A volume is tied to
a specific host only for the LVM backend.
The OpenStack Controller enables you to use customized images in your OpenStack
deployments. To start using such images, create a ConfigMap in the
openstack namespace with the following content, replacing
<OPENSTACKDEPLOYMENT-NAME> with the name of your OpenStackDeployment
custom resource:
MOSK relies on the MariaDB Galera cluster to provide
its OpenStack components with a reliable storage of persistent data.
For successful long-term operations of a MOSK cloud, it
is crucial to ensure the healthy state of the OpenStack database as well as the
safety of the data stored in it. To help you with that, MOSK
provides built-in automated procedures for OpenStack database maintenance,
backup, and restoration. The hereby chapter describes the internal mechanisms
and configuration details for the provided tools.
Overview of the OpenStack database backup and restoration
MOSK relies on the MariaDB Galera cluster to provide
its OpenStack components with a reliable storage for persistent data.
Mirantis recommends backing up your OpenStack databases daily to ensure
the safety of your cloud data. Also, you should always create an instant
backup before updating your cloud or performing any kind of potentially
disruptive experiment.
MOSK has a built-in automated backup routine that can be
triggered manually or by schedule. For detailed information about the process
of MariaDB Galera cluster backup, refer to Workflows of the OpenStack database backup and restoration.
Backup and restoration can only be performed against the OpenStack database
as a whole. Granular per-service or per-table procedures are not supported
by MOSK.
By default, periodic backups are turned off. However, a cloud operator can
easily enable this capability by adding the following structure to the
OpenStackDeployment custom resource:
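A minimal sketch of such a structure, assuming the backup settings live under spec:features:database:backup:

spec:
  features:
    database:
      backup:
        enabled: true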
By default, MOSK backup routine stores the OpenStack
database data into the Mirantis Ceph cluster, which is a part of the same
cloud. This is sufficient for the vast majority of clouds. However, you may
want to have the backup data stored off the cloud to comply with specific
enterprise practices for infrastructure recovery and data safety.
The size of a backup storage volume depends directly on the size of the
MOSK cluster, which can be determined through the
size parameter in the OpenStackDeployment CR.
The list of the recommended sizes for a minimal backup volume includes:
20 GB for the tiny cluster size
40 GB for the small cluster size
80 GB for the medium cluster size
If required, you can change the default size of a database backup volume.
However, make sure that you configure the volume size before OpenStack
deployment is complete. This is because there is no automatic way to
resize the backup volume once the cloud is deployed. Also, only the local
backup storage (Ceph) supports the configuration of the volume size.
To change the default size of the backup volume, use the following structure
in the OpenStackDeployment CR:
To store the backup data to a local Mirantis Ceph, the MOSK
underlying Kubernetes cluster needs to have a preconfigured storage class for
Kubernetes persistent volumes with the Ceph cluster as a storage backend.
When restoring the OpenStack database from a local Ceph storage, the cron job
restores the state on each MariaDB node sequentially. It is not possible to
perform parallel restoration because Ceph Kubernetes volumes do not support
concurrent mounting from multiple places.
MOSK provides you with a capability to store the OpenStack
database data outside of the cloud, on an external storage device that supports
common data access protocols, such as third-party NAS appliances.
Security compliance may require storing backups of databases in an encrypted
format. MOSK enables encryption of database backups, both
local and remote, using the OpenSSL aes-256-cbc encryption.
To encrypt database backups, add the following configuration to the
OpenStackDeployment custom resource:
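A minimal sketch, assuming an encryption subsection under the backup settings (the exact field names may differ in your product version):
spec:
  features:
    database:
      backup:
        encryption:
          enabled: true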
Workflows of the OpenStack database backup and restoration¶
This section provides technical details about the internal implementation
of automated backup and restoration routines built into
MOSK. The information below is helpful for troubleshooting
issues related to the process and for understanding the
impact these procedures have on a running cloud.
If enabled, synchronizing the local backup storage with the
remote S3 storage
During the first backup phase, the following actions take place:
The mariadb-phy-backup pod starts on the node where the
mariadb-server replica with the highest number in its name runs.
For example, if the MariaDB server pods are named mariadb-server-0,
mariadb-server-1, and mariadb-server-2, the
mariadb-phy-backup pod starts on the same node as
mariadb-server-2.
The backup process verifies the hash sums of existing backup files
based on ConfigMap information:
If the verification fails and synchronization with the remote
S3 storage is enabled, the process checks the hash sums of
remote backups as well. If the remote backups are valid, they
are downloaded.
If the hash sums are incorrect for both local and remote backups,
the backup job fails.
If no ConfigMap exists, these hash sum checks are skipped.
Sanity check: verification of the Kubernetes status and wsrep status of
each MariaDB pod. If some pods have wrong statuses, the backup job
fails unless the --allow-unsafe-backup parameter is passed to
the main script in the Kubernetes backup job.
Note
Since MOSK 22.4, the --allow-unsafe-backup
functionality is removed from the product for security and backup
procedure simplification purposes.
Mirantis does not recommend setting the --allow-unsafe-backup
parameter unless it is absolutely required. To ensure the consistency
of a backup, verify that the MariaDB Galera cluster is in a working
state before you proceed with the backup.
Desynchronize the replica from the Galera cluster. The script connects to
the target replica and sets the wsrep_desync variable to ON.
Then, the replica stops receiving write-sets and receives the wsrep
status Donor/Desynced. The Kubernetes health check of that
mariadb-server pod fails and the Kubernetes status of that pod
becomes NotReady. If the pod has the primary label, the MariaDB
Controller sets the backup label to it and the pod is removed from
the endpoints list of the MariaDB service.
Verify that there is enough space in the /var/backup folder to
perform the backup. The amount of available space in the folder
should exceed <DB-SIZE>*<MARIADB-BACKUP-REQUIRED-SPACE-RATIO>
in KB.
The mariadb-phy-backup pod performs the backup using the
mariabackup tool.
The script puts the backed up replica back to sync with the Galera
cluster by setting wsrep_desync to OFF and waits for
the replica to become Ready in Kubernetes.
The script calculates hash sums for backup files and stores them in a
special ConfigMap.
If the number of existing backups exceeds the value of the
MARIADB_BACKUPS_TO_KEEP job parameter, the script removes
the oldest backups to maintain the allowed limit.
If enabled, the script synchronizes the local backup storage with the
remote S3 storage.
The mariadb-phy-restore job launches the mariadb-phy-restore pod.
This pod starts with the mariadb-server PVC with the highest number
in its name. This PVC is mounted to the /var/lib/mysql folder and the
backup PVC (or local filesystem if the hostpath backend is configured)
is mounted to /var/backup.
The mariadb-phy-restore pod contains the main restore script, which is
responsible for:
Scaling the mariadb-server StatefulSet
Verifying the statuses of mariadb-server pods
Managing the openstack-mariadb-phy-restore-runner pods
During the restoration, the database is not available to
OpenStack services, which means a complete outage of all OpenStack
services.
During the first phase, the following actions take place:
The restoration process verifies the hash sums of existing backup files
based on ConfigMap information:
If the verification fails and synchronization with the remote
S3 storage is enabled, the process checks the hash sums of
remote backups as well. If the remote backups are valid, they
are downloaded.
If the hash sums are incorrect for both local and remote backups,
the backup job fails.
Save the list of mariadb-server persistent volume claims (PVC).
Scale the mariadb-server StatefulSet to 0 replicas.
At this point, the database becomes unavailable for OpenStack services.
By design, when deleting a cloud resource, for example, an instance, volume,
or router, an OpenStack service does not immediately delete its data but
marks it as removed so that it can later be picked up by the garbage
collector.
Given that an OpenStack resource is often represented by more than one record
in the database, deletion of all of them right away could affect the overall
responsiveness of the cloud API. On the other hand, an OpenStack database
being severely clogged with stale data is one of the most typical reasons for
the cloud slowness.
To keep the OpenStack database small and fast,
MOSK is pre-configured to automatically clean up the removed
database records older than 30 days. By default, the cleanup is performed for
the following MOSK services every Monday according to the
schedule:
The default database cleanup schedule by OpenStack service¶
Service                             Service identifier    Cleanup time
Block Storage (OpenStack Cinder)    cinder                12:01 a.m.
Compute (OpenStack Nova)            nova                  01:01 a.m.
Image (OpenStack Glance)            glance                02:01 a.m.
Instance HA (OpenStack Masakari)    masakari              03:01 a.m.
Key Manager (OpenStack Barbican)    barbican              04:01 a.m.
Orchestration (OpenStack Heat)      heat                  05:01 a.m.
If required, you can adjust the cleanup schedule for the OpenStack database by
adding the features:database:cleanup setting to the OpenStackDeployment
CR following the example below. The schedule parameter must contain a
valid cron expression. The age parameter specifies the number of days after
which a stale record gets cleaned up.
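For example, a sketch that cleans up stale Compute service records older than 30 days every Monday at 01:01 a.m.; the per-service nesting is an assumption based on the parameters described above:
spec:
  features:
    database:
      cleanup:
        nova:
          enabled: true
          schedule: "1 1 * * 1"
          age: 30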
MOSK uses the Mariabackup utility to back up the MariaDB
Galera cluster data where the OpenStack data is stored. Mariabackup is
launched periodically as part of a Kubernetes CronJob, which is included
in any MOSK deployment and is suspended by default.
Note
If you are using the default backend to store the backup data,
which is Ceph, you can increase the default size of a backup volume.
However, make sure to configure the volume size before you deploy
OpenStack.
MOSK enables you to configure the periodic backup of the
OpenStack database through the OpenStackDeployment object. To enable the
backup, use the following structure:
spec:
  features:
    database:
      backup:
        enabled: true
TechPreview
To enhance cloud security, you can enable encryption of OpenStack
database backups using the OpenSSL aes-256-cbc encryption
through the OpenStackDeployment custom resource. Refer to
Backup encryption for configuration details.
By default, the backup job:
Runs backup on a daily basis at 01:00 AM
Creates incremental backups daily and full backups weekly
Keeps 10 latest full backups
Stores backups in the mariadb-phy-backup-data PVC
Has the backup timeout of 3600 seconds
Has the incremental backup type
To verify the configuration of the mariadb-phy-backup CronJob
object, run:
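For example, assuming the default openstack namespace used throughout this guide:
kubectl -n openstack get cronjob mariadb-phy-backup -o yaml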
Type of a backup. The list of possible values includes:
incremental
If the newest full backup is older than the value of
the full_backup_cycle parameter, the system performs a full
backup. Otherwise, the system performs an incremental backup of
the newest full backup.
Number of seconds that defines a period between 2 full backups.
During this period, incremental backups are performed. The parameter
is taken into account only if backup_type is set to
incremental. Otherwise, it is ignored.
For example, with full_backup_cycle set to 604800 seconds,
a full backup is taken weekly and, if the cron schedule is set to 0 0 * * *,
an incremental backup is performed on a daily basis.
Multiplier for the database size to predict the space required to
create a backup, either full or incremental, and perform a
restoration keeping the uncompressed backup files on the same file
system as the compressed ones.
To estimate the size of MARIADB_BACKUP_REQUIRED_SPACE_RATIO, use
the following formula: size of (1 uncompressed full backup + all
related incremental uncompressed backups + 1 full compressed backup)
in KB <= (DB_SIZE * MARIADB_BACKUP_REQUIRED_SPACE_RATIO) in
KB.
The DB_SIZE is the disk space allocated in the MySQL data
directory, which is /var/lib/mysql, for databases data excluding
galera.cache and ib_logfile* files. This parameter prevents
the backup PVC from being full in the middle of the restoration and
backup procedures. If the current available space is lower than
DB_SIZE * MARIADB_BACKUP_REQUIRED_SPACE_RATIO, the backup
script fails before the system starts the actual backup and the
overall status of the backup job is failed.
For example, to perform full backups monthly and incremental backups
daily at 02:30 AM and keep the backups for the last six months,
configure the database backup in your OpenStackDeployment object
as follows:
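A possible sketch of such a configuration is provided below; the parameter names correspond to the options described above but should be verified against your product version:
spec:
  features:
    database:
      backup:
        enabled: true
        backup_type: incremental
        full_backup_cycle: 2592000   # 30 days between full backups
        schedule_time: "30 2 * * *"  # daily at 02:30 AM
        backups_to_keep: 6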
By default, MOSK stores the OpenStack database backups
locally in the Mirantis Ceph cluster, which is a part of the same cloud.
Alternatively, MOSK provides you with a capability to create
remote backups using an external storage. This section contains configuration
details for a remote backend to be used for the OpenStack data backup.
In general, the built-in automated backup routine saves the data to the
mariadb-phy-backup-data PersistentVolumeClaim (PVC), which is provisioned
from the StorageClass specified in the spec.persistent_volume_storage_class
parameter of the OpenstackDeployment custom resource (CR).
Remote NFS storage for OpenStack database backups¶
A preconfigured NFS server with an NFS share that a Unix backup and
restore user has access to. By default, it is the same user that runs
the MySQL server in a MariaDB image.
Removal of the NFS persistent volume does not automatically remove the data.
No validation of mount options. If mount options are specified incorrectly in
the OpenStackDeployment CR, the mount command fails upon the
creation of a backup runner pod.
To enhance cloud security, you can enable encryption of OpenStack
database backups using the OpenSSL aes-256-cbc encryption
through the OpenStackDeployment custom resource. Refer to
Backup encryption for configuration details.
Optionally, MOSK enables you to set the required mount
options for the NFS mount command. You can set as many options
of mount as you need. For example:
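A minimal sketch, assuming the NFS backend and its mount options are configured under the backup section; the exact field names are assumptions and may differ in your product version:
spec:
  features:
    database:
      backup:
        enabled: true
        backend: pv_nfs
        pv_nfs:
          server: <IP-OR-FQDN-OF-NFS-SERVER>
          path: <PATH-TO-NFS-SHARE>
        # assumed location for the mount options of the NFS mount command
        mount_options:
          - "nfsvers=4"
          - "timeo=900"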
Synchronization of local MariaDB backups with a remote S3 storage¶
Available since MOSK 25.1TechPreview
MOSK provides the capability to synchronize local MariaDB
backups with a remote S3 storage. Distributing backups across multiple
locations increases their safety. Optionally, backup archives stored in S3
can be encrypted on the server side.
To enable synchronization, you need to have a preconfigured S3 storage and
a user account for access.
Enable synchronization by adding the following structure to the
OpenStackDeployment custom resource. For example, to use Ceph RadosGW
as the S3 storage provider and enable server-side encryption for stored
archives:
spec:
  features:
    database:
      backup:
        enabled: true
        sync_remote:
          enabled: true
          remotes:
            <REMOTE-NAME>:
              conf:
                type: s3
                provider: Ceph
                endpoint: <URL-TO-S3-STORAGE>
                path: <BUCKET-NAME-FOR-BACKUPS-ON-S3-STORAGE>
                server_side_encryption: "aws:kms"
                access_key_id:
                  value_from:
                    secret_key_ref:
                      key: access_key
                      name: mariadb-backup-s3-hidden
                secret_access_key:
                  value_from:
                    secret_key_ref:
                      key: secret_key
                      name: mariadb-backup-s3-hidden
                sse_kms_key_id:
                  value_from:
                    secret_key_ref:
                      key: sse_kms_key_id
                      name: mariadb-backup-s3-hidden
Alternatively, you can set the provider parameter to AWS
if you prefer using AWS as a provider for S3 storage and omit the
server_side_encryption and sse_kms_key_id parameters if
encryption is not required.
The internal components of Mirantis OpenStack for Kubernetes (MOSK)
coordinate their operations and exchange status information using the
cluster’s message bus (RabbitMQ).
MOSK enables you to configure OpenStack services to emit
notification messages to the MOSK cluster messaging bus
(RabbitMQ) every time an OpenStack resource, for example, an instance, image,
and so on, changes its state due to a cloud user action or through its
lifecycle. For example, MOSK Compute service (OpenStack
Nova) can publish the instance.create.end notification once a newly created
instance is up and running.
Note
In certain cases, RabbitMQ notifications may prove unreliable, such
as when the RabbitMQ server undergoes a restart or when communication
between the server and the client reading the notifications breaks down.
To optimize reliability, Mirantis suggests using multiple channels to store
notification events, encompassing:
OpenStack notification messages can be consumed and processed by various
corporate systems to integrate MOSK clouds into the
company infrastructure and business processes.
The list of the most common use cases includes:
Using notification history for retrospective security audit
Using the real-time aggregation of notification messages to gather
statistics on cloud resource consumption for further capacity planning
Cloud billing considerations
Notifications alone should not be considered a source of data for any
kind of financial reporting. The delivery of the messages cannot be
guaranteed due to various technical reasons. For example, messages can
be lost if an external consumer is not fetching them from the queue fast
enough.
Mirantis strongly recommends that your cloud billing solutions rely on the
combination of the following data sources:
Periodic polling of the OpenStack API as a reliable source of information
about allocated resources
Subscription to notifications to receive timely updates about the resource
status change
A cloud administrator can securely expose part of a MOSK
cluster message bus to the outside world. This enables an external consumer
to subscribe to the notification messages emitted by the cluster services.
Important
The latest OpenStack release available in MOSK supports
notifications from the following services:
Block storage (OpenStack Cinder)
DNS (OpenStack Designate)
Image (OpenStack Glance)
Orchestration (OpenStack Heat)
Bare Metal (OpenStack Ironic)
Identity (OpenStack Keystone)
Shared Filesystems (OpenStack Manila)
Instance High Availability (OpenStack Masakari)
Networking (OpenStack Neutron)
Compute (OpenStack Nova)
To enable the external notification endpoint, add the following structure
to the OpenStackDeployment custom resource. For example:
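A minimal sketch, assuming the external notifications are configured under the messaging feature section; the topic name is illustrative:
spec:
  features:
    messaging:
      notifications:
        external:
          enabled: true
          topics:
            - external-consumer-a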
For each topic name specified in the topics field, MOSK
creates a topic exchange in its RabbitMQ cluster together with a set of queues
bound to this topic. All enabled MOSK services will
publish their notification messages to all configured topics so that
multiple consumers can receive the same messages in parallel.
A topic name must follow the Kubernetes standard format for object names and
IDs, that is, contain only lowercase alphanumeric characters, hyphens (-), or
periods (.). The topic name notifications is reserved for internal use.
MOSK supports the connection to message bus (RabbitMQ)
through an encrypted or non-encrypted endpoint. Once connected, it supports
authentication through either a plain text user name and password or mutual
TLS authentication using encrypted X.509 client certificates.
Each topic exchange is protected by automatically generated authentication
credentials and certificates for secure connection that are stored as a secret
in the openstack-external namespace of a MOSK underlying
Kubernetes cluster. A secret is identified by the name of the topic. The list
of attributes for the secret object includes:
hosts
The IP addresses on which the external notification endpoint is available
port_amqp, port_amqp-tls
The TCP ports on which the external notification endpoint is available
vhost
The name of the RabbitMQ virtual host on which the topic queues are created
username, password
Authentication data
ca_cert
The client CA certificate
client_cert
The client certificate
client_key
The client private key
For the configuration example above, the following objects will be created:
Tungsten Fabric provides basic L2/L3 networking to an OpenStack environment
running on the MKE cluster and includes the IP address management, security
groups, floating IP addresses, and routing policies functionality.
Tungsten Fabric is based on overlay networking, where all virtual machines are
connected to a virtual network with encapsulation (MPLSoGRE, MPLSoUDP, VXLAN).
This enables you to separate the underlay Kubernetes management network. A
workload requires an external gateway, such as a hardware EdgeRouter or a
simple gateway to route the outgoing traffic.
The Tungsten Fabric vRouter uses different gateways for the control and data
planes.
All services of Tungsten Fabric are delivered as separate containers, which
are deployed by the Tungsten Fabric Operator (TFO). Each container has an
INI-based configuration file that is available on the host system. The
configuration file is generated automatically upon the container start and is
based on environment variables provided by the TFO through Kubernetes
ConfigMaps.
The main Tungsten Fabric containers run with the host network as
DaemonSets, without using the Kubernetes networking layer. The services
listen directly on the host network interface.
The following diagram describes the minimum production installation of
Tungsten Fabric with a Mirantis OpenStack for Kubernetes (MOSK)
deployment.
For the details about the Tungsten Fabric services included in
MOSK deployments and the types of traffic and traffic
flow directions, see the subsections below.
This section describes the Tungsten Fabric services and their distribution
across the Mirantis OpenStack for Kubernetes (MOSK) deployment.
The Tungsten Fabric services run mostly as DaemonSets in separate containers
for each service. The deployment and update processes are managed by the
Tungsten Fabric Operator. However, Kubernetes manages the probe checks and
restart of broken containers.
All configuration and control services run on the Tungsten Fabric Controller
nodes.
Service name
Service description
config-api
Exposes a REST-based interface for the Tungsten Fabric API.
config-provisioner
Provisions the node for execution of configuration services.
control
Communicates with the cluster gateways using BGP and with the vRouter
agents using XMPP, as well as redistributes appropriate networking
information.
control-provisioner
Provisions the node for execution of control services.
device-manager
Manages physical networking devices using netconf or ovsdb.
In multi-node deployments, it operates in the active-backup mode.
dns
Using the named service, provides the DNS service to the VMs spawned
on different compute nodes. Each vRouter node connects to two
Tungsten Fabric Controller containers that run the dns process.
named
The customized Berkeley Internet Name Domain (BIND) daemon of
Tungsten Fabric that manages DNS zones for the dns service.
schema
Listens to configuration changes performed by a user and generates
corresponding system configuration objects. In multi-node deployments,
it works in the active-backup mode.
svc-monitor
Listens to configuration changes of service-template and
service-instance, as well as spawns and monitors virtual machines
for the firewall, analyzer services, and so on. In multi-node
deployments, it works in the active-backup mode.
webui
Consists of the webserver and jobserver services. Provides
the Tungsten Fabric web UI.
All analytics services run on Tungsten Fabric analytics nodes.
Service name
Service description
alarm-gen
Evaluates and manages the alarms rules.
analytics-api
Provides a REST API to interact with the Cassandra analytics
database.
analytics-nodemgr
Collects all Tungsten Fabric analytics process data and sends
this information to the Tungsten Fabric collector.
analytics-database-nodemgr
Provisions the init model if needed. Collects data of the database
process and sends it to the Tungsten Fabric collector.
collector
Collects and analyzes data from all Tungsten Fabric services.
query-engine
Handles the queries to access data from the Cassandra database.
snmp-collector
Receives the authorization and configuration of the physical routers
from the config-nodemgr service, polls the physical routers using
the Simple Network Management Protocol (SNMP), and uploads the data to
the Tungsten Fabric collector.
topology
Reads the SNMP information from the physical router user-visible
entities (UVEs), creates a neighbor list, and writes the neighbor
information to the physical router UVEs. The Tungsten Fabric web UI uses
the neighbor list to display the physical topology.
The Tungsten Fabric vRouter provides data forwarding to an OpenStack tenant
instance and reports statistics to the Tungsten Fabric analytics service. The
Tungsten Fabric vRouter is installed on all OpenStack compute nodes.
Mirantis OpenStack for Kubernetes (MOSK) supports the kernel-based
deployment of the Tungsten Fabric vRouter.
vrouter-agent
Connects to the Tungsten Fabric Controller container and the Tungsten
Fabric DNS system using the Extensible Messaging and Presence Protocol
(XMPP). The vRouter Agent acts as a local control plane. Each Tungsten
Fabric vRouter Agent is connected to at least two Tungsten Fabric
controllers in an active-active redundancy mode.
The Tungsten Fabric vRouter Agent is responsible for all
networking-related functions including routing instances, routes,
and others.
The Tungsten Fabric vRouter uses different gateways for the control
and data planes. For example, the Linux system gateway is located
on the management network, and the Tungsten Fabric gateway is located
on the data plane network.
vrouter-provisioner
Provisions the node for the vRouter agent execution.
The following diagram illustrates the Tungsten Fabric kernel vRouter set up by
the TF operator:
On the diagram above, the following types of network interfaces are used:
eth0 - for the management (PXE) network (eth1 and eth2 are the
slave interfaces of Bond0)
cassandra
On the Tungsten Fabric control plane nodes, maintains the
configuration data of the Tungsten Fabric cluster.
On the Tungsten Fabric analytics nodes, stores the collector
service data.
cassandra-operator
The Kubernetes operator that enables the Cassandra clusters creation
and management.
kafka
Handles the messaging bus and generates alarms across the Tungsten
Fabric analytics containers.
kafka-operator
The Kubernetes operator that enables Kafka clusters creation and
management.
redis
Stores the physical router UVE storage and serves as a messaging bus
for event notifications.
redis-operator
The Kubernetes operator that enables Redis clusters creation and
management.
zookeeper
Holds the active-backup status for the device-manager,
svc-monitor, and the schema-transformer services. This service
is also used for mapping of the Tungsten Fabric resource names to
UUIDs.
zookeeper-operator
The Kubernetes operator that enables ZooKeeper clusters creation and
management.
rabbitmq
Exchanges messages between API servers and original request senders.
rabbitmq-operator
The Kubernetes operator that enables RabbitMQ clusters creation and
management.
Along with the Tungsten Fabric services, MOSK deploys and
updates special image precaching DaemonSets when the kind TFOperator
resource is created or image references in it get updated.
These DaemonSets precache container images on Kubernetes nodes minimizing
possible downtime when updating container images. The cloud operator can
disable image precaching through the TFOperator resource.
The following diagram illustrates all types of UI
and API traffic in a Mirantis OpenStack for Kubernetes
cluster, including the monitoring and OpenStack API traffic. The OpenStack
Dashboard pod hosts Horizon and acts as a proxy for all other types of
traffic. TLS termination is also performed for this type of traffic.
SDN or Tungsten Fabric traffic goes through the overlay Data network and
processes east-west and north-south traffic for applications that run in a
MOSK cluster. This network segment typically contains
tenant networks as separate MPLS-over-GRE and MPLS-over-UDP tunnels.
The traffic load depends on the workload.
The control traffic between the Tungsten Fabric controllers, edge routers, and
vRouters uses the XMPP with TLS and iBGP protocols. Both protocols produce low
traffic that does not affect MPLS over GRE and MPLS over UDP traffic.
However, this traffic is critical and must be reliably delivered. Mirantis
recommends configuring higher QoS for this type of traffic.
The following diagram displays both MPLS over GRE/MPLS over UDP and iBGP and
XMPP traffic examples in a MOSK cluster:
Mirantis OpenStack for Kubernetes (MOSK) provides the Tungsten Fabric
lifecycle management including pre-deployment custom configurations, updates,
data backup and restoration, as well as handling partial failure scenarios,
by means of the Tungsten Fabric operator.
This section is intended for the cloud operators who want to gain insight into
the capabilities provided by the Tungsten Fabric operator along with the
understanding of how its architecture allows for easy management while
addressing the concerns of users of Tungsten Fabric-based
MOSK clusters.
The Tungsten Fabric Operator (TFO) is based on the Kubernetes operator
SDK project. The Kubernetes operator SDK is a framework that uses the
controller-runtime library to make writing operators easier by providing
the following:
High-level APIs and abstractions to write the operational logic more
intuitively.
Tools for scaffolding and code generation to bootstrap a new project fast.
Extensions to cover common operator use cases.
The TFO deploys the following sub-operators. Each sub-operator handles a
separate part of a TF deployment:
Since MOSK 24.3, Provisioner is a separate
component for the vRouter, deployed as the tf-vrouter-provisioner
DaemonSet. The NodeManager service is no longer deployed in TF
setups.
Besides the sub-operators that deploy TF services, TFO uses operators to deploy
and maintain third-party services, such as different types of storage, cache,
message system, and so on. The following table describes all third-party
operators:
The resource of kind TFOperator is a custom resource defined by a resource
of kind CustomResourceDefinition.
The CustomResourceDefinition resource in Kubernetes uses the OpenAPI
Specification version 2 to specify the schema of the defined resource.
The Kubernetes API outright rejects the resources that do not pass this schema
validation. Along with schema validation, TFOperator uses
ValidatingAdmissionWebhook for extended validations when a custom resource
is created or updated.
Important
Since 24.1, MOSK introduces the technical
preview support for the API v2 for the Tungsten Fabric Operator. This
version of the Tungsten Fabric Operator API aligns with the OpenStack
Controller API and provides a better interface for advanced configurations.
Refer to Key differences between TFOperator API v1alpha1 and v2 for details.
Tungsten Fabric Operator uses ValidatingAdmissionWebhook to validate
environment variables set to Tungsten Fabric components upon the TFOperator
object creation or update. The following validations are performed:
Environment variables passed to the Tungsten Fabric components containers
Mapping between tfVersion and tfImageTag, if defined
Schedule for dbBackup
Data capacity format
Feature variable values
Availability of the dataStorageClass class
If required, you can disable ValidatingAdmissionWebhook through the
TFOperator HelmBundle resource:
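A possible sketch of such an override through the TFOperator HelmBundle values; the release name and the admission key are assumptions and may differ in your product version:
spec:
  releases:
    - name: tungstenfabric-operator
      values:
        admission:
          enabled: false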
Environment variables for Tungsten Fabric components¶
API v2 Available since MOSK 23.1
Warning
The features section of the TFOperator specification
allows for easy configuration of all Tungsten Fabric features. Mirantis
recommends using the features section instead of updating the environment
variables through envSettings directly.
Allowed environment variables for Tungsten Fabric components¶
Key differences between TFOperator API v1alpha1 and v2¶
This section outlines the main differences between the v1alpha1 and v2 versions
of the TFOperator API:
Introduction of the features section:
All non-default Tungsten Fabric and Tungsten Fabric Operator features
can now be set in the features section.
Setting environment variables is no longer necessary but can still be
done using the envSetting field in each Tungsten Fabric service
section.
Relocation of CustomSpec from the vRouter agent specification
to the nodes section.
Reorganization of the controllers section:
The controllers section has been integrated into the services
section.
The services section is now divided into groups: analytics,
config, control, vRouter, and webUI.
Configuration of third-party services can be performed through the
analytics or config sections.
Configuration of the logging levels can be performed using the logging
field, which is a separate field in each Tungsten Fabric services
configuration.
Movement of the dataStorageClass and tfVersion fields to the upper
level of the specification.
Introduction of the devOptions section enabling the setup of experimental
development-related options.
Mirantis OpenStack for Kubernetes (MOSK) allows you to easily adapt
your Tungsten Fabric deployment to the needs of your environment through the
TFOperator custom resource.
This section includes custom configuration details available to you.
Important
Since 24.1, MOSK introduces the technical
preview support for the API v2 for the Tungsten Fabric Operator. This
version of the Tungsten Fabric Operator API aligns with the OpenStack
Controller API and provides a better interface for advanced configurations.
In MOSK 24.1, the API v2 is available only for the
new product deployments with Tungsten Fabric.
Since 24.2, the API v2 becomes default for new product deployments and
includes the ability to convert existing v1alpha1 TFOperator to v2
during update.
During the update to the 24.3 series, the old Tungsten Fabric cluster
configuration API v1alpha1 is automatically converted and replaced
with the v2 version. Therefore, since MOSK 24.3,
start using the v2 TFOperator custom resource for any updates.
The v1alpha1 TFOperator custom resource remains in the cluster
but is no longer reconciled and will be automatically removed in
MOSK 25.1.
By default, Tungsten Fabric Operator sets up the following resource limits for
Cassandra analytics and configuration StatefulSets:
Limits:
  cpu: 8
  memory: 32Gi
Requests:
  cpu: 1
  memory: 16Gi
This is a verified configuration suitable for most cases. However, if nodes
are under a heavy load, the KubeContainerCPUThrottlingHigh StackLight alert
may raise for Tungsten Fabric Pods of the tf-cassandra-analytics and
tf-cassandra-config StatefulSets. If such alerts appear constantly, you can
increase the limits through the TFOperator custom resource. For example:
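A minimal sketch for the API v2, assuming the resources are set under the Cassandra analytics service; the exact nesting is an assumption and differs between API versions:
spec:
  services:
    analytics:
      cassandra:
        resources:
          limits:
            cpu: "12"
            memory: 48Gi
          requests:
            cpu: "1"
            memory: 16Gi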
To specify custom configurations for Cassandra clusters, use the
configOptions settings in the TFOperator custom resource.
For example, you may need to increase the file cache size in case
of a heavy load on the nodes labeled with tfanalyticsdb=enabled
or tfconfigdb=enabled:
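A minimal sketch, assuming configOptions accepts native Cassandra settings such as file_cache_size_in_mb; the nesting under the analytics service is an assumption:
spec:
  services:
    analytics:
      cassandra:
        configOptions:
          file_cache_size_in_mb: 1024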
Depending on the Tungsten Fabric Operator API version in use, proceed with
one of the following options:
API v2
Available since MOSK 24.1
To specify custom settings for the Tungsten Fabric vRouter nodes, for
example, to change the name of the tunnel network interface or enable
debug level logging on some subset of nodes, use the nodes settings in
the TFOperator custom resource.
For example, to enable debug level logging on a specific node or multiple
nodes:
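A minimal sketch of the nodes section; the node group name, label selector, and field names below the nodes key are assumptions intended only to illustrate the approach:
spec:
  nodes:
    vrouter-debug:                      # arbitrary node group name
      labels:
        - key: <NODE-LABEL-NAME>
          value: <NODE-LABEL-VALUE>
      vRouter:
        agent:
          envSettings:
            - name: LOG_LEVEL
              value: SYS_DEBUG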
To specify custom settings for the Tungsten Fabric vRouter nodes, for
example, to change the name of the tunnel network interface or enable
debug level logging on some subset of nodes, use the customSpecs
settings in the TFOperator custom resource.
For example, to enable debug level logging on a specific node or multiple
nodes:
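A minimal sketch of the customSpecs settings; the label selector and container name are illustrative, while the logging override to SYS_DEBUG matches the explanation that follows:
spec:
  controllers:
    tf-vrouter:
      agent:
        customSpecs:
          - name: vrouter-debug            # must be a valid DNS subdomain name
            label:
              name: <NODE-LABEL-NAME>
              value: <NODE-LABEL-VALUE>
            containers:
              - name: agent
                env:
                  - name: LOG_LEVEL
                    value: SYS_DEBUG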
The customSpecs:name value must follow the RFC 1123
international format. Verify that the name of a DaemonSet object
is a valid DNS subdomain name.
The customSpecs parameter inherits all settings for the tf-vrouter
containers that are set on the spec:controllers:agent level and
overrides or adds additional parameters. The example configuration above
overrides the logging level from SYS_INFO, which is the default logging
level, to SYS_DEBUG.
For clusters with a multi-rack architecture, you may need to redefine the
gateway IP for the Tungsten Fabric vRouter nodes using the
VROUTER_GATEWAY parameter. For details, see Multi-rack architecture.
By default, the TF control service uses the management interface for
the BGP and XMPP traffic. You can change the control service interface
using the controlInterface parameter in the TFOperator custom resource,
for example, to combine the BGP and XMPP traffic with the data (tenant)
traffic:
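A minimal sketch, assuming the parameter is set in the control features section of the TFOperator custom resource; the nesting and the interface name are assumptions:
spec:
  features:
    control:
      controlInterface: <TENANT-INTERFACE-NAME>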
Tungsten Fabric implements cloud tenants’ virtual networks as Layer 3 overlays.
Tenant traffic gets encapsulated into one of the supported protocols and is
carried over the infrastructure network between 2 compute nodes or a compute
node and an edge router device.
In addition, Tungsten Fabric is capable of exchanging encapsulated traffic with
external systems in order to build advanced virtual networking topologies,
for example, BGP VPN connectivity between 2 MOSK clouds or a
MOSK cloud and a cloud tenant premises.
MOSK supports the following encapsulation protocols:
MPLS over Generic Routing Encapsulation (GRE)
A traditional encapsulation method supported by several router vendors,
including Cisco and Juniper. The feature is applicable when other
encapsulation methods are not available. For example, an SDN gateway
runs software that does not support MPLS over UDP.
MPLS over User Datagram Protocol (UDP)
A variation of the MPLS over GRE mechanism. It is the default and the most
frequently used option in MOSK. MPLS over UDP replaces
headers in UDP packets. In this case, a UDP port stores a hash of
the packet payload (entropy). It provides a significant benefit for
equal-cost multi-path (ECMP) routing load balancing. MPLS over UDP and MPLS
over GRE transfer Layer 3 traffic only.
Virtual Extensible LAN (VXLAN) TechPreview
The combination of VXLAN and EVPN technologies is often used for creating
advanced cloud networking topologies. For example, it can provide
transparent Layer 2 interconnections between Virtual Network Functions
running on top of the cloud and physical traffic generator appliances hosted
somewhere else.
The ENCAP_PRIORITY parameter defines the priority in which the
encapsulation protocols are attempted to be used when setting the BGP VPN
connectivity between the cloud and external systems.
By default, the encapsulation order is set to MPLSoUDP,MPLSoGRE,VXLAN.
The cloud operator can change it depending on their needs in the TFOperator
custom resource, as illustrated in Configuring encapsulation.
The list of supported encapsulated methods along with their order is shared
between BGP peers as part of the capabilities information exchange when
establishing a BGP session. Both parties must support the same encapsulation
methods to build a tunnel for the network traffic.
For example, if the cloud operator wants to set up a Layer 2 VPN between the
cloud and their network infrastructure, they configure the cloud’s virtual
networks with VXLAN identifiers (VNIs) and do the same on the other side,
for example, on a network switch. Also, VXLAN must be set in the first position
in encapsulation priority order. Otherwise, VXLAN tunnels will not get
established between endpoints, even though both endpoints may support the VXLAN
protocol.
However, setting VXLAN first in the encapsulation priority order will not
enforce VXLAN encapsulation between compute nodes or between compute nodes and
gateway routers that use Layer 3 VPNs for communication.
The TFOperator custom resource allows you to define encapsulation settings
for your Tungsten Fabric cluster.
Important
The TFOperator custom resource must be the only place to
configure the cluster encapsulation. Performing these configurations through
the Tungsten Fabric web UI, CLI, or API does not provide the configuration
persistency, and the settings defined this way may get reset to defaults
during the cluster services restart or update.
Note
Defining the default values for encapsulation parameters in
the TFOperator custom resource is unnecessary.
Depending on the Tungsten Fabric operator API version in use, proceed with
one of the following options:
In the routing fabric of a data center, a MOSK cluster
with Tungsten Fabric enabled can be represented either by a separate
Autonomous System (AS)
or as part of a bigger autonomous system. In either case, Tungsten Fabric
needs to participate in the BGP peering, exchanging routes with external
devices and within the cloud.
The Tungsten Fabric Controller acts as an internal (iBGP) route reflector for
the cloud AS by populating /32 routes pointing to VMs across all compute
nodes as well as the cloud’s edge gateway devices in case they belong to the
same AS. Apart from being an iBGP route reflector for the cloud AS, the
Tungsten Fabric Controller can act as a BGP peer for autonomous systems
external to the cloud, for example, for the AS configured across the
leaf-spine fabric of the data center.
The Autonomous System Number (ASN) setting contains the unique identifier
of the autonomous system that the MOSK cluster with
Tungsten Fabric belongs to. The ASN number does not affect the internal
iBGP communication between vRouters running on the compute nodes. Such
communication will work regardless of the ASN number settings. However,
any network appliance that is not managed by the Tungsten Fabric control plane
will have BGP configured manually. Therefore, the ASN settings should be
configured accordingly on both sides. Otherwise, it would result in the
inability to establish BGP sessions, regardless of whether the external device
peers with Tungsten Fabric over iBGP or eBGP.
The TFOperator custom resource enables you to define ASN settings for
your Tungsten Fabric cluster.
Important
The TFOperator CR must be the only place to configure
the cluster ASN. Performing these configurations through the Tungsten
Fabric web UI, CLI, or API does not provide the configuration persistency,
and the settings defined this way may get reset to defaults during
the cluster services restart or update.
Note
Defining the default values for ASN parameters in the Tungsten Fabric
Operator custom resource is unnecessary.
Depending on the Tungsten Fabric Operator API version in use, proceed with
one of the following options:
By default, the Tungsten Fabric tf-control-dns-external service is created
to expose the Tungsten Fabric control DNS service. You can disable creation of this
service through the enableDNSExternal parameter in the TFOperator
custom resource. For example:
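A minimal sketch, assuming the parameter lives in the control features section of the TFOperator custom resource; the nesting is an assumption:
spec:
  features:
    control:
      enableDNSExternal: false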
If an edge router is accessible from the data plane through a gateway, define
the vRouter gateway in the TFOperator custom resource. Otherwise,
the default system gateway is used.
Depending on the Tungsten Fabric Operator API version in use, proceed with
one of the following configurations:
API v2
Available since MOSK 24.1
Define the vRouterGateway parameter in the features section of
the TFOperator custom resource:
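A minimal sketch, assuming the gateway is defined for the vRouter feature set; the IP address is illustrative:
spec:
  features:
    vRouter:
      vRouterGateway: 10.32.0.1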
By default, MOSK deploys image precaching DaemonSets
to minimize possible downtime when updating container images. You can disable
creation of these DaemonSets by setting the imagePreCaching parameter in
the TFOperator custom resource to false:
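A minimal sketch, assuming imagePreCaching is a top-level parameter in the features section:
spec:
  features:
    imagePreCaching: false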
Available since MOSK 23.2
for Tungsten Fabric 21.4 only. TechPreview
Graceful restart and long-lived graceful restart are vital mechanisms
within BGP (Border Gateway Protocol) routing, designed to optimize
the routing tables convergence in scenarios where a BGP router restarts or
a networking failure is experienced, leading to interruptions of router
peering.
During a graceful restart, a router can signal its BGP peers about its
impending restart, requesting them to retain the routes it had previously
advertised as active. This allows for seamless network operation and minimal
disruption to data forwarding during the router downtime.
The long-lived aspect of the long-lived graceful restart extends
the graceful restart effectiveness beyond the usual restart duration.
This extension provides an additional layer of resilience and stability
to BGP routing updates, bolstering the network's ability to manage
unforeseen disruptions.
Caution
Mirantis does not generally recommend using the graceful restart
and long-lived graceful restart features with the Tungsten Fabric XMPP
helper, unless the configuration is done by proficient operators with
at-scale expertise in networking domain and exclusively to address specific
corner cases.
Configuring graceful restart and long-lived graceful restart¶
Tungsten Fabric Operator allows for easy enablement and configuration
of the graceful restart and long-lived graceful restart features through
the TFOperator custom resource:
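A minimal sketch that combines the parameters from the table below; the placement under the control features section is an assumption:
spec:
  features:
    control:
      gracefulRestart:
        enabled: true
        bgpHelperEnabled: false
        xmppHelperEnabled: false
        restartTime: 300
        llgrRestartTime: 300
        endOfRibTimeout: 300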
Graceful restart and long-lived graceful restart settings¶
Parameter
Default value
Description
enabled
false
Enables or disables graceful restart and long-lived graceful
restart features.
bgpHelperEnabled
false
Enables the Tungsten Fabric control services to act as a graceful
restart helper to the edge router or any other BGP peer by retaining
the routes learned from this peer and advertising them to the rest of
the network as applicable.
Note
BGP peer should support and be configured with graceful
restart for all of the address families used.
xmppHelperEnabled
false
Enables the datapath agent to retain the last route path from the
Tungsten Fabric Controller when an XMPP-based connection is lost.
restartTime
300
Configures a non-zero restart time in seconds to advertise the graceful
restart capability to peers.
llgrRestartTime
300
Specifies the amount of time in seconds the vRouter datapath should keep
advertised routes from the Tungsten Fabric control services, when
an XMPP connection between the control and vRouter agent services is lost.
Note
When graceful restart and long-lived graceful restart
are both configured, the duration of the long-lived graceful
restart timer is the sum of both timers.
endOfRibTimeout
300
Specifies the amount of time in seconds a control node waits to remove
stale routes from a vRouter agent Routing Information Base (RIB).
Configuring the protocol for connecting to Cassandra clusters¶
To streamline and improve the efficiency of communication between clients and
the database, Cassandra is transitioning away from the Thrift protocol in
favor of the Cassandra Query Language (CQL) protocol starting with
MOSK 24.1. Since MOSK 24.2, Cassandra
uses the CQL protocol by default.
CQL provides a more user-friendly and SQL-like interface for interacting with
the database. With the move towards CQL, the Thrift-based client drivers are
no longer actively supported, which encourages users to migrate to CQL-based
client drivers to take advantage of new features and improvements in Cassandra.
If your cluster is running MOSK 24.1.x, you can enable the
CQL protocol proceeding with one of the options below depending on the Tungsten
Fabric Operator API version in use.
During update to MOSK 24.2, switching from Thrift to CQL
is performed automatically. While it is possible to switch back to Thrift,
Mirantis does not recommend it. If you choose to do so, specify thrift
instead of cql in the configuration examples below.
API v2 Available since MOSK 24.1
Define the cassandraDriver parameter in the devOptions section of
the TFOperator custom resource:
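A minimal sketch based on the description above:
spec:
  devOptions:
    cassandraDriver: cql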
MOSK provides the capability to enable SR-IOV Spoof Check
control with the Neutron Tungsten Fabric backend.
The capability can be useful for certain network configurations. For example,
you might need to allow traffic from a virtual function interface even when
its MAC address does not match the MAC address inside the virtual machine.
In this scenario, known as MAC spoofing, disabling spoof check enables
the traffic to pass through regardless of the MAC address mismatch.
Caution
Certain NICs and drivers may not handle the spoofchk setting.
For example, the Intel 82599ES NIC paired with the ixgbe driver disregards
the spoofchk setting when VLAN tagging is enabled. Therefore, ensure
compatibility with your hardware configuration regarding spoofchk
handling before proceeding.
To enable SR-IOV Spoof Check control for Tungsten Fabric, enable SR-IOV
interfaces handling by Nova os-vif plugin in the OpenStackDeployment
custom resource:
The Tungsten Fabric Operator provides a capability to configure
the netns_availability_zone parameter of the Tungsten Fabric
svc-monitor service through the netnsAZ parameter. This configuration
enables MOSK users to specify an availability zone for
Tungsten Fabric instances, such as HAProxy (load balancer instances) or SNAT
routers.
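A minimal sketch, assuming the parameter is set in the config features section of the TFOperator custom resource; the nesting and the availability zone name are assumptions:
spec:
  features:
    config:
      netnsAZ: nova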
Tungsten Fabric (TF) uses Cassandra and ZooKeeper to store its data.
Cassandra is a fault-tolerant and horizontally scalable database that provides
persistent storage of configuration and analytics data. ZooKeeper is used by
TF for allocation of unique object identifiers and transactions implementation.
To prevent data loss, Mirantis recommends that you simultaneously back up
the ZooKeeper database dedicated to configuration services and the Cassandra
database.
The database backup must be consistent across all systems
because the state of the Tungsten Fabric databases is associated with
other system databases, such as OpenStack databases.
MOSK enables you to perform the automatic TF
data backup in the JSON format using the tf-dbbackup-job cron job.
By default, it is disabled. To back up the TF databases, enable
tf-dbBackup in the TF Operator custom resource:
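A minimal sketch, assuming the dbBackup section is enabled under the features section of the TFOperator custom resource:
spec:
  features:
    dbBackup:
      enabled: true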
By default, the tf-dbbackup-job job is scheduled for weekly execution,
allocating PVC of 5 Gi size for storing backups and keeping 5 previous
backups. To configure the backup parameters according to the needs of your
cluster, use the following structure:
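A possible sketch; the parameter names for the schedule, storage size, and retention are assumptions based on the defaults listed above:
spec:
  features:
    dbBackup:
      enabled: true
      schedule: "0 0 * * 0"       # weekly, Sunday at midnight
      storageSize: 10Gi           # assumed parameter name
      backupsToKeep: 10           # assumed parameter name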
This section explains the specifics of the Tungsten Fabric services provided by
Mirantis OpenStack for Kubernetes (MOSK). The list of the services
and their supported features included in this section is not exhaustive and is
being constantly amended based on the complexity of the architecture and the
use of a particular service.
MOSK ensures the integration between Octavia and Tungsten Fabric
through the OpenStack Octavia driver that uses Tungsten Fabric HAProxy as a backend.
The Tungsten Fabric-based MOSK deployment supports
creation, update, and deletion operations with the following standard
load balancing API entities:
Load balancers
Note
For a load balancer creation operation, the driver supports
only the vip-subnet-id argument; the vip-network-id argument is
not supported.
Listeners
Pools
Health monitors
The Tungsten Fabric-based MOSK deployment does not support
the following load balancing capabilities:
L7 load balancing capabilities, such as L7 policies, L7 rules, and others
Setting specific availability zones for load balancers and their resources
Use of the UDP protocol
Operations with Octavia quotas
Operations with Octavia flavors
Warning
The Tungsten Fabric-based MOSK deployment
enables you to manage the load balancer resources by means of the OpenStack
CLI or OpenStack Horizon. Do not perform any manipulations with the load
balancer resources through the Tungsten Fabric web UI because in this case
the changes will not be reflected on the OpenStack API side.
Octavia Amphora (Amphora v2) load balancing provides a scalable and flexible
solution for load balancing in cloud environments. MOSK
deploys Amphora load balancer on each node of the OpenStack environment
ensuring that load balancing services are easily accessible, highly scalable,
and highly reliable.
Compared to the Octavia Tungsten Fabric driver for LBaaS v2 solution, Amphora
offers several advanced features including:
Full compatibility with the Octavia API, which provides a standardized
interface for load balancing in MOSK OpenStack
environments. This makes it easier to manage and integrate with other
OpenStack services.
Layer 7 policies and rules, which allow for more granular control over
traffic routing and load balancing decisions. This enables users to
optimize their application performance and improve the user experience.
Support for the UDP protocol, which is commonly used for real-time
communications and other high-performance applications. This enables
users to deploy a wider range of applications with the same load
balancing infrastructure.
By default, MOSK uses the Octavia Tungsten Fabric load
balancing. Once Octavia Amphora load balancing is enabled, the existing Octavia
Tungsten Fabric driver load balancers will continue to function normally.
However, you cannot migrate your load balancer workloads from the old LBaaS
v2 solution to Amphora.
Note
As long as MOSK provides Octavia Amphora load
balancing as a technology preview feature, Mirantis
cannot guarantee the stability of this solution and does not provide
a migration path from Tungsten Fabric load balancing (HAProxy), which
is used by default.
To enable Octavia Amphora load balancing:
Assign openstack-gateway:enabled labels to the compute nodes in either
order.
Caution
Assigning the openstack-gateway:enabled labels on compute
nodes is crucial for the effective operation of Octavia Amphora load
balancing within an OpenStack environment. Double-check the labels
assignment to guarantee proper configuration.
To make Amphora the default provider, specify it in the
OpenStackDeployment custom resource:
spec:
  features:
    octavia:
      default_provider: amphorav2
Verify that the OpenStack Controller (Rockoon) has scheduled new Octavia
pods that include health manager, worker, and housekeeping pods.
kubectl get pods -n openstack -l 'application=octavia,component in (worker, health_manager, housekeeping)'
Example of output for an environment with two compute nodes:
The workflow for creating new load balancers with Amphora is identical
to the workflow for creating load balancers with Octavia Tungsten Fabric
driver for LBaaS v2. You can do it either through the OpenStack Horizon
UI or OpenStack CLI.
If you have not defined amphorav2 as default provider in the
OpenStackDeployment custom resource, you can specify it explicitly
when creating a load balancer using the provider argument:
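For example, using the OpenStack CLI; the load balancer name and the subnet are illustrative:
openstack loadbalancer create --name lb-amphora-demo \
  --vip-subnet-id <SUBNET-ID> --provider amphorav2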
This section contains a summary of the Tungsten Fabric upstream features
and use cases not supported in MOSK,
features and use cases offered as Technology Preview
in the current product release if any, and known limitations of Tungsten
Fabric in integration with other product components.
Feature or use case
Status
Description
Tungsten Fabric web UI
Provided as is
MOSK provides the TF web UI as is and does not
include this service in the support Service Level Agreement
Automatic generation of network port records in DNSaaS
(Designate)
Not supported
As a workaround, you can use the Tungsten Fabric
built-in DNS service that enables virtual machines to resolve
each other's names
Secret management (Barbican)
Not supported
It is not possible to use the certificates stored in Barbican
to terminate HTTPs on a load balancer in a Tungsten Fabric deployment
Role Based Access Control (RBAC) for Neutron objects
Not supported
Advanced Tungsten Fabric features
Provided as is
MOSK provides the following advanced Tungsten Fabric
features as is and does not include them in the support Service
Level Agreement:
Service Function Chaining
Production ready multi-site SDN
Layer 3 multihoming
Long-Lived Graceful Restart (LLGR)
Technical Preview
DPDK
Tungsten Fabric and OpenStack Octavia Amphora integration
Technical Preview
Due to Tungsten Fabric Simple Virtual Gateway restriction, each virtual
network can have only one VGW interface. As a result,
MOSK should be limited to a single compute node
with the openstack-gateway=enabled label. This limitation prevents
OpenStack Octavia Amphora from functioning in a multi-rack deployment.
The integration between the OpenStack and TF controllers is
implemented through the shared Kubernetes openstack-tf-shared namespace.
Both controllers have access to this namespace to read and write the Kubernetes
kind:Secret objects.
The OpenStack Controller (Rockoon) posts the data into the
openstack-tf-shared namespace required by the TF services. The TF
controller watches this namespace. Once an appropriate secret is created,
the TF controller obtains it into the internal data structures for further
processing.
The OpenStack Controller includes the following data for the TF Controller:
tunnel_interface
Name of the network interface for the TF data plane. This interface
is used by TF for the encapsulated traffic for overlay networks.
Keystone authorization information
Keystone Administrator credentials and an up-and-running IAM service
are required for the TF Controller to initiate the deployment process.
Nova metadata information
Required for the TF vRouter agent service.
Also, the OpenStack Controller watches the openstack-tf-shared namespace
for the vrouter_port parameter that defines the vRouter port number and
passes it to the nova-compute pod.
The list of the OpenStack services that are integrated with TF through their
API includes:
neutron-server - integration is provided by the
contrail-neutron-plugin component that is used by the neutron-server
service for transformation of the API calls into TF API-compatible
requests.
nova-compute - integration is provided by the
contrail-nova-vif-driver and contrail-vrouter-api packages used
by the nova-compute service for interaction with the TF vRouter when
managing network ports.
octavia-api - integration is provided by the Octavia TF Driver that
enables you to use OpenStack CLI and Horizon for operations with load
balancers. See Tungsten Fabric load balancing (HAProxy) for details.
Warning
TF is not integrated with the following OpenStack services:
Tungsten Fabric allows running IPv6-enabled OpenStack tenant networks on top
of the IPv4 underlay. You can create an IPv6 virtual network through the
Tungsten Fabric web UI or OpenStack CLI in the same way as an IPv4 virtual
network. The IPv6 functionality is enabled out of the box and does not require
major changes in the cloud configuration. This section lists the IPv6
capabilities supported by MOSK, as well as those available
and unavailable in the upstream OpenContrail software.
The following IPv6 features are
supported and verified in MOSK:
Virtual machines with IPv6 and IPv4 interfaces
Virtual machines with IPv6-only interfaces
DHCPv6 and neighbor discovery
Policy and security groups
IPv6 flow set up, tear down, and aging
Flow set up and tear down based on a TCP state machine
Fat flow
Allowed address pair configuration with IPv6 addresses
Equal Cost Multi-Path (ECMP)
Additionally, the following IPv6 features are
available in upstream OpenContrail according to its official
documentation:
Protocol-based flow aging
IPv6 service chaining
Connectivity with gateway (MX Series device)
Virtual Domain Name Services (vDNS), name-to-IPv6 address resolution
The following IPv6 features are not available in upstream
OpenContrail:
Depending on the size of an OpenStack environment and the components
that you use, you may want to have a single or multiple network interfaces,
as well as run different types of traffic on a single or multiple VLANs.
This section provides the recommendations for planning the network
configuration and optimizing the cloud performance.
Mirantis OpenStack for Kubernetes (MOSK) cluster networking is
complex and defined by the security requirements and performance
considerations. It is based on the Kubernetes cluster networking provided by
Mirantis Container Cloud and expanded to facilitate the demands of the
OpenStack virtualization platform.
A Container Cloud Kubernetes cluster provides a platform for
MOSK and is considered a part of its control plane. All
networks that serve Kubernetes and related traffic are considered control
plane networks. The Kubernetes cluster networking is typically focused on
connecting pods of different nodes as well as exposing the Kubernetes API and
services running in pods into an external network.
The OpenStack networking connects virtual machines to each other and the
outside world. Most of the OpenStack-related networks are considered a part of
the data plane in an OpenStack cluster. Ceph networks are considered data
plane networks for the purpose of this reference architecture.
When planning your OpenStack environment, consider the types of traffic that
your workloads generate and design your network accordingly. If you
anticipate that certain types of traffic, such as storage replication,
will likely consume a significant amount of network bandwidth, you may
want to move that traffic to a dedicated network interface to avoid
performance degradation.
The following diagram provides a simplified overview of the underlay
networking in a MOSK environment:
This page summarizes the recommended networking architecture of a Mirantis
Container Cloud management cluster for a Mirantis OpenStack for Kubernetes
(MOSK) cluster.
We recommend deploying the management cluster with a dedicated interface
for the provisioning (PXE) network. The separation of the provisioning network
from the management network ensures additional security and resilience of
the solution.
MOSK end users typically should have access to the Keycloak
service in the management cluster for authentication to the Horizon web UI.
Therefore, we recommend that you connect the management network of the
management cluster to an external network through an IP router. The default
route on the management cluster nodes must be configured with the default
gateway in the management network.
If you deploy the multi-rack configuration, ensure that the provisioning
network of the management cluster is connected to an IP router that connects
it to the provisioning networks of all racks.
Provisioning (PXE) network
Facilitates the iPXE boot of all bare metal machines in a
MOSK cluster and provisioning of the operating system
to machines.
This network is only used during provisioning of the host. It must not
be configured on an operational MOSK node.
Life-cycle management (LCM) network
Connects LCM Agents running on the hosts to the Container Cloud LCM API.
The LCM API is provided by the management cluster.
The LCM network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
The LCM subnet(s) provides IP addresses that are statically allocated
by the IPAM service to bare metal hosts. This network must be connected
to the Kubernetes API endpoint of the management cluster through an
IP router. LCM Agents running on MOSK clusters will
connect to the management cluster API through this router. LCM subnets
may be different per MOSK cluster as long as this
connection requirement is satisfied.
You can use more than one LCM network segment in a MOSK
cluster. In this case, separated L2 segments and interconnected L3
subnets are still used to serve LCM and API traffic.
All IP subnets in the LCM networks must be connected to each other
by IP routes. These routes must be configured on the hosts through
L2 templates.
All IP subnets in the LCM network must be connected to the Kubernetes
API endpoints of the management cluster through an IP router.
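Such routes are typically rendered into the host network configuration from
L2 templates. As an illustration only, a netplan-style fragment could look
as follows; the interface name, VLAN ID, and all addresses are assumptions:

  vlans:
    lcm-vlan:                 # example VLAN interface carrying LCM traffic
      id: 403                 # example VLAN ID
      link: bond0
      addresses:
      - 10.100.2.15/24        # LCM address of this host (example)
      routes:
      - to: 10.100.0.0/20     # aggregated route to all LCM subnets (example)
        via: 10.100.2.1       # LCM gateway on the rack ToR switch (example)
      - to: 10.10.0.0/24      # management subnet of the management cluster (example)
        via: 10.100.2.1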
You can manually select the load balancer IP address for external
access to the cluster API and specify it in the Cluster object
configuration. Alternatively, you can allocate a dedicated IP range
for a virtual IP of the cluster API load balancer by adding a Subnet
object with a special annotation. Mirantis recommends that this subnet
stays unique per MOSK cluster.
For details, see Create subnets.
Note
When using the ARP announcement of the IP address for the
cluster API load balancer, the following limitations apply:
Only one of the LCM networks can contain the API endpoint.
This network is called API/LCM throughout this documentation.
It consists of a VLAN segment stretched between all Kubernetes
master nodes in the cluster and the IP subnet that provides
IP addresses allocated to these nodes.
The load balancer IP address must be allocated from the same
subnet CIDR address that the LCM subnet uses.
When using the BGP announcement of the IP address for the cluster API
load balancer, which is available as Technology Preview since
MOSK 23.2.2, no segment stretching is required
between Kubernetes master nodes. Also, in this scenario, the load
balancer IP address is not required to match the LCM subnet CIDR address.
Kubernetes workloads network
Serves as an underlay network for traffic between pods in
the MOSK cluster. Do not share this network between
clusters.
There might be more than one Kubernetes workloads network segment in the
cluster. In this case, the segments must be connected through an IP router.
The Kubernetes workloads network does not need external access.
The Kubernetes workloads subnet(s) provides IP addresses that
are statically allocated by the IPAM service to all nodes and that
are used by Calico for cross-node communication inside a cluster.
By default, VXLAN overlay is used for Calico cross-node communication.
Kubernetes external network
Serves for access to the OpenStack endpoints in a MOSK
cluster.
When using the ARP announcement of the external endpoints of
load-balanced services, the network must contain a VLAN segment
extended to all MOSK nodes connected to this network.
When using the BGP announcement of the external endpoints of
load-balanced services, which is available as Technology Preview since
MOSK 23.2.2, there is no requirement of having
a single VLAN segment extended to all MOSK nodes
connected to this network.
A typical MOSK cluster only has one external network.
The external network must include at least two IP address ranges
defined by separate Subnet objects in Container Cloud API:
MOSK services address range
Provides IP addresses for externally available
load-balanced services, including OpenStack API endpoints.
External address range
Provides IP addresses to be assigned to network interfaces
on all cluster nodes that are connected to this network.
MetalLB speakers must run on the nodes that are connected to this network.
For details, see Configure the MetalLB speaker node selector.
This is required for external traffic to return to the originating
client. The default route on the MOSK nodes that
are connected to the external network must be configured with the
default gateway in the external network.
Storage access network
Serves for the storage access traffic from and to Ceph OSD services.
A MOSK cluster may have more than one VLAN segment
and IP subnet in the storage access network. All IP subnets of this
network in a single cluster must be connected by an IP router.
The storage access network does not require external access unless
you want to directly expose Ceph to the clients outside of a
MOSK cluster.
Note
A direct access to Ceph by the clients outside of a
MOSK cluster is technically possible but not
supported by Mirantis. Use at your own risk.
The IP addresses from subnets in this network are statically allocated
by the IPAM service to Ceph nodes. The Ceph OSD services bind to these
addresses on their respective nodes.
Storage replication network
Serves for the storage replication traffic between Ceph OSD services.
A MOSK cluster may have more than one VLAN segment
and IP subnet in this network as long as the subnets are connected
by an IP router.
This network does not require external access.
The IP addresses from subnets in this network are statically allocated
by the IPAM service to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective nodes.
The following diagram illustrates the networking schema of the Container Cloud
deployment on bare metal with a MOSK cluster using ARP
announcements:
Since 23.2.2, MOSK supports full L3 networking
topology in the Technology Preview scope.
The following diagram illustrates the networking schema of the Container Cloud
deployment on bare metal with a MOSK cluster using BGP
announcements:
This section describes network types for Layer 3 networks used for Kubernetes
and Mirantis OpenStack for Kubernetes (MOSK) clusters along with
requirements for each network type.
Note
Only IPv4 is currently supported by Container Cloud and IPAM
for infrastructure networks. Both IPv4 and IPv6 are supported
for OpenStack workloads.
The following diagram provides an overview of the underlay networks in a
MOSK environment:
If BGP announcement is configured for the MOSK cluster API LB address, the
API/LCM network is not required. Announcement of the cluster API LB address
is done using the LCM network.
If you configure ARP announcement of the load-balancer IP address for the
MOSK cluster API, the API/LCM network must be configured on the Kubernetes
manager nodes of the cluster. This network contains the Kubernetes API
endpoint with the VRRP virtual IP address.
LCM network
Enables communication between the MKE cluster nodes. Multiple VLAN
segments and IP subnets can be created for a multi-rack architecture. Each
server must be connected to one of the LCM segments and have an IP from
the corresponding subnet.
External network
Used to expose the OpenStack, StackLight, and other services of the
MOSK cluster.
Kubernetes workloads network
Used for communication between containers in Kubernetes.
Storage access network (Ceph)
Used for accessing the Ceph storage. In Ceph terms, this is the public
network. We recommend placing it on a dedicated hardware interface.
Storage replication network (Ceph)
Used for Ceph storage replication. In Ceph terms, this is the cluster
network. To ensure low latency and fast access, place the network on a
dedicated hardware interface.
Typically, a routable network used to provide the external access to
OpenStack instances (a floating network). Can be used by the OpenStack
services, such as Ironic, Manila, and others, to connect their
management resources. Bridge name: pr-floating.
Overlay networks (virtual networks) - Networking service
The network used to provide isolated, secure tenant networks with the
help of a tunneling mechanism (VLAN/GRE/VXLAN). If VXLAN or GRE
encapsulation takes place, IP address assignment is required on
interfaces at the node level. Bridge name: neutron-tunnel.
Live migration network - Compute service
The network used by the OpenStack compute service (Nova) to transfer
data during live migration. Depending on the cloud needs, it can be
placed on a dedicated physical network to avoid affecting other networks
during live migration. IP address assignment is required on
interfaces at the node level. Bridge name: lm-vlan.
How the logical networks described above map to physical networks and
interfaces on nodes depends on the cloud size and configuration.
We recommend placing OpenStack networks on a dedicated physical interface
(bond) that is not shared with the storage and Kubernetes management networks
to minimize their influence on each other.
The bridge interface with this name is mandatory if you need to separate
Kubernetes workloads traffic. You can configure this bridge over the VLAN or
directly over the bonded or single interface.
Routing to all IP subnets of the Storage access network
Routing to all IP subnets of the Storage replication network
Note
When selecting externally routable subnets, ensure that the subnet
ranges do not overlap with the internal subnets ranges. Otherwise, internal
resources of users will not be available from the MOSK
cluster.
Mirantis OpenStack for Kubernetes (MOSK) enables you to deploy a
cluster with a multi-rack architecture, where every data center cabinet
(a rack) incorporates its own Layer 2 network infrastructure that does not
extend beyond its top-of-rack switch. The architecture allows a
MOSK cloud to integrate natively with the Layer 3-centric
networking topologies such as Spine-Leaf
that are commonly seen in modern data centers.
The architecture eliminates the need to stretch and manage VLANs across
parts of a single data center, or to build VPN tunnels between the segments of
a geographically distributed cloud.
The set of networks present in each rack depends on the backend used by the
OpenStack networking service.
In the Mirantis Container Cloud and MOSK multi-rack
reference architecture, every rack has its own L2 segment (VLAN) to bootstrap
and install servers.
Segmentation of the provisioning network requires additional configuration
of the underlay networking infrastructure and certain Container Cloud API
objects. You need to configure a DHCP Relay agent on the border of each VLAN
in the provisioning network. The agent handles broadcast DHCP requests coming
from the bare metal servers in the rack and forwards them as unicast packets
across L3 fabric of the data center to a Container Cloud management cluster.
From the standpoint of Container Cloud API, you need to configure per-rack DHCP
ranges by adding Subnet resources in Container Cloud as described in
Configure multiple DHCP address ranges.
The DHCP server of Container Cloud automatically leases a temporary IP address
from the DHCP range to the requester host depending on the address of the DHCP
agent that relays the request.
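As an illustration only, a per-rack DHCP range could be defined by a Subnet
object similar to the following sketch; the name, namespace, addresses, and
the label that marks the subnet as a DHCP range are assumptions, so refer to
the linked procedure for the exact format required by your Container Cloud
version:

  apiVersion: ipam.mirantis.com/v1alpha1
  kind: Subnet
  metadata:
    name: dhcp-rack-1               # example name
    namespace: default              # example namespace
    labels:
      ipam/SVC-dhcp-range: "1"      # assumption: label marking a DHCP range subnet
  spec:
    cidr: 10.20.1.0/24              # provisioning subnet of rack 1 (example)
    gateway: 10.20.1.1              # ToR gateway acting as the DHCP relay (example)
    includeRanges:
    - 10.20.1.100-10.20.1.200       # addresses leased during provisioning (example)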
To deploy a MOSK cluster with multi-rack reference
architecture, you need to create a dedicated set of subnets and L2 templates
for every rack in your cluster.
Every specific host type in the rack, which is defined by the role in the
MOSK cluster and network-related hardware configuration,
may require a specific L2 template.
For MOSK 23.1 and older versions, due to the Container
Cloud limitations, you need to configure the following networks to have L2
segments (VLANs) stretch across racks to all hosts of certain types in
a multi-rack environment:
LCM/API network
Must be configured on the Kubernetes manager nodes of the
MOSK cluster. Contains a Kubernetes API endpoint with a
VRRP virtual IP address. Enables MKE cluster nodes to communicate
with each other.
External network
Exposes OpenStack, StackLight, and other services of the
MOSK cluster to external clients.
When planning space allocation for IP addresses in your cluster, pick large
IP ranges for each type of network. Then you will split these ranges into
per-rack subnets.
For example, if you allocate a /20 address block for LCM network,
then you can create up to 16 Subnet objects with the /24 address block
each for up to 16 racks. This way you can simplify routing on your hosts using
the large /20 IP subnet as an aggregated route destination. For details,
see Underlay networking: routing configuration.
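For example, an illustrative /20 block reserved for the LCM network could be
split as follows; all addresses are examples:

  10.100.0.0/20   - aggregated LCM block, a single route destination on hosts
  10.100.0.0/24   - LCM subnet of rack 1
  10.100.1.0/24   - LCM subnet of rack 2
  ...
  10.100.15.0/24  - LCM subnet of rack 16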
A typical medium-sized or larger MOSK cloud consists of three
or more racks that can generally be divided into the following major
categories:
Compute/Storage racks that contain the hypervisors and instances running on
top of them. Additionally, they contain nodes that store cloud applications’
block, ephemeral, and object data as part of the Ceph cluster.
Control plane racks that incorporate all the components needed by the cloud
operator to manage its life cycle. Also, they include the services through
which the cloud users interact with the cloud to deploy their applications,
such as cloud APIs and web UI.
A control plane rack may also contain additional compute and storage nodes.
The diagram below will help you to plan the networking layout of a multi-rack
MOSK cloud with Tungsten Fabric.
For MOSK 23.1 and older versions, Kubernetes masters
(3 nodes) either need to be placed into a single rack or, if distributed
across multiple racks for better availability, require stretching of the
L2 segment of the management network across these racks.
This requirement is caused by the Mirantis Kubernetes Engine underlay for
MOSK relying on the Layer 2 VRRP protocol to ensure high
availability of the Kubernetes API endpoint.
The table below provides a mapping between the racks and the network types
participating in a multi-rack MOSK cluster with the
Tungsten Fabric backend.
Networks and VLANs for a multi-rack MOSK
cluster with TF
This section summarizes the requirements for the physical layout of underlay
network and VLANs configuration for the multi-rack architecture of
Mirantis OpenStack for Kubernetes (MOSK).
Physical networking of a Container Cloud management cluster
Due to limitations of virtual IP address for Kubernetes API and of MetalLB
load balancing in Container Cloud, the management cluster nodes must share
VLAN segments in the provisioning and management networks.
In the multi-rack architecture, the management cluster nodes may be placed in
a single rack or spread across three racks. In either case, the provisioning
and management network VLANs must be stretched across the ToR switches of the
racks.
The following diagram illustrates physical and L2 connections of
the Container Cloud management cluster.
If you configure BGP announcement for IP addresses of load-balanced services
of a MOSK cluster, the external network can consist of multiple VLAN segments
connected to all nodes of a MOSK cluster where MetalLB speaker components are
configured to announce IP addresses for Kubernetes load-balanced services.
Mirantis recommends that you use OpenStack controller nodes for this purpose.
If you configure ARP announcement for IP addresses of load-balanced services
of a MOSK cluster, the external network must consist of a single VLAN
stretched to the ToR switches of all the racks where MOSK nodes connected to
the external network are located. Those are the nodes where MetalLB speaker
components are configured to announce IP addresses for Kubernetes load-balanced
services. Mirantis recommends that you use OpenStack controller nodes for this
purpose.
If BGP announcement is configured for MOSK
cluster API LB address, Kubernetes manager
nodes have no requirement to share the single stretched VLAN segment in the
API/LCM network. All VLANs may be configured per rack.
If ARP announcement is configured for
MOSK cluster API LB address, Kubernetes manager
nodes must share the VLAN segment in the API/LCM network.
In the multi-rack architecture, Kubernetes manager nodes may be spread
across three racks. The API/LCM network VLAN must be stretched to the ToR
switches of the racks. All other VLANs may be configured per rack.
This requirement is caused by the Mirantis Kubernetes Engine underlay for
MOSK relying on the Layer 2 VRRP protocol to ensure high
availability of the Kubernetes API endpoint.
The following diagram illustrates physical and L2 network connections
of the Kubernetes manager nodes in a MOSK cluster.
Caution
Such configuration does not apply to a compact control plane
MOSK installation. See Create a MOSK cluster.
This section describes requirements for the configuration of the underlay
network for a MOSK cluster in a multi-rack
reference configuration. The infrastructure operator must configure the
underlay network according to these guidelines. Mirantis Container Cloud will
not configure routing on the network devices.
In the multi-rack reference architecture, every server rack has its own
layer-2 segment (VLAN) for network bootstrap and installation of physical
servers.
You need to configure top-of-rack (ToR) switches in each rack with the default
gateway for the provisioning network VLAN. This gateway must also serve as a
DHCP Relay Agent on the border of the VLAN. The agent handles broadcast
DHCP requests coming from the bare metal servers in the rack and
forwards them as unicast packets across the data center L3 fabric to the
provisioning network of a Container Cloud management cluster.
Therefore, each ToR gateway must have an IP route to the IP subnet of the
provisioning network of the management cluster. The provisioning network
gateway, in turn, must have routes to all IP subnets of all racks.
The hosts of the management cluster must have routes to all IP subnets
in the provisioning network through the gateway in the provisioning network
of the management cluster.
All hosts in the management cluster must have IP addresses from the same
IP subnet of the provisioning network. Even if the hosts
of the management cluster are mounted to different racks, they must share
a single provisioning VLAN segment.
All hosts of a management cluster must have IP addresses from the same subnet
of the management network. Even if hosts of a management cluster are mounted
to different racks, they must share a single management VLAN segment.
The gateway in this network is used as the default route on the nodes
in a Container Cloud management cluster. This gateway must connect to
external Internet networks directly or through a proxy server.
If the Internet is accessible through a proxy server, you must configure
Container Cloud bootstrap to use it as well. For details, see
Deploy a management cluster.
This network connects a Container Cloud management cluster to Kubernetes
API endpoints of MOSK clusters. It also connects LCM agents
of MOSK nodes to the Kubernetes API endpoint of the
management cluster.
The network gateway must have routes to all API/LCM subnets of all
MOSK clusters.
This network may include multiple VLANs, typically, one VLAN per rack.
Each VLAN may have one or more IP subnets with gateways configured on
ToR switches.
Each ToR gateway must provide routes to all other IP subnets in all
other VLANs in the LCM network to enable communication between nodes
in the cluster.
If you configure BGP announcement of the load-balancer IP address for a
MOSK cluster API:
All nodes of a MOSK cluster must be connected to the
LCM network. Each host connected to this network must have routes to all
IP subnets in the LCM network and to the management subnet of the
management cluster, through the ToR gateway for the rack of this host.
It is not required to configure a separate API/LCM network.
Announcement of the IP address of the load balancer is done using the LCM
network.
If you configure ARP announcement of the load-balancer IP address for a
MOSK cluster API:
All nodes of a MOSK cluster excluding manager nodes
must be connected to the LCM network. Each host connected to this network
must have routes to all IP subnets in the LCM network, including the
API/LCM network of this MOSK cluster and to the
Management subnet of the management cluster, through the ToR gateway for
the rack of this host.
It is required to configure a separate API/LCM network. All manager nodes
of a MOSK cluster must be connected to the API/LCM
network. IP address announcement for load balancing is done using the
API/LCM network.
If BGP announcement is configured for the MOSK cluster API LB address, the
API/LCM network is not required. Announcement of the cluster API LB address
is done using the LCM network.
If you configure ARP announcement of the load-balancer IP address for the
MOSK cluster API, the API/LCM network must be configured on the Kubernetes
manager nodes of the cluster. This network contains the Kubernetes API
endpoint with the VRRP virtual IP address.
This network consists of a single VLAN shared between all
MOSK manager nodes in a MOSK cluster,
even if the nodes are spread across multiple racks.
All manager nodes of a MOSK cluster must be connected to
this network and have IP addresses from the same subnet in this network.
The gateway in the API/LCM network for a MOSK cluster
must have a route to the Management subnet of the management cluster.
This is required to ensure symmetric traffic flow between the management
and MOSK clusters.
The gateway in this network must also have routes to all IP subnets
in the LCM network of this MOSK cluster.
The load-balancer IP address for cluster API must be allocated from the same
CIDR address that the API/LCM subnet uses.
If you configure BGP announcement for IP addresses of load-balanced services
of a MOSK cluster, the external network can consist of multiple VLAN segments
connected to all nodes of a MOSK cluster where MetalLB speaker components are
configured to announce IP addresses for Kubernetes load-balanced services.
Mirantis recommends that you use OpenStack controller nodes for this purpose.
If you configure ARP announcement for IP addresses of load-balanced services
of a MOSK cluster, the external network must consist of a single VLAN
stretched to the ToR switches of all the racks where MOSK nodes connected to
the external network are located. Those are the nodes where MetalLB speaker
components are configured to announce IP addresses for Kubernetes load-balanced
services. Mirantis recommends that you use OpenStack controller nodes for this
purpose.
The IP gateway in this network is used as the default route on all nodes in the
MOSK cluster that are connected to this network.
This allows external users to connect to the OpenStack endpoints exposed as
Kubernetes load-balanced services.
Dedicated IP ranges from this network must be configured as address pools
for the MetalLB service. MetalLB allocates addresses from these address pools
to Kubernetes load-balanced services.
This network may include multiple VLANs and IP subnets, typically,
one VLAN and IP subnet per rack. All IP subnets in this network must
be connected by IP routes on the ToR switches.
Typically, every node in a MOSK cluster is connected
to this network and has routes to all IP subnets of this network through
its rack IP gateway.
This network is not connected to the external networks.
This network may include multiple VLANs and IP subnets, typically,
one VLAN and IP subnet per rack. All IP subnets in this network must
be connected by IP routes on the ToR switches.
Every Ceph OSD node in a MOSK cluster must be connected
to this network and have routes to all IP subnets from this network
through its rack IP gateway.
This network is not connected to the external networks.
This network may include multiple VLANs and IP subnets, typically,
one VLAN and IP subnet per rack. All IP subnets in this network must
be connected by IP routes on the ToR switches.
All nodes in a MOSK cluster must be connected
to this network and have routes to all IP subnets from this network
through its rack IP gateway.
This network is not connected to the external networks.
To improve the goodput, we recommend that you enable jumbo frames where
possible. Jumbo frames have to be enabled along the whole path that the
packets traverse. If one of the network components cannot handle jumbo frames,
the network path uses the smallest MTU.
To protect against the failure of a single NIC, we recommend using link
aggregation, such as bonding. Link aggregation is useful for linear
scaling of bandwidth, load balancing, and fault protection. Depending
on the hardware equipment, different types of bonds might be supported.
Use the multi-chassis link aggregation as it provides fault tolerance
at the device level. For example, MLAG on Arista equipment or vPC on
Cisco equipment.
The Linux kernel supports the following bonding modes:
active-backup
balance-xor
802.3ad (LACP)
balance-tlb
balance-alb
Since LACP is the IEEE standard 802.3ad supported by the majority of
network platforms, we recommend using this bonding mode.
Use the Link Aggregation Control Protocol (LACP) bonding mode
with MC-LAG domains configured on ToR switches. This corresponds to
the 802.3ad bond mode on hosts.
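For illustration, an 802.3ad bond could be described in the netplan-style host
configuration (for example, in the npTemplate section of an L2 template)
similar to the following fragment; the port names are assumptions:

  bonds:
    bond0:
      interfaces:
      - enp9s0f0              # port on the first NIC (example name)
      - enp10s0f0             # port on the second NIC (example name)
      parameters:
        mode: 802.3ad         # LACP, matching MC-LAG/MLAG on the ToR switches
        lacp-rate: fast
        transmit-hash-policy: layer3+4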
Additionally, follow these recommendations in regards to bond interfaces:
Use ports from different multi-port NICs when creating bonds. This makes
network connections redundant if failure of a single NIC occurs.
Configure the ports that connect servers to the PXE network with the PXE VLAN
as native or untagged. On these ports, configure LACP fallback to ensure
that the servers can reach the DHCP server and boot over the network.
Configure Spanning Tree Protocol (STP) settings on the network switch ports to
ensure that the ports start forwarding packets as soon as the link comes up.
It helps avoid iPXE timeout issues and ensures reliable boot over network.
A MOSK cluster uses Ceph as a distributed storage system
for file, block, and object storage. This section provides an overview of a
Ceph cluster deployed by Container Cloud.
Mirantis Container Cloud deploys Ceph on MOSK using Helm
charts with the following components:
Rook Ceph Operator
A storage orchestrator that deploys Ceph on top of a Kubernetes cluster. Also
known as Rook or RookOperator. Rook operations include:
Deploying and managing a Ceph cluster based on provided Rook CRs such as
CephCluster, CephBlockPool, CephObjectStore, and so on.
Orchestrating the state of the Ceph cluster and all its daemons.
KaaSCephCluster custom resource (CR)
Represents the customization of a Kubernetes installation and allows you to
define the required Ceph configuration through the Container Cloud web UI
before deployment. For example, you can define the failure domain, Ceph pools,
Ceph node roles, number of Ceph components such as Ceph OSDs, and so on.
The ceph-kcc-controller controller on the Container Cloud management
cluster manages the KaaSCephCluster CR.
Ceph Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR, creates CRs for Rook and updates its CR status based on the Ceph
cluster deployment progress. It creates users, pools, and keys for OpenStack
and Kubernetes and provides Ceph configurations and keys to access them. Also,
Ceph Controller eventually obtains the data from the OpenStack Controller
(Rockoon) for the Keystone integration and updates the Ceph Object Gateway
services configurations to use Kubernetes for user authentication.
The Ceph Controller operations include:
Transforming user parameters from the Container Cloud Ceph CR into Rook CRs
and deploying a Ceph cluster using Rook.
Providing integration of the Ceph cluster with Kubernetes.
Providing data for OpenStack to integrate with the deployed Ceph cluster.
Ceph Status Controller
A Kubernetes controller that collects all valuable parameters from the current
Ceph cluster, its daemons, and entities and exposes them into the
KaaSCephCluster status. Ceph Status Controller operations include:
Collecting all statuses from a Ceph cluster and corresponding Rook CRs.
Collecting additional information on the health of Ceph daemons.
Providing information to the status section of the KaaSCephCluster
CR.
Ceph Request Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR and manages Ceph OSD lifecycle management (LCM) operations. It
allows for a safe Ceph OSD removal from the Ceph cluster. Ceph Request
Controller operations include:
Providing an ability to perform Ceph OSD LCM operations.
Obtaining specific CRs to remove Ceph OSDs and executing them.
Pausing the regular Ceph Controller reconciliation until all requests are
completed.
A typical Ceph cluster consists of the following components:
Ceph Monitors - three or, in rare cases, five Ceph Monitors.
Ceph Managers - one Ceph Manager in a regular cluster.
Ceph Object Gateway (radosgw) - Mirantis recommends having three or more
radosgw instances for HA.
Ceph OSDs - the number of Ceph OSDs may vary according to the deployment
needs.
Warning
A Ceph cluster with 3 Ceph nodes does not provide
hardware fault tolerance and is not eligible
for recovery operations,
such as a disk or an entire Ceph node replacement.
A Ceph cluster uses the replication factor that equals 3.
If the number of Ceph OSDs is less than 3, a Ceph cluster
moves to the degraded state with the write operations
restriction until the number of alive Ceph OSDs
equals the replication factor again.
The placement of Ceph Monitors and Ceph Managers is defined in the
KaaSCephCluster CR.
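As a minimal illustrative sketch only, such placement could be expressed by
assigning the mon and mgr roles to specific machines in the KaaSCephCluster
specification; the machine and device names below are assumptions, and the
exact field layout should be validated against the KaaSCephCluster reference:

  spec:
    cephClusterSpec:
      nodes:
        storage-worker-01:          # example machine name
          roles:
          - mon
          - mgr
          storageDevices:
          - name: sdb               # example data device
            config:
              deviceClass: hdd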
The following diagram illustrates the way a Ceph cluster is deployed in
Container Cloud:
The following diagram illustrates the processes within a deployed Ceph cluster:
A Ceph cluster configuration in MOSK has the following
limitations, including but not limited to:
Only one Ceph Controller per MOSK cluster
and only one Ceph cluster per Ceph Controller are supported.
The replication size for any Ceph pool must be set to more than 1.
Only one CRUSH tree per cluster. The separation of devices per Ceph pool is
supported through device classes
with only one pool of each type for a device class.
All CRUSH rules must have the same failure_domain.
Only the following types of CRUSH buckets are supported:
topology.kubernetes.io/region
topology.kubernetes.io/zone
topology.rook.io/datacenter
topology.rook.io/room
topology.rook.io/pod
topology.rook.io/pdu
topology.rook.io/row
topology.rook.io/rack
topology.rook.io/chassis
RBD mirroring is not supported.
Consuming an existing Ceph cluster is not supported.
Lifted since MOSK 23.1
CephFS is not supported. Multiple CephFS instances are supported since
MOSK 25.1.
Only IPv4 is supported.
If two or more Ceph OSDs are located on the same device, there must be no
dedicated WAL or DB for this class.
Only a full collocation or dedicated WAL and DB configurations are supported.
The minimum size of any defined Ceph OSD device is 5 GB.
Ceph OSDs support only raw disks as data devices meaning that no dm or
lvm devices are allowed.
Lifted since MOSK 23.3
Ceph cluster does not support removable devices (with hotplug enabled) for
deploying Ceph OSDs.
When adding a Ceph node with the Ceph Monitor role, if any issues occur with
the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead,
named using the next alphabetic character in order. Therefore, the Ceph Monitor
names may not follow the alphabetical order. For example, a, b, d,
instead of a, b, c.
Reducing the number of Ceph Monitors is not supported and causes the Ceph
Monitor daemons removal from random nodes.
Removal of the mgr role in the nodes section of the
KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph
Manager from a node, remove it from the nodes spec and manually delete
the mgr pod in the Rook namespace.
Lifted since MOSK 24.1
Ceph does not support allocation of Ceph RGW pods on nodes where the Federal
Information Processing Standard (FIPS) mode is enabled.
The integration between Ceph and OpenStack (Rockoon) Controllers is implemented
through the shared Kubernetes openstack-ceph-shared namespace. Both
controllers have access to this namespace to read and write the Kubernetes
kind:Secret objects.
As Ceph is the required and only supported backend for several OpenStack
services, all necessary Ceph pools must be specified in the configuration
of the kind:MiraCeph custom resource as part of the deployment.
Once the Ceph cluster is deployed, the Ceph Controller posts the
information required by the OpenStack services to be properly configured
as a kind:Secret object into the openstack-ceph-shared namespace.
The OpenStack Controller watches this namespace. Once the corresponding
secret is created, the OpenStack Controller transforms this secret to the
data structures expected by the OpenStack-Helm charts. Even if an OpenStack
installation is triggered at the same time as a Ceph cluster deployment, the
OpenStack Controller halts the deployment of the OpenStack services that
depend on Ceph availability until the secret in the shared namespace is
created by the Ceph Controller.
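For example, to check that the shared data has been published, you can list
the objects in the shared namespace; the names of the secrets are
deployment-specific:

  kubectl -n openstack-ceph-shared get secrets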
For the configuration of Ceph Object Gateway as an OpenStack Object
Storage, the reverse process takes place. The OpenStack Controller waits
for the OpenStack-Helm to create a secret with OpenStack Identity
(Keystone) credentials that Ceph Object Gateway must use to validate the
OpenStack Identity tokens, and posts it back to the same
openstack-ceph-shared namespace in the format suitable for
consumption by the Ceph Controller. The Ceph Controller then reads this
secret and reconfigures Ceph Object Gateway accordingly.
StackLight is the logging, monitoring, and alerting solution that provides a
single pane of glass for cloud maintenance and day-to-day operations as well
as offers critical insights into cloud health including operational
information about the components deployed with Mirantis OpenStack for
Kubernetes (MOSK). StackLight is based on Prometheus, an
open-source monitoring solution and a time series database, and OpenSearch, the
logs and notifications storage.
Mirantis OpenStack for Kubernetes (MOSK) deploys the StackLight stack
as a release of a Helm chart that contains the helm-controller and HelmBundle
custom resources. The StackLight HelmBundle consists of a set of Helm charts
describing the StackLight components. Apart from the OpenStack-specific
components below, StackLight also includes the components described in
Mirantis Container Cloud Reference Architecture: Deployment architecture.
By default, StackLight logging stack is disabled.
During the StackLight configuration when deploying a MOSK
cluster, you can define the HA or non-HA StackLight architecture type.
Non-HA StackLight requires a backend storage provider, for example, a Ceph
cluster. For details, see Mirantis Container Cloud Reference Architecture:
StackLight database modes.
StackLight measures, analyzes, and reports in a timely manner about failures
that may occur in the following Mirantis OpenStack for Kubernetes
(MOSK)
components and their sub-components. Apart from the components below,
StackLight also monitors the components listed in
Mirantis Container Cloud Reference Architecture: Monitored components.
Calculations in this document are based on numbers from a
real-scale test cluster with 34 nodes. The exact space required for metrics
and logs must be calculated depending on the ongoing cluster operations.
Some operations force the generation of additional metrics and logs. The
values below are approximate. Use them only as recommendations.
During the deployment of a new cluster, you must specify the OpenSearch
retention time and Persistent Volume Claim (PVC) size, as well as the
Prometheus PVC size, retention time, and retention size.
When configuring an existing cluster, you can only set OpenSearch
retention time, Prometheus retention time, and retention size.
The following table describes the recommendations for both OpenSearch
and Prometheus retention size and PVC size for a cluster with 34 nodes.
Retention time depends on the space allocated for the data. To calculate
the required retention time, use the following formula:
retention_time = retention_size / amount_of_data_per_day
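For example, if StackLight in HA mode generates approximately 500 GB of logs
per day for the entire cluster (see the table below) and the OpenSearch
retention size is 1500 GB, the retention time is 1500 / 500 = 3 days.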
OpenSearch
Required space per day:
StackLight in non-HA mode: 202 - 253 GB for the entire cluster,
~6 - 7.5 GB for a single node
StackLight in HA mode: 404 - 506 GB for the entire cluster,
~12 - 15 GB for a single node
When setting Persistent Volume Claim Size for OpenSearch
during the cluster creation, take into account that it defines the PVC
size for a single instance of the OpenSearch cluster. StackLight in HA
mode has 3 OpenSearch instances. Therefore, for a total OpenSearch
capacity, multiply the PVC size by 3.
Prometheus
Required space per day: 11 GB for the entire cluster,
~400 MB for a single node
Every Prometheus instance stores the entire database. Multiple replicas
store multiple copies of the same data. Therefore, treat the Prometheus
PVC size as the capacity of Prometheus in the cluster. Do not sum them
up.
Prometheus has built-in retention mechanisms based on the database size
and time series duration stored in the database. Therefore, if you
miscalculate the PVC size, retention size set to ~1 GB less than the PVC
size will prevent disk overfilling.
StackLight integration with OpenStack includes automatic discovery of RabbitMQ
credentials for notifications and OpenStack credentials for OpenStack API
metrics. For details, see the
openstack.rabbitmq.credentialsConfig and
openstack.telegraf.credentialsConfig parameters description in
StackLight configuration parameters.
Lifecycle management operations of a MOSK cluster may
affect its workloads and, specifically, may cause network
connectivity interruptions for instances running in OpenStack.
To make sure that the downtime caused on the cloud applications still
fits into Service Level Agreements (SLAs), MOSK
provides the tooling to measure the network availability of instances.
Additionally, continuous monitoring of the network connectivity in the cluster
is essential for early detection of infrastructure problems.
MOSK enables cloud operators to oversee the availability
of workloads hosted in their OpenStack infrastructure on several levels:
Monitoring of floating IP addresses through the Cloudprober service
Monitoring of network ports availability through the Portprober service
Floating IP address availability monitoring (Cloudprober)
Available since MOSK 23.2. Technology Preview.
The floating IP address availability monitoring service (Cloudprober) is a
special probing agent that starts on controller nodes and periodically pings
selected floating IP addresses. As of today, the agent supports
only Internet Control Message Protocol (ICMP) to determine the IP address
availability.
To monitor the availability of floating IP addresses, your
MOSK cluster and workloads need to meet the
following requirements:
There must be layer-3 connectivity between the cluster's floating
IP networks and the nodes running the OpenStack control plane.
The guest operating system of the monitored OpenStack instances must allow
the ICMP ingress and egress traffic.
OpenStack security groups used by the monitored instances must allow the ICMP
ingress and egress traffic.
To enable the floating IP address availability monitoring service, use the
following OpenStackDeployment definition:
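As a minimal sketch, assuming that Cloudprober is enabled through the list of
optional services in the OpenStackDeployment custom resource, the relevant
fragment could look as follows:

  spec:
    features:
      services:
      - cloudprober             # assumption: Cloudprober is toggled as an optional service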
Network port availability monitoring (Portprober)
Available since MOSK 24.2. Technology Preview.
The network port availability monitoring service (Portprober) is implemented
as an extension to the OpenStack Networking service (Neutron) that gets
enabled automatically together with the Cloudprober service described above.
Also, you can enable Portprober explicitly, regardless of whether Cloudprober
is enabled or not. To do so, specify the following structure in the
OpenStackDeployment custom resource:
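As a minimal sketch, assuming that Portprober is toggled as a Neutron
extension in the OpenStackDeployment custom resource, the relevant fragment
could look as follows:

  spec:
    features:
      neutron:
        extensions:
          portprober:
            enabled: true       # assumption: Portprober is enabled as a Neutron extension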
The Portprober service is supported only for the following cloud
configurations:
OpenStack version is Antelope or newer
Neutron OVS backend for networking (Tungsten Fabric and OVN backends are
not supported)
The Portprober agent automatically connects to all OpenStack virtual networks
and probes all the ports that are plugged in there and are in the bound
state, meaning they are associated with an instance or a network service.
The service makes no difference between private and external networks and also
reports the availability of the ports that belong to virtual routers.
The service relies on the ARP protocol to determine port availability and
does not require any security groups to be assigned to monitored instances,
as opposed to the Floating IP address monitoring service (Cloudprober).
Among the known limitations of the network port availability monitoring
service is the lack of support for IPv6. The service ignores the ports that
do not have IPv4 addresses associated with them.
StackLight logging indices are managed by OpenSearch data streams, which were
introduced in OpenSearch 2.6. Data streams are a convenient way to manage
insert-only pipelines such as log message collection. The solution consists
of the following elements:
Data stream objects that can be referred to as alias:
Audit - dedicated for Container Cloud, MKE, and host audit logs, ensuring
data integrity and security.
System - replaces Logstash for system logs, provides a streamlined
approach to log management.
Write index - current index where ingestion can be performed without
removing a data stream.
Read indices - indices created after the rollover mechanism is applied.
Rollover policy - creates a new write index for a data stream based on
the size of shards.
This section contains a collection of Mirantis OpenStack for Kubernetes
(MOSK) architecture blueprints that include common cluster
topology and configuration patterns that can be referred to when building a
MOSK cloud. Every blueprint is validated by Mirantis and
is known to work. You can use these blueprints alone or in combination,
although the interoperability of all possible combinations cannot be
guaranteed.
The section provides information on the target use cases, pros and cons of
every blueprint and outlines the extents of its applicability. However, do
not hesitate to reach out to Mirantis if you have any questions or doubts
on whether a specific blueprint can be applied when designing your cloud.
Although a classic cloud approach allows resources to be distributed across
multiple regions, it still needs powerful data centers to host control planes
and compute clusters. Such regional centralization poses challenges when the
number of data consumers grows. It becomes hard to access the resources hosted
in the cloud even though the resources are located in the same geographic
region. The solution would be to bring the data closer to the consumer.
And this is exactly what edge computing provides.
Edge computing is a paradigm that brings computation and data storage closer to
the sources of data or the consumer. It is designed to improve response time
and save bandwidth.
A few examples of use cases for edge computing include:
Hosting a video stream processing application on premises of a large stadium
during the Super Bowl match
Placing the inventory or augmented reality services directly in the
industrial facilities, such as a storage site, power plant, shipyard, and so on
A small compute node deployed in a remote village supermarket to
host an application for store automation and accounting
These and many other use cases could be solved by deploying multiple edge
clusters managed from a single central place. The idea of centralized
management plays a significant role for the business efficiency of the edge
cloud environment:
Cloud operators obtain a single management console for the cloud that
simplifies the Day-1 provisioning of new edge sites and Day-2 operations
across multiple geographically distributed points of presence
Cloud users get the ability to transparently connect their edge applications
with central databases or business logic components hosted in data centers
or public clouds
Depending on the size, location, and target use case, the points of presence
comprising an edge cloud environment can be divided into five major categories.
Mirantis OpenStack powered by Mirantis Container Cloud offers reference
architectures to address the centralized management in core and regional data
centers as well as edge sites.
Remote compute nodes is one of the approaches to the implementation of the
edge computing concept offered by MOSK. The topology
consists of a MOSK cluster residing in a data center,
which is extended with multiple small groups of compute nodes deployed in
geographically distanced remote sites. Remote compute nodes are integrated
into the MOSK cluster just like the nodes in the central
site with their configuration and life cycle managed through the same means.
Along with compute nodes, remote sites need to incorporate network gateway
components that allow application users to consume edge services directly
without looping the traffic through the central site.
Deployment of an edge cluster managed from a single central place starts with
a proper planning. This section provides recommendations on how to approach
the deployment design.
Compute nodes aggregation into availability zones
Mirantis recommends organizing nodes in each remote site into separate
Availability Zones in the MOSK Compute (OpenStack Nova),
Networking (OpenStack Neutron), and Block Storage (OpenStack Cinder)
services. This enables the cloud users to be aware of the failure domain
represented by a remote site and distribute the parts of their applications
accordingly.
Typically, high latency between the central control plane and remote sites
makes it infeasible to rely on Ceph as a storage for the instance
root/ephemeral and block data.
Mirantis recommends that you configure the remote sites to use the following
backends:
Local storage (LVM or QCOW2) as a storage backend for the
MOSK Compute service. See Image storage backend
for the configuration details.
LVM on iSCSI backend for the MOSK Block Storage service.
See Enable LVM block storage for the enablement procedure.
To maintain the small size of a remote site, the compute nodes need to be
hyper-converged and combine the compute and block storage functions.
There is no limitation on the number of the remote sites and their size.
However, when planning the cluster, ensure consistency between the total number
of nodes managed by a single control plane and the value of the size
parameter set in the OpenStackDeployment custom resource. For the list of
supported sizes, refer to Main elements.
Additionally, the sizing of the remote site needs to take into account the
characteristics of the networking channel with the main site.
Typically, an edge site consists of 3-7 compute nodes installed in a single,
usually rented, rack.
Mirantis recommends keeping the network latency between the main and remote
sites as low as possible. For stable interoperability of cluster components,
the latency needs to be around 30-70 milliseconds. Though, depending on the
cluster configuration and dynamism of the workloads running in the remote site,
the stability of the cluster can be preserved with the latency of up to 190
milliseconds.
The bandwidth of the communication channel between the main and remote sites
needs to be sufficient to run the following traffic:
The control plane and management traffic, such as OpenStack messaging,
database access, MOSK underlay Kubernetes cluster control
plane, and so on. A single remote compute node in the idle state requires at
minimum 1.5 Mbit/s of bandwidth to perform the non-data plane communications.
The data plane traffic, such as OpenStack image operations, instances VNC
console traffic, and so on, that heavily depend on the profile of the
workloads and other aspects of the cloud usage.
In general, Mirantis recommends having a minimum of 100 MBit/s bandwidth
between the main and remote sites.
MOSK remote compute nodes architecture is designed to
tolerate a temporary loss of connectivity between the main cluster and
the remote sites. In case of a disconnection, the instances running
on remote compute nodes will keep running normally preserving their
ability to read and write ephemeral and block storage data presuming it
is located in the same site, as well as connectivity to their neighbours
and edge application users. However, the instances will not have access
to any cloud services or applications located outside of their remote site.
Since the MOSK control plane communicates with remote
compute nodes through the same network channel, cloud users will not be able
to perform any manipulations, for example, instance creation, deletion,
snapshotting, and so on, over their edge applications until the
connectivity gets restored. MOSK services providing high
availability to cloud applications, such as the Instance HA service and Network
service, need to be connected to the remote compute nodes to perform a failover
of application components running in the remote site.
Once the connectivity between the main and the remote site restores, all
functions become available again. The period during which an edge application
can sustain normal function after a connectivity loss is determined by multiple
factors including the selected networking backend for the
MOSK cluster. Mirantis recommends that a cloud operator
performs a set of test manipulations over the cloud resources hosted in the
remote site to ensure that it has been fully restored.
When configured in Tungsten Fabric-powered clouds, the Graceful restart and long-lived graceful restart
feature significantly improves the MOSK ability to sustain
the connectivity of workloads running at remote sites in situations when
a site experiences a loss of connection to the central hosting location of
the control plane.
Extensive testing has demonstrated that remote sites can effectively withstand
a 72-hour control plane disconnection with zero impact on the running
applications.
Given that a remote site communicates with its main MOSK
cluster across a wide area network (WAN), it becomes important to protect
sensitive data from being intercepted and viewed by a third party.
Specifically, you should ensure the protection of the data belonging to the
following cloud components:
Bare metal servers provisioning and control, Kubernetes cluster deployment
and management, Mirantis StackLight telemetry
MOSK control plane
Communication between the components of OpenStack, Tungsten Fabric, and
Mirantis Ceph
MOSK data plane
Cloud application traffic
The most reliable way to protect the data is to configure the network equipment
in the data center and the remote site to encapsulate all the bypassing
remote-to-main communications into an encrypted VPN tunnel. Alternatively,
Mirantis Container Cloud and MOSK can be configured to force
encryption of specific types of network traffic, such as:
Kubernetes networking for MOSK underlying Kubernetes
cluster that handles the vast majority of in-MOSK
communications
OpenStack tenant networking that carries all the cloud application traffic
The ability to enforce traffic encryption depends on the specific version of
the Mirantis Container Cloud and MOSK in use, as well as
the selected SDN backend for OpenStack.
In MOSK, the main cloud that controls remote computes can be
the regional site that hosts the regional cluster and the
MOSK control plane. Additionally, it can contain local
storage and compute nodes.
The remote computes implementation in MOSK considers
Tungsten Fabric as an SDN solution.
The bare metal servers of remote compute nodes are configured as Kubernetes
workers hosting the deployments for:
The architecture validation is performed by means of simultaneous creation of
multiple OpenStack resources of various types and execution of functional tests
against each resource. The number of resources hosted in the cluster at the
moment when a certain threshold of non-operational resources starts being
observed is referred to below as the cluster capacity limit.
Note
A successfully created resource has the Active status in the API
and passes the functional tests, for example, its floating IP address is
accessible. The MOSK cluster is considered to be able to
handle the created resources if it successfully performs the LCM operations
including the OpenStack services restart, both on the control and data
plane.
Note
The key limiting factor for creating more OpenStack objects in this
illustrative setup is hardware resources (vCPU and RAM) available on the
compute nodes.
Persistent storage is a key component of any MOSK
deployment. Out of the box, MOSK includes an open-source
software-defined storage solution (Ceph), which hosts various kinds of
cloud application data, such as root and ephemeral disks for virtual machines,
virtual machine images, attachable virtual block storage, and object data.
In addition, a Ceph cluster usually acts as a storage for the internal
MOSK components, such as Kubernetes, OpenStack, StackLight,
and so on.
Being distributed and redundant by design, Ceph requires a certain minimum
amount of servers, also known as OSD or storage nodes, to work.
A production-grade Ceph cluster typically consists of at least nine storage
nodes, while a development and test environment may include four to six
servers. For details, refer to MOSK cluster hardware requirements.
It is possible to reduce the overall footprint of a MOSK
cluster by collocating the Ceph components with hypervisors on the same
physical servers; this is also known as hyper-converged design. However,
this architecture still may not satisfy the requirements of certain use cases
for the cloud.
Standalone telco-edge MOSK clouds typically consist of
three to seven servers hosted in a single rack, where every piece of CPU,
memory, and disk resources is strictly accounted for and is better dedicated
to the cloud workloads rather than the control plane. For such clouds,
where the cluster footprint is more important than the resiliency of
the application data storage, it makes sense either not to have a Ceph
cluster at all or to replace it with some primitive non-redundant solution.
Enterprise virtualization infrastructure with third-party storage is
not a rare strategy among large companies that rely on proprietary storage
appliances, provided by NetApp, Dell, HPE, Pure Storage, and other major
players in the data storage sector. These industry leaders offer a variety
of storage solutions meticulously designed to suit various enterprise demands.
Many companies, having already invested substantially in proprietary storage
infrastructure, prefer integrating MOSK with their existing
storage systems. This approach allows them to leverage this investment rather
than incurring new costs and logistical complexities associated with
migrating to Ceph.
The MOSK standard LVM+iSCSI backend for the Block
Storage service. This aligns seamlessly with the hyper-converged
design, wherein the LVM volumes are collocated on the compute nodes.
Local file system of one of the MOSK controller
nodes. By default, database backups are stored on the local file
system on the node where the MariaDB service is running. This imposes
a risk to cloud security and resiliency. For enterprise environments,
it is a common requirement to store all the backup data externally.
Alternatively, you can disable the database backup functionality.
Results of functional testing
OpenStack Tempest
Local file system of MOSK controller nodes.
The openstack-tempest-run-tests job responsible for running
the Tempest suite stores the results of its execution in a volume
requested through the pvc-tempest PersistentVolumeClaim
(PVC). The subject volume can be created by the local volume provisioner
on the same Kubernetes worker node, where the job runs. Usually, it is
a MOSK controller node.
You can configure the Block Storage service (OpenStack Cinder)
to be used as a storage backend for images and snapshots.
In this case, each image is represented as a volume.
Important
Representing images as volumes implies a hard
requirement for the selected block storage backend to support
the multi-attach capability, that is, concurrent reads and writes to
and from a single volume.
External S3, Swift, or any other third-party storage solutions
compatible with object access protocols.
Note
An external object storage solution will not be integrated
into the MOSK identity service (OpenStack
Keystone); the cloud applications will need to manage
access to their object data themselves.
If no Ceph is deployed as part of a cluster, the MOSK
built-in Object Storage service API endpoints are disabled automatically.
StackLight must be deployed in the HA mode, in which all its data is
stored on the local file system of the nodes running StackLight
services. In this mode, StackLight components are configured
to handle the data replication themselves.
The determination of whether a MOSK cloud will
include Ceph or not should take place during its planning and design
phase. Once the deployment is complete, reconfiguring the cloud
to switch between Ceph and non-Ceph architectures becomes impossible.
Mirantis recommends avoiding substitution of Ceph-backed persistent volumes
in the MOSK underlying Kubernetes cluster with local
volumes (local volume provisioner) for production environments.
MOSK does not support such configuration unless
the components that rely on these volumes can replicate
their data themselves, for example, StackLight. Volumes provided by
the local volume provisioner are not redundant, as they are bound
to just a single node and can only be mounted from the Kubernetes
pods running on the same nodes.
This section describes internal implementation of the node maintenance API
and how OpenStack and Tungsten Fabric controllers communicate with LCM and
each other during a managed cluster update.
The WorkloadLock objects are created by each Application Controller.
These objects prevent LCM from performing any changes on the cluster or node
level while the lock is in the active state. The inactive state of the lock
means that the Application Controller has finished its work and the LCM can
proceed with the node or cluster maintenance.
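For illustration, a minimal sketch of such a lock object is shown below. The API group, field names, and the node name are assumptions based on the description above, not an exact manifest from the product:
apiVersion: lcm.mirantis.com/v1alpha1   # assumed API group
kind: NodeWorkloadLock
metadata:
  name: openstack-example-node          # hypothetical name
spec:
  nodeName: example-node                # hypothetical node name
  controllerName: openstack
status:
  state: active                         # becomes inactive once the controller finishes its work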
The MaintenanceRequest objects are created by LCM. These objects notify
Application Controllers about the upcoming maintenance of a cluster or
a specific node.
ClusterMaintenanceRequest object example configuration
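A minimal hedged sketch of such an object, assuming the lcm.mirantis.com/v1alpha1 API group; the object name is a placeholder, and the scope values are explained below:
apiVersion: lcm.mirantis.com/v1alpha1   # assumed API group
kind: ClusterMaintenanceRequest
metadata:
  name: example-cluster                 # hypothetical name
spec:
  scope: os                             # drain or os, see below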
The scope parameter in the object specification defines the impact on
the managed cluster or node. The list of available options includes:
drain
A regular managed cluster update. Each node in the cluster
goes through a drain procedure. No node reboot takes place; the maximum
impact is a restart of services on the node, including Docker, which causes
the restart of all containers present in the cluster.
os
A node might be rebooted during the update. Triggers the workload
evacuation by the OpenStack Controller (Rockoon).
When the MaintenanceRequest object is created, an Application Controller
executes a handler to prepare workloads for maintenance and put appropriate
WorkloadLock objects into the inactive state.
When maintenance is over, LCM removes the MaintenanceRequest objects,
and the Application Controllers move their WorkloadLock objects into
the active state.
When LCM creates the ClusterMaintenanceRequest object, the OpenStack
Controller (Rockoon) ensures that all OpenStack components are in the
Healthy state, which means that the pods are up and running, and the
readiness probes are passing.
When LCM creates the NodeMaintenanceRequest, the OpenStack Controller:
Prepares components on the node for maintenance by removing
nova-compute from scheduling.
If the reboot of a node is possible, the instance migration workflow is
triggered. The Operator can configure the instance migration flow
through the Kubernetes node annotation and should define the required option
before the managed cluster update. For configuration details, refer to
Instance migration configuration for hosts.
Also, since MOSK 25.1, cloud users can mark their
instances for LCM to handle them individually during host maintenance
operations. This allows for greater flexibility during cluster updates,
especially for workloads that are sensitive to live migration. For
details, refer to Configure per-instance migration mode.
If the OpenStack Controller cannot migrate instances due to errors, the
node update is suspended until all instances are migrated manually or
the openstack.lcm.mirantis.com/instance_migration_mode annotation
is set to skip.
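For example, a hedged illustration of setting this annotation on a Kubernetes node with kubectl; the node name is a placeholder:
kubectl annotate node <node-name> openstack.lcm.mirantis.com/instance_migration_mode=skip --overwrite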
When the node maintenance is over, LCM removes the NodeMaintenanceRequest
object and the OpenStack Controller:
Verifies that the Kubernetes Node becomes Ready.
Verifies that all OpenStack components on a given node are Healthy,
which means that the pods are up and running, and the readiness probes
are passing.
Ensures that the OpenStack components are connected to RabbitMQ.
For example, the Neutron Agents become alive on the node, and compute
instances are in the UP state.
Note
The OpenStack Controller allows only one
NodeWorkloadLock object at a time to be in the inactive state. Therefore,
the update process for nodes is sequential.
The Tungsten Fabric (TF) Controller creates and uses both types of
workloadlocks that include ClusterWorkloadLock and NodeWorkloadLock.
When the ClusterMaintenanceRequest object is created, the TF Controller
verifies the TF cluster health status and proceeds as follows:
If the cluster is Ready, the TF Controller moves the
ClusterWorkloadLock object to the inactive state.
Otherwise, the TF Controller keeps the ClusterWorkloadLock object
in the active state.
When the NodeMaintenanceRequest object is created, the TF Controller
verifies the vRouter pod state on the corresponding node and proceeds as
follows:
If all containers are Ready, the TF Controller moves the
NodeWorkloadLock object to the inactive state.
Otherwise, the TF Controller keeps the NodeWorkloadLock in the active
state.
Note
If there is a NodeWorkloadLock object in the inactive state
present in the cluster, the TF Controller does not process the
NodeMaintenanceRequest object for other nodes until this inactive
NodeWorkloadLock object becomes active.
When the cluster LCM removes the MaintenanceRequest object, the TF
Controller waits for the vRouter pods to become ready and proceeds as follows:
If all containers are in the Ready state, the TF Controller moves
the NodeWorkloadLock object to the active state.
Otherwise, the TF Controller keeps the NodeWorkloadLock object in the
inactive state.
This section describes the MOSK cluster update
flow for the product releases that contain major changes and require a node
reboot, for example, support for a new Linux kernel.
The diagram below illustrates the sequence of operations controlled by
LCM that take place during the update under the hood. We assume that the
ClusterWorkloadLock and NodeWorkloadLock objects present in the cluster
are in the active state before the cloud operator triggers the update.
Cluster update flow
See also
For details about the Application Controllers flow during different
maintenance stages, refer to:
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, the openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality, but users should adopt the new naming for
pods and other related artifacts.
MOSK enables you to parallelize node update operations,
significantly improving the efficiency of your deployment. This capability
applies to any operation that utilizes the Node Maintenance API, such as
cluster updates or graceful node reboots.
The core implementation of parallel updates is handled by the LCM Controller
ensuring seamless execution of parallel operations. LCM starts performing an
operation on the node only when all NodeWorkloadLock objects for the node
are marked as inactive. By default, the LCM Controller creates one
NodeMaintenanceRequest at a time.
Each application controller, including Ceph, OpenStack, and Tungsten Fabric
Controllers, manages parallel NodeMaintenanceRequest objects independently.
The controllers determine how to handle and execute parallel node maintenance
requests based on specific requirements of their respective applications.
To understand the workflow of the Node Maintenance API, refer to
WorkloadLock objects.
You can optimize parallel updates by setting the order in which nodes are
updated. You can accomplish this by configuring upgradeIndex of
the Machine object. For the procedure, refer to
Change the upgrade order of a machine.
Increase parallelism.
Boost parallelism by adjusting the maximum number of worker node updates
that are allowed during LCM operations using the
spec.providerSpec.value.maxWorkerUpgradeCount configuration parameter,
which is set to 1 by default.
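For illustration, a minimal fragment of the Cluster object that sets this parameter; the value 3 is an arbitrary example:
spec:
  providerSpec:
    value:
      maxWorkerUpgradeCount: 3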
By default, the OpenStack Controller handles the NodeMaintenanceRequest
objects as follows:
Updates the OpenStack controller nodes sequentially (one by one).
Updates the gateway nodes sequentially. Technically, you can increase
the number of gateway node updates allowed in parallel using the
nwl_parallel_max_gateway parameter, but Mirantis does not recommend
doing so.
Updates the compute nodes in parallel. The default number of allowed
parallel updates is 30. You can adjust this value through
the nwl_parallel_max_compute parameter.
Parallelism considerations for compute nodes
When considering parallelism for compute nodes, take into account that
during certain pod restarts, for example, the openvswitch-vswitchd
pods, a brief instance downtime may occur. Select a suitable level
of parallelism to minimize the impact on workloads and prevent excessive
load on the control plane nodes.
If your cloud environment is distributed across failure domains, which are
represented by Nova availability zones, you can limit the parallel updates
of nodes to only those within the same availability zone. This behavior is
controlled by the respect_nova_az option in the OpenStack Controller.
The OpenStack Controller configuration is stored in the
rockoon-config configMap of the osh-system namespace.
The options are picked up automatically after update. To learn more about
the OpenStack Controller (Rockoon) configuration parameters,
refer to OpenStack Controller configuration.
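As a hedged illustration only, you can inspect and edit this ConfigMap with kubectl; the data layout below (key and section names) is an assumption, so refer to OpenStack Controller configuration for the authoritative format:
kubectl -n osh-system edit configmap rockoon-config

# Illustrative fragment of the settings discussed above (layout is an assumption):
# [maintenance]
# nwl_parallel_max_compute = 10
# respect_nova_az = false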
By default, the Ceph Controller handles the NodeMaintenanceRequest
objects as follows:
Updates the non-storage nodes sequentially. Non-storage nodes include all
nodes that have mon, mgr, rgw, or mds roles.
Updates storage nodes in parallel. The default number of allowed
parallel updates is calculated automatically based on the minimal
failure domain in a Ceph cluster.
Parallelism calculations for storage nodes
The Ceph Controller automatically calculates the parallelism number
in the following way:
Finds the minimal failure domain for a Ceph cluster. For example,
the minimal failure domain is rack.
Filters all currently requested nodes by the minimal failure domain.
For example, parallelism equals 5, and LCM requests 3 nodes from
the rack1 rack and 2 nodes from the rack2 rack.
Handles each filtered node group one by one. For example, the controller
handles in parallel all nodes from rack1 before processing nodes
from rack2.
The Ceph Controller handles non-storage nodes before the storage
ones. If there are node requests for both node types, the Ceph Controller
handles sequentially the non-storage nodes first. Therefore, Mirantis
recommends setting the upgrade index of a higher priority for the non-storage
nodes to decrease the total upgrade time.
If the minimal failure domain is host, the Ceph Controller updates only
one storage node per failure domain unit. This results in updating all Ceph
nodes sequentially, despite the potential for increased parallelism.
By default, the Tungsten Fabric Controller handles the
NodeMaintenanceRequest objects as follows:
Updates the Tungsten Fabric Controller and gateway nodes sequentially.
Updates the vRouter nodes in parallel. The Tungsten Fabric Controller
allows updating up to 30 vRouter nodes in parallel.
Maximum number of vRouter nodes in maintenance
While the Tungsten Fabric Controller has the capability to process up
to 30 NodeMaintenanceRequest objects targeted at vRouter nodes,
the actual number may be lower. This is due to a check that ensures
OpenStack readiness to unlock the relevant nodes for maintenance.
If OpenStack allows for maintenance, the Tungsten Fabric Controller
verifies the vRouter pods. Upon successful verification,
the NodeWorkloadLock object is switched to the maintenance mode.
Mirantis OpenStack for Kubernetes (MOSK) enables the operator to
create, scale, update, and upgrade OpenStack deployments on Kubernetes through
a declarative API.
The Kubernetes built-in features, such as flexibility, scalability, and
declarative resource definition make MOSK a robust solution.
The detailed plan of any Mirantis OpenStack for Kubernetes (MOSK)
deployment is determined on a per-cloud basis. For the MOSK
reference architecture and design overview, see Reference Architecture.
One of the industry best practices is to verify every new update or
configuration change in a non-customer-facing environment before
applying it to production. Therefore, Mirantis recommends
having a staging cloud, deployed and maintained along with the production
clouds. The recommendation is especially applicable to the environments
that:
Receive updates often and use continuous delivery. For example,
any non-isolated deployment of Mirantis Container Cloud.
Have significant deviations from the reference architecture or
third party extensions installed.
Are managed under the Mirantis OpsCare program.
Run business-critical workloads where even the slightest application
downtime is unacceptable.
A typical staging cloud is a complete copy of the production environment
including the hardware and software configurations, but with a bare minimum
of compute and storage capacity.
The bare metal management system enables the Infrastructure Operator to
deploy a Container Cloud management cluster on a set of bare metal servers.
It also enables Container Cloud to deploy MOSK clusters on
bare metal servers without a pre-provisioned operating system.
This section instructs you on how to provision and deploy a Container Cloud
management cluster.
Mirantis Container Cloud Bootstrap v2 provides the best user experience for
setting up Container Cloud. Using Bootstrap v2, you can provision and operate
management clusters using the required objects through the Container Cloud API.
Basic concepts and components of Bootstrap v2 include:
Bootstrap cluster
Bootstrap cluster is any kind-based Kubernetes cluster that contains a
minimal set of Container Cloud bootstrap components allowing the user to
prepare the configuration for management cluster deployment and start the
deployment. The list of these components includes:
Bootstrap Controller
Controller that is responsible for:
Configuration of a bootstrap cluster with provider charts through the
bootstrap Helm bundle.
Configuration and deployment of a management cluster and
its related objects.
Helm Controller
Operator that manages Helm chart releases. It installs the Container
Cloud bootstrap and provider charts configured in the bootstrap Helm
bundle.
Public API charts
Helm charts that contain custom resource definitions for Container Cloud
resources.
Admission Controller
Controller that performs mutations and validations for the Container
Cloud resources including cluster and machines configuration.
Currently, one bootstrap cluster can be used to deploy only one
management cluster. To add a new management cluster with
different settings, a new bootstrap cluster must be created from scratch.
Bootstrap region
BootstrapRegion is the first object to create in the bootstrap cluster
for the Bootstrap Controller to identify and install provider components
onto the bootstrap cluster. Afterward, the user can prepare and deploy a
management cluster with related resources.
The bootstrap region is a starting point for the cluster deployment. The
user needs to approve the BootstrapRegion object. Otherwise, the
Bootstrap Controller will not be triggered for the cluster deployment.
Bootstrap Helm bundle
Helm bundle that contains charts configuration for the bootstrap cluster.
This object is managed by the Bootstrap Controller that updates the provider
bundle in the BootstrapRegion object. The Bootstrap Controller always
configures provider charts listed in the regional section of the
Container Cloud release for the provider. Depending on the cluster
configuration, the Bootstrap Controller may update or reconfigure this
bundle even after the cluster deployment starts. For example, the Bootstrap
Controller enables the provider in the bootstrap cluster only after the
bootstrap region is approved for the deployment.
Management cluster deployment consists of several sequential stages.
Each stage finishes when a specific condition is met or specific configuration
applies to a cluster or its machines.
In case of issues at any deployment stage, you can identify the problem
and fix it on the fly. The cluster deployment does not abort until all
stages complete, thanks to the infinite-timeout option enabled
by default in Bootstrap v2.
Infinite timeout prevents the bootstrap failure due to timeout. This option
is useful in the following cases:
The network speed is slow for artifacts downloading
The infrastructure configuration does not allow fast booting
The inspection of a bare metal node presupposes more than two HDD/SATA disks
attached to a machine
You can track the status of each stage in the bootstrapStatus section of
the Cluster object that is updated by the Bootstrap Controller.
The Bootstrap Controller starts deploying the cluster after you approve the
BootstrapRegion configuration.
The following table describes deployment states of a management cluster that
apply in the strict order.
Verifies proxy configuration in the Cluster object.
If the bootstrap cluster was created without a proxy, no actions are
applied to the cluster.
2
ClusterSSHConfigured
Verifies SSH configuration for the cluster and machines.
You can provide any number of SSH public keys, which are added to
cluster machines. But the Bootstrap Controller always adds the
bootstrap-key SSH public key to the cluster configuration. The
Bootstrap Controller uses this SSH key to manage the lcm-agent
configuration on cluster machines.
The bootstrap-key SSH key is copied to a
bootstrap-key-<clusterName> object containing the cluster name in
its name.
3
ProviderUpdatedInBootstrap
Synchronizes the provider and settings of its components between the
Cluster object and bootstrap Helm bundle. Settings provided in
the cluster configuration have higher priority than the default
settings of the bootstrap cluster, except CDN.
4
ProviderEnabledInBootstrap
Enables the provider and its components if any were disabled by the
Bootstrap Controller during preparation of the bootstrap region.
A cluster and machines deployment starts after the provider enablement.
5
Nodes readiness
Waits for the provider to complete nodes deployment that comprises VMs
creation and MKE installation.
6
ObjectsCreated
Creates required namespaces and IAM secrets.
7
ProviderConfigured
Verifies the provider configuration in the provisioned cluster.
8
HelmBundleReady
Verifies the Helm bundle readiness for the provisioned cluster.
9
ControllersDisabledBeforePivot
Collects the list of deployment controllers and disables them to
prepare for pivot.
10
PivotDone
Moves all cluster-related objects from the bootstrap cluster to the
provisioned cluster. The copies of Cluster and Machine objects
remain in the bootstrap cluster to provide the status information to the
user. About every minute, the Bootstrap Controller reconciles the status
of the Cluster and Machine objects of the provisioned cluster
to the bootstrap cluster.
11
ControllersEnabledAfterPivot
Enables controllers in the provisioned cluster.
12
MachinesLCMAgentUpdated
Updates the lcm-agent configuration on machines to target LCM
agents to the provisioned cluster.
13
HelmControllerDisabledBeforeConfig
Disables the Helm Controller before reconfiguration.
14
HelmControllerConfigUpdated
Updates the Helm Controller configuration for the provisioned cluster.
15
Cluster readiness
Contains information about the global cluster status. The Bootstrap
Controller verifies that OIDC, Helm releases, and all Deployments are
ready. Once the cluster is ready, the Bootstrap Controller stops
managing the cluster.
The setup of a bootstrap cluster comprises preparation of the seed node,
configuration of environment variables, acquisition of the Container Cloud
license file, and execution of the bootstrap script.
Install basic Ubuntu 22.04 server using standard installation images
of the operating system on the bare metal seed node.
Log in to the seed node that is running Ubuntu 22.04.
Configure the operating system and network:
Operating system and network configuration
Establish a virtual bridge using an IP address of the PXE network on the
seed node. Use the following netplan-based configuration file
as an example:
# cat /etc/netplan/config.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: false
      dhcp6: false
  bridges:
    br0:
      addresses:
        # Replace with IP address from PXE network to create a virtual bridge
        - 10.0.0.15/24
      dhcp4: false
      dhcp6: false
      # Adjust for your environment
      gateway4: 10.0.0.1
      interfaces:
        # Interface name may be different in your environment
        - ens3
      nameservers:
        addresses:
          # Adjust for your environment
          - 8.8.8.8
      parameters:
        forward-delay: 4
        stp: false
Apply the new network configuration using netplan:
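For example, run the standard netplan command (sudo may be required depending on your user):
sudo netplan apply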
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
To verify that Docker is configured correctly and has access to Container
Cloud CDN:
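One possible check, assuming the binary.mirantis.com CDN endpoint, is to pull and run a small image that queries the CDN:
docker run --rm alpine sh -c "apk add --no-cache curl && curl -sSI https://binary.mirantis.com"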
Verify that the seed node has direct access to the Baseboard
Management Controller (BMC) of each bare metal host. All target
hardware nodes must be in the poweroff state.
KAAS_BM_PXE_IP
The provisioning IP address in the PXE network. This address will be
assigned on the seed node to the interface defined by the
KAAS_BM_PXE_BRIDGE parameter described below. The PXE service
of the bootstrap cluster uses this address to network boot
bare metal hosts.
172.16.59.5
KAAS_BM_PXE_MASK
The PXE network address prefix length to be used with the
KAAS_BM_PXE_IP address when assigning it to the seed node
interface.
24
KAAS_BM_PXE_BRIDGE
The PXE network bridge name that must match the name of the bridge
created on the seed node during preparation of the system and
network configuration described earlier in this procedure.
br0
Optional. Configure proxy settings to bootstrap the cluster using proxy:
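A hedged example of the conventional proxy environment variables to export on the seed node before running the bootstrap script; the proxy URL and exclusions are placeholders:
export HTTP_PROXY=http://proxy.example.com:3128
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=10.0.0.0/24,localhost,127.0.0.1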
After the bootstrap cluster is set up, the bootstrap-proxy object is
created with the provided proxy settings. You can use this object later for
the Cluster object configuration.
Deploy the bootstrap cluster:
./bootstrap.sh bootstrapv2
Make sure that port 80 is open for localhost on the seed node and is not
blocked by security restrictions.
This section contains an overview of the cluster-related objects along with
the configuration procedure of these objects during deployment of a
management cluster using Bootstrap v2 through the Container Cloud API.
Overview of the cluster-related objects in the Container Cloud API/CLI
The following cluster-related objects are available through the Container
Cloud API. Use these objects to deploy a management cluster using the
Container Cloud API.
BootstrapRegion
Region and provider names for a management cluster and all related
objects. First object to create in the bootstrap cluster. For
the bootstrap region definition, see Introduction.
SSHKey
Optional. SSH configuration with any number of SSH public keys to be
added to cluster machines.
By default, any bootstrap cluster has a pregenerated bootstrap-key
object to use for the cluster configuration. This is the service SSH key
used by the Bootstrap Controller to access machines for their
deployment. The private part of bootstrap-key is always saved to
kaas-bootstrap/ssh_key.
Proxy
Proxy configuration. Mandatory for offline environments with no direct
access to the Internet. Such configuration usually contains proxy for
the bootstrap cluster and already has the bootstrap-proxy object
to use in the cluster configuration by default.
Machine
Machine configuration that must fit the following requirements:
Role - only manager
Number - odd for the management cluster HA
Mandatory labels - provider and cluster-name
ServiceUser
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
BareMetalHost: Private API since MCC 2.29.0 (16.4.0)
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
L2Template
Advanced host networking configuration for clusters, which enables, for
example, creation of bond interfaces on top of physical interfaces on the
host or the use of multiple subnets to separate different types of network
traffic. For details, see Container Cloud API Reference: L2Template.
MetalLBConfigTemplate: Unsupported since MCC 2.28.0 (16.3.0)
Deprecated in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0)
and unsupported since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). Before Container Cloud 2.27.0, the default object for the MetalLB
configuration, which enables the use of Subnet objects to define MetalLB
IP address pools. For details, see Container Cloud API Reference: MetalLBConfigTemplate.
The following procedure describes how to prepare and deploy a management
cluster using Bootstrap v2 by operating YAML templates available in the
kaas-bootstrap/templates/ folder.
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying objects containing credentials.
Such Container Cloud objects include:
BareMetalHostCredential
ClusterOIDCConfiguration
License
Proxy
ServiceUser
TLSConfig
Therefore, do not use kubectl apply on these objects.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on these objects, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the objects using kubectl edit.
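For instance, a hedged illustration using a file generated from the templates; the file name is an example:
kubectl create -f serviceusers.yaml
# If kubectl apply was used earlier, remove the annotation interactively:
kubectl edit -f serviceusers.yaml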
Create the BootstrapRegion object by modifying
bootstrapregion.yaml.template.
Configuration of bootstrapregion.yaml.template
Set provider: baremetal and use the default <regionName>,
which is region-one.
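A minimal sketch of the resulting object, assuming the kaas.mirantis.com/v1alpha1 API group used by other Container Cloud resources:
apiVersion: kaas.mirantis.com/v1alpha1   # assumed API group
kind: BootstrapRegion
metadata:
  name: region-one
  namespace: default
spec:
  provider: baremetal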
Create the ServiceUser object by modifying
serviceusers.yaml.template.
Configuration of serviceusers.yaml.template
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
Inspect the default bare metal host profile definition in
baremetalhostprofiles.yaml.template and adjust it to
fit your hardware configuration. For details,
see Customize the default bare metal host profile.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
In baremetalhostinventory.yaml.template, update the
bare metal host definitions according to your environment
configuration. Use the reference table below to manually set all
parameters that start with SET_.
Mandatory parameters for a bare metal host template
The MAC address of the first master node in the PXE network.
ac:1f:6b:02:84:71
SET_MACHINE_0_BMC_ADDRESS
The IP address of the BMC endpoint for the first master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The MAC address of the second master node in the PXE network.
ac:1f:6b:02:84:72
SET_MACHINE_1_BMC_ADDRESS
The IP address of the BMC endpoint for the second master node
in the cluster. Must be an address from the OOB network that is
accessible through the management network gateway.
The MAC address of the third master node in the PXE network.
ac:1f:6b:02:84:73
SET_MACHINE_2_BMC_ADDRESS
The IP address of the BMC endpoint for the third master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The parameter requires a user name and password in plain
text.
Configure cluster network:
Important
Bootstrap V2 supports only separated PXE and LCM networks.
Update the network object definition in ipam-objects.yaml.template
according to the environment configuration. By default, this template
implies the use of separate PXE and life-cycle management (LCM) networks.
Manually set all parameters that start with SET_.
To ensure successful bootstrap, enable asymmetric routing on the interfaces
of the management cluster nodes. This is required because the seed node relies
on one network by default, which can potentially cause traffic asymmetry.
In the kernelParameters section of baremetalhostprofiles.yaml.template,
set rp_filter to 2. This enables loose mode as defined in
RFC3704.
Example configuration of asymmetric routing
...
kernelParameters:
  ...
  sysctl:
    # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
    # Enables the "Loose mode" for the "bond0" interface (PXE network)
    net.ipv4.conf.bond0.rp_filter: "2"
...
Note
More complicated solutions that are not described in this manual
include getting rid of traffic asymmetry, for example:
Configure source routing on management cluster nodes.
Plug the seed node into the same networks as the management cluster nodes,
which requires custom configuration of the seed node.
For configuration details of bond network interface for the PXE and
management network, see Configure NIC bonding.
Example of the default L2 template snippet for a management
cluster
In this example, the following configuration applies:
A bond of two NIC interfaces
A static address in the PXE network set on the bond
An isolated L2 segment for the LCM network is configured using
the k8s-lcm VLAN with the static address in the LCM network
The default gateway address is in the LCM network
For general concepts of configuring separate PXE and LCM networks for
a management cluster, see Separate PXE and management networks. For current object
templates and variable names to use, see the following tables.
Network parameters mapping overview
Deployment file name
Parameters list to update manually
ipam-objects.yaml.template
SET_LB_HOST
SET_MGMT_ADDR_RANGE
SET_MGMT_CIDR
SET_MGMT_DNS
SET_MGMT_NW_GW
SET_MGMT_SVC_POOL
SET_PXE_ADDR_POOL
SET_PXE_ADDR_RANGE
SET_PXE_CIDR
SET_PXE_SVC_POOL
SET_VLAN_ID
bootstrap.env
KAAS_BM_PXE_IP
KAAS_BM_PXE_MASK
KAAS_BM_PXE_BRIDGE
Mandatory network parameters of the IPAM object template
The following table contains examples of mandatory parameter values to
set in ipam-objects.yaml.template for the network scheme that has the
following networks:
172.16.59.0/24 - PXE network
172.16.61.0/25 - LCM network
Parameter
Description
Example value
SET_PXE_CIDR
The IP address of the PXE network in the CIDR notation. The minimum
recommended network size is 256 addresses (/24 prefix length).
172.16.59.0/24
SET_PXE_SVC_POOL
The IP address range to use for endpoints of load balancers in the PXE
network for the Container Cloud services: Ironic-API, DHCP server,
HTTP server, and caching server. The minimum required range size is
5 addresses.
172.16.59.6-172.16.59.15
SET_PXE_ADDR_POOL
The IP address range in the PXE network to use for dynamic address
allocation for hosts during inspection and provisioning.
The minimum recommended range size is 30 addresses for management
cluster nodes if it is located in a separate PXE network segment.
Otherwise, it depends on the number of managed cluster nodes to
deploy in the same PXE network segment as the management cluster nodes.
172.16.59.51-172.16.59.200
SET_PXE_ADDR_RANGE
The IP address range in the PXE network to use for static address
allocation on each management cluster node. The minimum recommended
range size is 6 addresses.
172.16.59.41-172.16.59.50
SET_MGMT_CIDR
The IP address of the LCM network for the management cluster
in the CIDR notation.
If managed clusters will have their separate LCM networks, those
networks must be routable to the LCM network. The minimum
recommended network size is 128 addresses (/25 prefix length).
172.16.61.0/25
SET_MGMT_NW_GW
The default gateway address in the LCM network. This gateway
must provide access to the OOB network of the Container Cloud cluster
and to the Internet to download the Mirantis artifacts.
172.16.61.1
SET_LB_HOST
The IP address of the externally accessible MKE API endpoint
of the cluster in the CIDR notation. This address must be within
the management SET_MGMT_CIDR network but must NOT overlap
with any other addresses or address ranges within this network.
External load balancers are not supported.
172.16.61.5/32
SET_MGMT_DNS
An external (non-Kubernetes) DNS server accessible from the
LCM network.
8.8.8.8
SET_MGMT_ADDR_RANGE
The IP address range that includes addresses to be allocated to
bare metal hosts in the LCM network for the management cluster.
When this network is shared with managed clusters, the size of this
range limits the number of hosts that can be deployed in all clusters
sharing this network.
When this network is solely used by a management cluster, the range
must include at least 6 addresses for bare metal hosts of the
management cluster.
172.16.61.30-172.16.61.40
SET_MGMT_SVC_POOL
The IP address range to use for the externally accessible endpoints
of load balancers in the LCM network for the Container Cloud
services, such as Keycloak, web UI, and so on. The minimum required
range size is 19 addresses.
172.16.61.10-172.16.61.29
SET_VLAN_ID
The VLAN ID used for isolation of LCM network. The
bootstrap.sh process and the seed node must have routable
access to the network in this VLAN.
3975
While using separate PXE and LCM networks, the management cluster
services are exposed in different networks using two separate MetalLB
address pools:
Services exposed through the PXE network are as follows:
Ironic API as a bare metal provisioning server
HTTP server that provides images for network boot and server
provisioning
Caching server for accessing the Container Cloud artifacts
deployed on hosts
Services exposed through the LCM network are all other
Container Cloud services, such as Keycloak, web UI, and so on.
The default MetalLB configuration described in the MetalLBConfig
object template of metallbconfig.yaml.template uses two separate MetalLB
address pools. Also, it uses the interfaces selector in its
l2Advertisements template.
Caution
When you change the L2Template object template in
ipam-objects.yaml.template, ensure that interfaces
listed in the interfaces field of the
MetalLBConfig.spec.l2Advertisements section match those used in
your L2Template. For details about the interfaces selector,
see Container Cloud API Reference: MetalLBConfig spec.
Update the cluster-related settings to fit your deployment.
Optional. Technology Preview. Deprecated since Container Cloud 2.29.0
(Cluster release 16.4.0). Enable WireGuard for traffic encryption on
the Kubernetes workloads network.
WireGuard configuration
Ensure that the Calico MTU size is at least 60 bytes smaller than the
interface MTU size of the workload network. IPv4 WireGuard uses a
60-byte header. For details, see Set the MTU size for Calico.
In cluster.yaml.template, enable WireGuard by adding
the secureOverlay parameter:
spec:
  ...
  providerSpec:
    value:
      ...
      secureOverlay: true
Caution
Changing this parameter on a running cluster causes a
downtime that can vary depending on the cluster size.
Adjust spec and labels sections of each entry according to your
deployment.
Adjust the spec.providerSpec.value.hostSelector values to match
BareMetalHostInventory corresponding to each machine. For details,
see Container Cloud API Reference: Machine spec.
Monitor the inspecting process of the baremetal hosts and wait until all
hosts are in the available state:
kubectl get bmh -o go-template='{{- range .items -}} {{.status.provisioning.state}}{{"\n"}} {{- end -}}'
Example of system response:
available
available
available
Monitor the BootstrapRegion object status and wait until it is ready.
For a more user-friendly system response, consider using dedicated tools
such as jq or yq and adjust the -o flag to output
in the json or yaml format accordingly.
Change the directory to /kaas-bootstrap/.
Approve the BootstrapRegion object to start the cluster deployment:
./container-cloud bootstrap approve all
Caution
Once you approve the BootstrapRegion object, no cluster or
machine modification is allowed.
Warning
Do not manually restart or power off any of the bare metal
hosts during the bootstrap process.
Not all Swarm and MCR addresses are usually in use. One Swarm Ingress
network is created by default and occupies the 10.0.0.0/24 address
block. Also, three MCR networks are created by default and occupy
three address blocks: 10.99.0.0/20, 10.99.16.0/20,
10.99.32.0/20.
To verify the actual networks state and addresses in use, run:
docker network ls
docker network inspect <networkName>
Optional. If you plan to use multiple L2 segments for provisioning of
managed cluster nodes, consider the requirements specified in
Configure multiple DHCP address ranges.
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
Before adding new BareMetalHostInventory objects, configure hardware hosts
to correctly boot them over the PXE network.
Important
Consider the following common requirements for hardware hosts
configuration:
Update firmware for BIOS and Baseboard Management Controller (BMC) to the
latest available version, especially if you are going to apply the UEFI
configuration.
Container Cloud uses the ipxe.efi binary loader that might not be
compatible with old firmware and might have vendor-related issues with UEFI
booting. For example, the Supermicro issue.
In this case, we recommend using the legacy booting format.
Configure all NICs, or at least the PXE NIC, on the switches.
If the hardware host has more than one PXE NIC to boot, we strongly
recommend setting up only one in the boot order. It speeds up the
provisioning phase significantly.
Some hardware vendors require a host to be rebooted during BIOS
configuration changes from legacy to UEFI or vice versa for the
extra option with NIC settings to appear in the menu.
Connect only one Ethernet port on a host to the PXE network at any given
time. Collect the physical address (MAC) of this interface and use it to
configure the BareMetalHostInventory object describing the host.
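As a hedged illustration, the collected MAC address typically ends up in the boot MAC field of the host definition; the field path and names below are assumptions, so verify them against baremetalhostinventory.yaml.template:
kind: BareMetalHostInventory
metadata:
  name: master-0                        # hypothetical host name
spec:
  bootMACAddress: ac:1f:6b:02:84:71     # MAC of the NIC connected to the PXE network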
To configure BIOS on a bare metal host:
Legacy hardware host configuration
Enable the global BIOS mode using
BIOS > Boot > boot mode select > legacy. Reboot the host
if required.
Enable the LAN-PXE-OPROM support using the following menus:
This section provides description of the bare metal host profile settings and
provides instructions on how to configure this profile before deploying
Mirantis Container Cloud on physical servers.
The bare metal host profile is a Kubernetes custom resource. It allows the
infrastructure operator to define how the storage devices and the operating
system are provisioned and configured.
The bootstrap templates for a bare metal deployment include the template for
the default BareMetalHostProfile object in the following file that defines
the default bare metal host profile:
templates/bm/baremetalhostprofiles.yaml.template
Note
Using BareMetalHostProfile, you can configure LVM or mdadm-based
software RAID support during a management or managed cluster creation. For
details, see Configure RAID support.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The customization procedure of BareMetalHostProfile is almost the same for
the management and managed clusters, with the following differences:
For a management cluster, the customization automatically applies to machines
during bootstrap. And for a managed cluster, you apply the changes using
kubectl before creating a managed cluster.
For a management cluster, you edit the default
baremetalhostprofiles.yaml.template. And for a managed cluster, you
create a new BareMetalHostProfile with the necessary configuration.
For the procedure details, see Create a custom bare metal host profile.
Use this procedure for both types of clusters considering the differences
described above.
You can configure L2 templates for the management cluster to set up a bond
network interface for the PXE and management network.
This configuration must be applied to the bootstrap templates, before you run
the bootstrap script to deploy the management cluster.
Configuration requirements for NIC bonding
Add at least two physical interfaces to each host in your management
cluster.
Connect at least two interfaces per host to an Ethernet switch that supports
Link Aggregation Control Protocol (LACP) port groups and LACP fallback.
Configure an LACP group on the ports connected to the NICs of a host.
Configure the LACP fallback on the port group to ensure that the host can
boot over the PXE network before the bond interface is set up on the host
operating system.
Configure server BIOS for both NICs of a bond to be PXE-enabled.
If the server does not support booting from multiple NICs, configure the
port of the LACP group that is connected to the PXE-enabled NIC of a server
to be the primary port. With this setting, the port becomes active in the
fallback mode.
Configure the ports that connect servers to the PXE network with the PXE
VLAN as native or untagged.
To configure a bond interface that aggregates two interfaces
for the PXE and management network:
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Verify that only the following parameters for the declaration of
{{ nic 0 }} and {{ nic 1 }} are set, as shown in the example below:
dhcp4
dhcp6
match
set-name
Remove other parameters.
Verify that the declaration of the bond interface bond0 has the
interfaces parameter listing both Ethernet interfaces.
Verify that the node address in the PXE network (ip "bond0:mgmt-pxe"
in the example below) is bound to the bond interface or to the virtual
bridge interface tied to that bond.
Caution
No VLAN ID must be configured for the PXE network from the
host side.
Configure bonding options using the parameters field. The only
mandatory option is mode. See the example below for details.
Note
You can set any mode supported by
netplan
and your hardware.
Important
Bond monitoring is disabled in Ubuntu by default. However,
Mirantis highly recommends enabling it using the Media Independent Interface
(MII) monitoring by setting the mii-monitor-interval parameter to a
non-zero value. For details, see Linux documentation: bond monitoring.
Verify your configuration using the following example:
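A minimal npTemplate fragment for such a bond, assuming the 802.3ad (LACP) mode; adjust the mode, monitoring interval, and addresses for your environment:
bonds:
  bond0:
    interfaces:
      - {{ nic 0 }}
      - {{ nic 1 }}
    parameters:
      mode: 802.3ad
      mii-monitor-interval: 100
    dhcp4: false
    dhcp6: false
    addresses:
      # static address in the PXE/management network
      - {{ ip "bond0:mgmt-pxe" }}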
This section describes how to configure a dedicated PXE network for a
management bare metal cluster.
A separate PXE network allows isolating sensitive bare metal provisioning
process from the end users. The users still have access to Container Cloud
services, such as Keycloak, to authenticate workloads in managed clusters,
such as Horizon in a Mirantis OpenStack for Kubernetes cluster.
Note
This additional configuration procedure must be completed as part
of the main Deploy a management cluster using CLI procedure. It substitutes or appends
some configuration parameters and templates that are used in the main
procedure for the management cluster to use two networks, PXE and
management, instead of one PXE/management network. Mirantis recommends
considering the main procedure first.
The following table describes the overall network mapping scheme with all
L2/L3 parameters, for example, for two networks, PXE (CIDR 10.0.0.0/24)
and management (CIDR 10.0.11.0/24):
When using separate PXE and management networks, the management cluster
services are exposed in different networks using two separate MetalLB
address pools:
Services exposed through the PXE network are as follows:
Ironic API as a bare metal provisioning server
HTTP server that provides images for network boot and server provisioning
Caching server for accessing the Container Cloud artifacts deployed on
hosts
Services exposed through the management network are all other Container Cloud
services, such as Keycloak, web UI, and so on.
To configure separate PXE and management networks:
To ensure successful bootstrap, enable asymmetric routing on the interfaces
of the management cluster nodes. This is required because the seed node relies
on one network by default, which can potentially cause traffic asymmetry.
In the kernelParameters section of baremetalhostprofiles.yaml.template,
set rp_filter to 2. This enables loose mode as defined in
RFC3704.
Example configuration of asymmetric routing
...
kernelParameters:
  ...
  sysctl:
    # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
    # Enables the "Loose mode" for the "bond0" interface (PXE network)
    net.ipv4.conf.bond0.rp_filter: "2"
...
Note
More complicated solutions that are not described in this manual
include getting rid of traffic asymmetry, for example:
Configure source routing on management cluster nodes.
Plug the seed node into the same networks as the management cluster nodes,
which requires custom configuration of the seed node.
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Substitute all Subnet object templates with the new ones as described
in the example template below
Update the L2 template spec.l3Layout and spec.npTemplate fields
as described in the example template below
Example of the Subnet object templates
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas-mgmt-pxe-subnet: ""
spec:
  cidr: SET_IPAM_CIDR
  gateway: SET_PXE_NW_GW
  nameservers:
    - SET_PXE_NW_DNS
  includeRanges:
    - SET_IPAM_POOL_RANGE
  excludeRanges:
    - SET_METALLB_PXE_ADDR_POOL
---
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the management network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-lcm
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas-mgmt-lcm-subnet: ""
    ipam/SVC-k8s-lcm: "1"
    ipam/SVC-ceph-cluster: "1"
    ipam/SVC-ceph-public: "1"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: {{SET_LCM_CIDR}}
  includeRanges:
    - {{SET_LCM_RANGE}}
  excludeRanges:
    - SET_LB_HOST
    - SET_METALLB_ADDR_POOL
---
# Deprecated since 2.27.0. Subnet object that provides configuration
# for "services-pxe" MetalLB address pool that will be used to expose
# services LB endpoints in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe-lb
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    metallb/address-pool-name: services-pxe
    metallb/address-pool-protocol: layer2
    metallb/address-pool-auto-assign: "false"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: SET_IPAM_CIDR
  includeRanges:
    - SET_METALLB_PXE_ADDR_POOL
Deprecated since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0): the last Subnet template named mgmt-pxe-lb in the example
above will be used to configure the MetalLB address pool in the PXE network.
The bare metal provider will automatically configure MetalLB
with address pools using the Subnet objects identified by specific
labels.
Warning
The bm-pxe address must have a separate interface
with only one address on this interface.
Verify the current MetalLB configuration that is stored in MetalLB
objects:
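A hedged way to verify it, assuming the standard metallb-system namespace and MetalLB CRD names:
kubectl -n metallb-system get ipaddresspools,l2advertisements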
The auto-assign parameter will be set to false for all address
pools except the default one. So, a particular service will get an
address from such an address pool only if the Service object has a
special metallb.universe.tf/address-pool annotation that points to
the specific address pool name.
Note
It is expected that every Container Cloud service on a management
cluster will be assigned to one of the address pools. Current
consideration is to have two MetalLB address pools:
services-pxe is a reserved address pool name to use for the
Container Cloud services in the PXE network (Ironic API, HTTP server,
caching server).
The bootstrap cluster also uses the services-pxe address pool for
its provision services for management cluster nodes to be provisioned
from the bootstrap cluster. After the management cluster is deployed,
the bootstrap cluster is deleted and that address pool is solely used
by the newly deployed cluster.
default is an address pool to use for all other Container Cloud
services in the management network. No annotation is required on the
Service objects in this case.
In addition to the network parameters defined in Deploy a management cluster using CLI,
configure the following ones by replacing them in
templates/bm/ipam-objects.yaml.template:
SET_LCM_CIDR
Address of a management network for the management cluster
in the CIDR notation. You can later share this network with managed
clusters where it will act as the LCM network.
If managed clusters have their separate LCM networks,
those networks must be routable to the management network.
10.0.11.0/24
SET_LCM_RANGE
Address range that includes addresses to be allocated to
bare metal hosts in the management network for the management
cluster. When this network is shared with managed clusters,
the size of this range limits the number of hosts that can be
deployed in all clusters that share this network.
When this network is solely used by a management cluster,
the range should include at least 3 IP addresses
for bare metal hosts of the management cluster.
10.0.11.100-10.0.11.109
SET_METALLB_PXE_ADDR_POOL
Address range to be used for LB endpoints of the Container Cloud
services: Ironic-API, HTTP server, and caching server.
This range must be within the PXE network.
The minimum required range is 5 IP addresses.
10.0.0.61-10.0.0.70
The following parameters will now be tied to the management network
while their meaning remains the same as described in
Deploy a management cluster using CLI:
Subnet template parameters migrated to management network
Parameter
Description
Example value
SET_LB_HOST
IP address of the externally accessible API endpoint
of the management cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but within the
management network. External load balancers are not supported.
10.0.11.90
SET_METALLB_ADDR_POOL
The address range to be used for the externally accessible LB
endpoints of the Container Cloud services, such as Keycloak, web UI,
and so on. This range must be within the management network.
The minimum required range is 19 IP addresses.
To facilitate multi-rack and other types of distributed bare metal datacenter
topologies, the dnsmasq DHCP server used for host provisioning in Container
Cloud supports working with multiple L2 segments through network routers that
support DHCP relay.
Container Cloud has its own DHCP relay running on one of the management
cluster nodes. That DHCP relay serves for proxying DHCP requests in the
same L2 domain where the management cluster nodes are located.
Caution
Networks used for hosts provisioning of a managed cluster must
have routes to the PXE network of the management cluster. This configuration
enables hosts to have access to the management cluster services that are
used during host provisioning.
Management cluster nodes must have routes through the PXE network to PXE
network segments used on a managed cluster. The following example contains
L2 template fragments for a management cluster node:
Configuration example extract
l3Layout:
  # PXE/static subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-pxe
    labelSelector:
      kaas-mgmt-pxe-subnet: "1"
  # management (LCM) subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-lcm
    labelSelector:
      kaas-mgmt-lcm-subnet: "1"
  # PXE/dhcp subnets for a managed cluster
  - scope: namespace
    subnetName: managed-dhcp-rack-1
  - scope: namespace
    subnetName: managed-dhcp-rack-2
  - scope: namespace
    subnetName: managed-dhcp-rack-3
  ...
npTemplate: |
  ...
  bonds:
    bond0:
      interfaces:
        - {{ nic 0 }}
        - {{ nic 1 }}
      parameters:
        mode: active-backup
        primary: {{ nic 0 }}
        mii-monitor-interval: 100
      dhcp4: false
      dhcp6: false
      addresses:
        # static address on management node in the PXE network
        - {{ ip "bond0:kaas-mgmt-pxe" }}
      routes:
        # routes to managed PXE network segments
        - to: {{ cidr_from_subnet "managed-dhcp-rack-1" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-2" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-3" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
  ...
To configure DHCP ranges for dnsmasq, create the Subnet objects
tagged with the ipam/SVC-dhcp-range label while setting up subnets
for a managed cluster using CLI.
Caution
Support of multiple DHCP ranges has the following limitations:
Using custom DNS server addresses for servers that boot over PXE
is not supported.
The Subnet objects for DHCP ranges cannot be associated with any
specific cluster, as the DHCP server configuration is only applicable to the
management cluster where the DHCP server is running.
The cluster.sigs.k8s.io/cluster-name label will be ignored.
Create the Subnet objects tagged with the ipam/SVC-dhcp-range label.
Caution
For cluster-specific subnets, create Subnet objects in the
same namespace as the related Cluster object project. For shared
subnets, create Subnet objects in the default namespace.
Setting of custom nameservers in the DHCP subnet is not supported.
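The following extract is a minimal sketch of such a Subnet object, assuming the
ipam.mirantis.com/v1alpha1 Subnet API with the cidr, gateway, and
includeRanges fields; replace the name, namespace, and addresses with values
that match your PXE segment and verify the exact schema against the Container
Cloud API Reference:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: dhcp-rack-1
  namespace: default
  labels:
    ipam/SVC-dhcp-range: "1"
    kaas.mirantis.com/provider: baremetal
spec:
  cidr: 10.20.30.0/24
  gateway: 10.20.30.1
  includeRanges:
    - 10.20.30.100-10.20.30.200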
After you create the above Subnet object, the provided data is used to
render the Dnsmasq object that configures the dnsmasq deployment.
You do not have to manually edit the Dnsmasq object.
Verify that the changes are applied to the Dnsmasq object:
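For example, assuming the Dnsmasq object resides in the kaas namespace of the
management cluster (a sketch; the object name and namespace may differ in your
deployment):
kubectl -n kaas get dnsmasq -o yaml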
For servers to access the DHCP server across the L2 segment boundaries, for
example, from another rack with a different VLAN for PXE network, you must
configure DHCP relay (agent) service on the border switch of the segment. For
example, on a top-of-rack (ToR) or leaf (distribution) switch, depending on the
data center network topology.
Warning
To ensure predictable routing for the relay of DHCP packets,
Mirantis strongly advises against the use of chained DHCP relay
configurations. This precaution limits the number of hops for DHCP packets,
with an optimal scenario being a single hop.
This approach is justified by the unpredictable nature of chained relay
configurations and potential incompatibilities between software and
hardware relay implementations.
The dnsmasq server listens on the PXE network of the management
cluster by using the dhcp-lb Kubernetes Service.
To configure the DHCP relay service, specify the external address of the
dhcp-lb Kubernetes Service as an upstream address for the relayed DHCP
requests, which is the IP helper address for DHCP. The dnsmasq
deployment behind this service accepts only relayed DHCP requests.
To obtain the actual IP address issued to the dhcp-lb Kubernetes Service:
kubectl -n kaas get service dhcp-lb
Migration of DHCP configuration for existing management clusters¶
Note
This section applies only to existing management clusters that
were created before Container Cloud 2.24.0 (Cluster release 14.0.0).
Caution
Since Container Cloud 2.24.0, you can only remove the deprecated
dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers,
and dnsmasq.dhcp_dns_servers values from the cluster spec.
The Admission Controller does not accept any other changes in these values.
This configuration is completely superseded by the Subnet object.
The DHCP configuration was automatically migrated from the cluster spec to
Subnet objects during the cluster upgrade to Container Cloud 2.21.0 (Cluster
release 11.5.0).
To remove the deprecated dnsmasq parameters from the cluster spec:
Open the management cluster spec for editing.
In the baremetal-operator release values, remove the
dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers,
and dnsmasq.dhcp_dns_servers parameters. For example:
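The extract below is a sketch of the baremetal-operator release values with
the deprecated keys that must be deleted; keep any other dnsmasq values
intact, and note that the exact location of the helmReleases list within your
Cluster spec may differ:
helmReleases:
  - name: baremetal-operator
    values:
      dnsmasq:
        # Delete the following deprecated keys:
        dhcp_range: <...>
        dhcp_ranges: <...>
        dhcp_routers: <...>
        dhcp_dns_servers: <...>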
The dnsmasq.dhcp_<name> parameters of the
baremetal-operator Helm chart values in the Cluster spec are
deprecated since the Cluster release 11.5.0 and removed in the Cluster
release 14.0.0.
Ensure that the required DHCP ranges and options are set in the Subnet
objects. For configuration details, see Configure DHCP ranges for dnsmasq.
The dnsmasq configuration options dhcp-option=3 and dhcp-option=6
are absent in the default configuration. So, by default, dnsmasq
will send the DNS server and default route to DHCP clients as defined in the
dnsmasq official documentation:
The netmask and broadcast address are the same as on the host running
dnsmasq.
The DNS server and default route are set to the address of the host running
dnsmasq.
If the domain name option is set, this name is sent to DHCP clients.
Available since MCC 2.26.0 (Cluster release 16.1.0)
This section instructs you on how to enable the dynamic IP allocation feature
to increase the number of bare metal hosts that can be provisioned in parallel
on managed clusters.
Using this feature, you can effortlessly deploy a large managed cluster by
provisioning up to 100 hosts simultaneously. In addition to dynamic
IP allocation, this feature disables the ping check in the DHCP server.
Therefore, if you plan to deploy large managed clusters, enable this feature
during the management cluster bootstrap.
Set a custom external IP address for the DHCP service¶
Available since MCC 2.25.0 (Cluster release 16.0.0)
This section instructs you on how to set a custom external IP address for
the dhcp-lb service so that it remains the same during management cluster
upgrades and other LCM operations.
A change of the dhcp-lb service address may require changing the
configuration of DHCP relays on ToR switches.
The procedure described below allows you to avoid such unwanted changes.
This configuration makes sense when you use multiple DHCP address ranges
on your deployment. See Configure multiple DHCP address ranges for details.
To set a custom external IP address for the dhcp-lb service:
In the Cluster object of the management cluster, modify the
configuration of the baremetal-operator release by setting
dnsmasq.dedicated_udp_service_address_pool to true:
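For example, a sketch of the baremetal-operator release values (the location
of the helmReleases list within the Cluster spec may differ in your
deployment):
helmReleases:
  - name: baremetal-operator
    values:
      dnsmasq:
        dedicated_udp_service_address_pool: true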
In the MetalLBConfig object of the management cluster, modify the
ipAddressPools object list by adding the dhcp-lb object and the
serviceAllocation parameters for the default object:
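The extract below is a sketch only: the address ranges are placeholders and
the serviceAllocation selector for the default pool is a hypothetical example;
verify the exact MetalLBConfig schema against the Container Cloud API
Reference:
spec:
  ipAddressPools:
    - name: default
      spec:
        addresses:
          - 10.0.11.61-10.0.11.80
        autoAssign: true
        serviceAllocation:
          priority: 100
          serviceSelectors:
            - matchLabels:
                # hypothetical selector; restricts which Services may
                # receive addresses from the default pool
                app.kubernetes.io/managed-by: <selector-placeholder>
    - name: dhcp-lb
      spec:
        addresses:
          - 10.0.0.71/32
        autoAssign: false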
Select non-overlapping IP addresses for all the ipAddressPools that
you use: default, services-pxe, and dhcp-lb.
In the MetalLBConfig object of the management cluster, modify the
l2Advertisements object list by adding dhcp-lb to the
ipAddressPools section in the pxe object spec:
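A sketch of the corresponding l2Advertisements extract; the L2Advertisement
object name and the interface list are deployment-specific:
spec:
  l2Advertisements:
    - name: pxe
      spec:
        ipAddressPools:
          - services-pxe
          - dhcp-lb
        interfaces:
          # interface connected to the PXE network; placeholder
          - <pxe-interface-name>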
Note
A cluster may have a different L2Advertisement object name
instead of pxe.
Consider this section as part of the Bootstrap v2
CLI procedure.
During creation of a management cluster, you can configure optional cluster
settings using the Container Cloud API by modifying cluster.yaml.template.
To configure optional cluster settings:
Technology Preview. Enable custom host names for cluster machines.
When enabled, any machine host name in a particular region matches the related
Machine object name. For example, instead of the default
kaas-node-<UID>, a machine host name will be master-0. The custom
naming format is more convenient and easier to operate with.
Configuration for custom host names on the management cluster and its future
managed clusters
In cluster.yaml.template, find the
spec.providerSpec.value.kaas.regional.helmReleases.name:baremetal-provider section.
Under values.config, add customHostnamesEnabled:true:
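For example, a sketch of the relevant extract; the surrounding structure of
the kaas.regional list may differ slightly in your template:
spec:
  providerSpec:
    value:
      kaas:
        regional:
          - provider: baremetal
            helmReleases:
              - name: baremetal-provider
                values:
                  config:
                    customHostnamesEnabled: true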
Optional. Configure the Linux audit daemon (auditd) using the following
parameters:
enabled
Boolean, default - false. Enables the auditd role to install the
auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.
enabledAtBoot
Boolean, default - false. Configures grub to audit processes that can
be audited even if they start up prior to auditd startup. CIS rule:
4.1.1.3.
backlogLimit
Integer, default - none. Configures the backlog to hold records. If during
boot audit=1 is configured, the backlog holds 64 records. If more than
64 records are created during boot, auditd records will be lost with a
potential malicious activity being undetected. CIS rule: 4.1.1.4.
maxLogFile
Integer, default - none. Configures the maximum size of the audit log file.
Once the log reaches the maximum size, it is rotated and a new log file is
created. CIS rule: 4.1.2.1.
maxLogFileAction
String, default - none. Defines handling of the audit log file reaching the
maximum file size. Allowed values:
keep_logs - rotate logs but never delete them
rotate - add a cron job to compress rotated log files and keep
maximum 5 compressed files.
compress - compress log files and keep them under the
/var/log/auditd/ directory. Requires
auditd_max_log_file_keep to be enabled.
CIS rule: 4.1.2.2.
maxLogFileKeep
Integer, default - 5. Defines the number of compressed log files to
keep under the /var/log/auditd/ directory. Requires
auditd_max_log_file_action=compress. CIS rules - none.
mayHaltSystem
Boolean, default - false. Halts the system when the audit logs are
full. Applies the following configuration:
space_left_action=email
action_mail_acct=root
admin_space_left_action=halt
CIS rule: 4.1.2.3.
customRules
String, default - none. Base64-encoded content of the 60-custom.rules
file for any architecture. CIS rules - none.
customRulesX32
String, default - none. Base64-encoded content of the 60-custom.rules
file for the i386 architecture. CIS rules - none.
customRulesX64
String, default - none. Base64-encoded content of the 60-custom.rules
file for the x86_64 architecture. CIS rules - none.
presetRules
String, default - none. Comma-separated list of the following built-in
preset rules:
access
actions
delete
docker
identity
immutable
logins
mac-policy
modules
mounts
perm-mod
privileged
scope
session
system-locale
time-change
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0) in the
Technology Preview scope, you can collect some of the preset rules
indicated above as groups and use them in presetRules:
ubuntu-cis-rules - this group contains rules to comply with the
Ubuntu CIS Benchmark recommendations, including the following CIS Ubuntu
20.04 v2.0.1 rules:
scope - 5.2.3.1
actions - same as 5.2.3.2
time-change - 5.2.3.4
system-locale - 5.2.3.5
privileged - 5.2.3.6
access - 5.2.3.7
identity - 5.2.3.8
perm-mod - 5.2.3.9
mounts - 5.2.3.10
session - 5.2.3.11
logins - 5.2.3.12
delete - 5.2.3.13
mac-policy - 5.2.3.14
modules - 5.2.3.19
docker-cis-rules - this group contains rules to comply with
the Docker CIS Benchmark recommendations, including the docker rule
that covers the Docker CIS v1.6.0 rules 1.1.3 - 1.1.18.
You can also use two additional keywords inside presetRules:
none - select no built-in rules.
all - select all built-in rules. When using this keyword, you can add
the ! prefix to a rule name to exclude some rules. You can use the
! prefix for rules only if you add the all keyword as the
first rule. Place a rule with the ! prefix only after
the all keyword.
Example configurations:
presetRules:none - disable all preset rules
presetRules:docker - enable only the docker rules
presetRules:access,actions,logins - enable only the
access, actions, and logins rules
presetRules:ubuntu-cis-rules - enable all rules from the
ubuntu-cis-rules group
presetRules:docker-cis-rules,actions - enable all rules from
the docker-cis-rules group and the actions rule
presetRules:all - enable all preset rules
presetRules:all,!immutable,!session - enable all preset
rules except immutable and session
Configure the NTP server. NTP is enabled by default; you can disable it,
which stops Container Cloud from managing the chrony configuration so that
you can use your own system for chrony management. Otherwise, configure the
regional NTP server parameters as described below.
NTP configuration
Configure the regional NTP server parameters to be applied to all machines
of managed clusters.
In cluster.yaml.template or the Cluster object, add the
ntp:servers section with the list of required server names:
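For example, a sketch assuming the ntp section resides under
spec:providerSpec:value (the server names are placeholders):
spec:
  providerSpec:
    value:
      ntp:
        servers:
          - 0.pool.ntp.org
          - 1.pool.ntp.org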
Applies since Container Cloud 2.26.0 (Cluster release 16.1.0). If you plan
to deploy large managed clusters, enable dynamic IP allocation to increase
the number of bare metal hosts provisioned in parallel.
For details, see Enable dynamic IP allocation.
Now, proceed with completing the bootstrap process using the Container Cloud
Bootstrap API as described in Deploy a management cluster.
Now, you can proceed with operating your management cluster through the
Container Cloud web UI and deploying MOSK clusters as
described in Operations Guide.
Required. Comma-separated list of roles to assign to the user.
If you run the command without the --namespace flag,
you can assign the following roles:
global-admin - read and write access for global role bindings
writer - read and write access
reader - view access
operator - create and manage access to the BareMetalHost
and BareMetalHostInventory (since Container Cloud 2.29.1,
Cluster release 16.4.1) objects
management-admin - full access to the management cluster,
available since Container Cloud 2.25.0 (Cluster release 16.0.0)
If you run the command for a specific project using the
--namespace flag, you can assign the following roles:
operator or writer - read and write access
user or reader - view access
member - read and write access (excluding IAM objects)
bm-pool-operator - create and manage access to the
BareMetalHost and BareMetalHostInventory (since Container
Cloud 2.29.1, Cluster release 16.4.1) objects
--kubeconfig
Required. Path to the management cluster kubeconfig generated during
the management cluster bootstrap.
--namespace
Optional. Name of the Container Cloud project where the user will be
created. If not set, a global user will be created for all Container
Cloud projects with the corresponding role access to view or manage
all public objects.
--password-stdin
Optional. Flag to provide the user password through stdin:
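For example, a hypothetical invocation; the container-cloud binary name and
the bootstrap user add subcommand are assumptions, so use the user-management
command provided with your bootstrap tooling:
echo '<user-password>' | ./container-cloud bootstrap user add \
  --username <user-name> \
  --roles writer \
  --kubeconfig <path-to-management-cluster-kubeconfig> \
  --password-stdin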
For MOSK clusters, the feature is generally
available since MOSK 23.1.
While bootstrapping a Container Cloud management cluster using proxy, you may
require Internet access to go through a man-in-the-middle (MITM) proxy. Such
configuration requires that you enable streaming and install a CA certificate
on a bootstrap node.
This section describes how to configure authentication for a management
cluster depending on the external identity provider type integrated into
your deployment.
If you integrate LDAP for IAM to Mirantis OpenStack for Kubernetes, add the required LDAP
configuration to cluster.yaml.template during the management cluster
bootstrap.
Note
The example below defines the recommended non-anonymous
authentication type. If you require anonymous authentication, replace the
following parameters with authType: "none":
authType:"simple"bindCredential:""bindDn:""
To configure LDAP for IAM:
Open templates/bm/cluster.yaml.template.
Configure the keycloak:userFederation:providers:
and keycloak:userFederation:mappers: sections as required:
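The extract below is a structural sketch only, not a drop-in configuration:
the provider and mapper fields shown are standard Keycloak LDAP federation
settings, but the exact set of keys required by your deployment and LDAP
system may differ:
keycloak:
  userFederation:
    providers:
      - displayName: "<LDAP_NAME>"
        providerName: "ldap"
        priority: 1
        config:
          connectionUrl: "ldap://<LDAP_SERVER>:389"
          usersDn: "ou=users,dc=example,dc=com"
          usernameLDAPAttribute: "uid"
          rdnLDAPAttribute: "uid"
          uuidLDAPAttribute: "uid"
          userObjectClasses: "inetOrgPerson,organizationalPerson"
          editMode: "READ_ONLY"
          authType: "simple"
          bindDn: "<BIND_DN>"
          bindCredential: "<BIND_PASSWORD>"
    mappers:
      - name: "username"
        federationMapperType: "user-attribute-ldap-mapper"
        federationProviderDisplayName: "<LDAP_NAME>"
        config:
          ldap.attribute: "uid"
          user.model.attribute: "username"
          is.mandatory.in.ldap: "true"
          read.only: "true"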
Verify that the userFederation section is located on the same level as
the initUsers section.
Verify that all attributes set in the mappers section are defined for
users in the specified LDAP system. Missing attributes may cause
authorization issues.
Now, return to the bootstrap instruction for your management cluster.
The instruction below applies to the DNS-based management
clusters. If you bootstrap a non-DNS-based management cluster, configure
Google OAuth IdP for Keycloak after bootstrap using the
official Keycloak documentation.
If you integrate Google OAuth external identity provider for IAM to
Mirantis OpenStack for Kubernetes, create the authorization credentials for IAM in your
Google OAuth account and configure cluster.yaml.template during the
bootstrap of the management cluster.
In the APIs Credentials menu, select
OAuth client ID.
In the window that opens:
In the Application type menu, select
Web application.
In the Authorized redirect URIs field, type in
<keycloak-url>/auth/realms/iam/broker/google/endpoint,
where <keycloak-url> is the corresponding DNS address.
Press Enter to add the URI.
Click Create.
A page with your client ID and client secret opens. Save these
credentials for further usage.
Log in to the bootstrap node.
Open templates/bm/cluster.yaml.template.
In the keycloak:externalIdP: section, add the following snippet with
your credentials created in previous steps:
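A sketch of such a snippet; the google key structure under
keycloak:externalIdP: is an assumption based on the provider alias used in
the redirect URI, and the client ID and secret are the values you saved
earlier:
keycloak:
  externalIdP:
    google:
      enabled: true
      config:
        clientId: "<your-client-id>.apps.googleusercontent.com"
        clientSecret: "<your-client-secret>"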
After bootstrapping your baremetal-based Mirantis Container Cloud
management cluster, you can create a baremetal-based managed cluster
to deploy Mirantis OpenStack for Kubernetes using the Container Cloud API.
The procedure below applies only to the Container Cloud web UI
users with the m:kaas@global-admin or m:kaas@writer access role
assigned by the infrastructure operator.
The default project (Kubernetes namespace) is dedicated for management
clusters only. MOSK clusters require a separate project.
You can create as many projects as required by your company infrastructure.
To create a project for MOSK clusters:
Log in to the Container Cloud web UI as m:kaas@global-admin or
m:kaas@writer.
In the Projects tab, click Create.
Type the new project name.
Click Create.
Note
Due to the known issue 50168, access to the
newly created project becomes available in five minutes after project
creation.
Before creating a bare metal managed cluster, add the required number
of bare metal hosts either using the Container Cloud web UI for a default
configuration or using CLI for an advanced configuration.
You can view the created profiles in the
BM Host Profiles tab of the Container Cloud web UI.
Log in to the Container Cloud web UI with the m:kaas@operator or
m:kaas:namespace@bm-pool-operator permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
Caution
Do not create a MOSK cluster in the default project
(Kubernetes namespace), which is dedicated for the management cluster only.
If no projects are defined, first create a new mosk project as described
in Create a project for MOSK clusters.
Optional. Available since Container Cloud 2.24.0 (Cluster releases 15.0.1
and 14.0.1). In the Credentials tab, click
Add Credential and add the IPMI user name and password of the
bare metal host to access the Baseboard Management Controller (BMC).
Select one of the following options:
Since Container Cloud 2.26.0 (17.1.0 and 16.1.0)
In the Baremetal tab, click Create Host.
Fill out the Create baremetal host form as required:
Name
Specify the name of the new bare metal host.
Boot Mode
Specify the BIOS boot mode. Available options: Legacy,
UEFI, or UEFISecureBoot.
MAC Address
Specify the MAC address of the PXE network interface.
Baseboard Management Controller (BMC)
Specify the following BMC details:
IP Address
Specify the IP address to access the BMC.
Credential Name
Specify the name of the previously added bare metal host
credentials to associate with the current host.
Cert Validation
Enable validation of the BMC API certificate. Applies only to the
redfish+http BMC protocol. Disabled by default.
Power off host after creation
Experimental. Select to power off the bare metal host after
creation.
Caution
This option is experimental and intended only for
testing and evaluation purposes. Do not use it for
production deployments.
Before MCC 2.26.0 (17.0.0, 16.0.0, or earlier)
In the Baremetal tab, click Add BM host.
Fill out the Add new BM host form as required:
Baremetal host name
Specify the name of the new bare metal host.
Provider Credential
Optional. Available since Container Cloud 2.24.0 (Cluster releases
15.0.1 and 14.0.1). Specify the name of the previously added bare
metal host credentials to associate with the current host.
Add New Credential
Optional. Available since Container Cloud 2.24.0 (Cluster releases
15.0.1 and 14.0.1). Applies if you did not add bare metal host
credentials using the Credentials tab. Add the bare metal
host credentials:
Username
Specify the name of the IPMI user to access the BMC.
Password
Specify the IPMI password of the user to access the BMC.
Boot MAC address
Specify the MAC address of the PXE network interface.
IP Address
Specify the IP address to access the BMC.
Label
Assign the machine label to the new host that defines which type of
machine may be deployed on this bare metal host. Only one label can
be assigned to a host. The supported labels include:
Manager
This label is selected and set by default. Assign this label to
the bare metal hosts that can be used to deploy machines with the
manager type. These hosts must match the CPU and RAM
requirements described in MOSK cluster hardware requirements.
Worker
The host with this label may be used to deploy the worker
machine type. Assign this label to the bare metal hosts that have
sufficient CPU and RAM resources, as described in
MOSK cluster hardware requirements.
Storage
Assign this label to the bare metal hosts that have sufficient
storage devices to match MOSK cluster hardware requirements. Hosts with this
label will be used to deploy machines with the storage type that
run Ceph OSDs.
Click Create.
While adding the bare metal host, Container Cloud discovers and inspects
the hardware of the bare metal host and adds it to BareMetalHost.status
for future references.
During provisioning, baremetal-operator inspects the bare metal host
and moves it to the Preparing state. The host becomes ready to be linked
to a bare metal machine.
Verify the results of the hardware inspection to avoid unexpected errors
during the host usage:
Select one of the following options:
Since MCC 2.26.0 (17.1.0 and 16.1.0)
In the left sidebar, click Baremetal. The
Hosts page opens.
Before MCC 2.26.0 (17.0.0, 16.0.0, or earlier)
In the left sidebar, click BM Hosts.
Verify that the bare metal host is registered and switched to one of the
following statuses:
Preparing for a newly added host
Ready for a previously used host or for a host that is
already linked to a machine
Select one of the following options:
Since MCC 2.26.0 (17.1.0 and 16.1.0)
On the Hosts page, click the host kebab menu and select
Host info.
Before MCC 2.26.0 (17.0.0, 16.0.0, or earlier)
On the BM Hosts page, click the name of the newly added
bare metal host.
In the window with the host details, scroll down to the
Hardware section.
Review the section and make sure that the number and models
of disks, network interface cards, and CPUs match the hardware
specification of the server.
If the hardware details are consistent with the physical server
specifications for all your hosts, proceed to
Create a MOSK cluster.
If you find any discrepancies in the hardware inspection results,
it might indicate that the server has hardware issues or
is not compatible with Container Cloud.
In the metadata section, add a unique credentials name and the
name of the non-default project (namespace) dedicated for the
managed cluster being created.
In the spec section, add the IPMI user name and password in plain
text to access the Baseboard Management Controller (BMC). The password
will not be stored in the BareMetalHostCredential object but will
be erased and saved in an underlying Secret object.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
MOSK 22.4 and earlier
Create a secret YAML file that describes
the unique credentials of the new bare metal host.
Example of the bare metal host secret:
In the data section, add the IPMI user name and password in the
base64 encoding to access the BMC. To obtain the base64-encoded
credentials, you can use the following command in your Linux console:
echo -n <username|password> | base64
Caution
Each bare metal host must have a unique Secret.
In the metadata section, add the unique name of credentials and
the name of the non-default project (namespace) dedicated for
the managed cluster being created.
Apply this secret YAML file to your deployment:
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
Create a YAML file that contains a description of the new bare metal host:
Since the management cluster update to 16.4.0 (MCC 2.29.0)
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
apiVersion: kaas.mirantis.com/v1alpha1
kind: BareMetalHostInventory
metadata:
  annotations:
    inspect.metal3.io/hardwaredetails-storage-sort-term: hctl ASC, wwn ASC, by_id ASC, name ASC
  labels:
    kaas.mirantis.com/baremetalhost-id: <unique-bare-metal-host-hardware-node-id>
    kaas.mirantis.com/provider: baremetal
  name: <bare-metal-host-unique-name>
  namespace: <managed-cluster-project-name>
spec:
  bmc:
    address: <ip-address-for-bmc-access>
    bmhCredentialsName: <bare-metal-host-credential-unique-name>
  bootMACAddress: <bare-metal-host-boot-mac-address>
  online: true
Note
If you have a limited amount of free and unused IP addresses
for server provisioning, you can add the
baremetalhost.metal3.io/detached annotation that pauses automatic
host management to manually allocate an IP address for the host. For
details, see Manually allocate IP addresses for bare metal hosts.
Before the management cluster update to 16.4.0 (MCC 2.29.0)
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
If you have a limited amount of free and unused IP addresses
for server provisioning, you can add the
baremetalhost.metal3.io/detached annotation that pauses automatic
host management to manually allocate an IP address for the host. For
details, see Manually allocate IP addresses for bare metal hosts.
During provisioning, baremetal-operator inspects the bare metal host
and moves it to the Preparing state. The host becomes ready to be linked
to a bare metal machine.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes,
wait until the dnsmasq pod becomes ready after the change, and only
then create the bare metal host objects as described above.
During provisioning, the status changes as follows:
registering
inspecting
preparing
After the bare metal host object switches to the preparing stage, the
inspecting phase finishes and you can verify that hardware information
is available in the object status and matches the MOSK cluster hardware requirements.
For example:
The bare metal host profile is a Kubernetes custom resource. It enables
the operator to define how the storage devices and the operating system are
provisioned and configured.
This section describes the bare metal host profile default settings and
configuration of custom profiles for managed clusters using Container Cloud
API. The section also applies to a management cluster with a few differences
described in Customize the default bare metal host profile.
The default host profile requires three storage devices
in the following strict order:
Boot device and operating system storage
This device contains boot data and operating system data. It
is partitioned using the GUID Partition Table (GPT) labels.
The root file system is an ext4 file system
created on top of an LVM logical volume.
For a detailed layout, refer to the table below.
Local volumes device
This device contains an ext4 file system with directories mounted
as persistent volumes to Kubernetes. These volumes are used by
the Mirantis Container Cloud services to store its data,
including monitoring and identity databases.
Ceph storage device
This device is used as a Ceph datastore or Ceph OSD on managed clusters.
The following table summarizes the default configuration of the host system
storage set up by the Container Cloud bare metal management.
Default configuration of the bare metal host storage¶
Device/partition
Name/Mount point
Recommended size
Description
/dev/sda1
bios_grub
4 MiB
The mandatory GRUB boot partition required for non-UEFI systems.
/dev/sda2
UEFI -> /boot/efi
0.2 GiB
The boot partition required for the UEFI boot mode.
/dev/sda3
config-2
64 MiB
The mandatory partition for the cloud-init configuration.
Used during the first host boot for initial configuration.
/dev/sda4
lvm_root_part
100% of the remaining free space in the LVM volume group
The main LVM physical volume that is used to create the root file system.
/dev/sdb
lvm_lvp_part -> /mnt/local-volumes
100% of the remaining free space in the LVM volume group
The LVM physical volume that is used to create the file system
for LocalVolumeProvisioner.
/dev/sdc
-
100% of the remaining free space in the LVM volume group
Clean raw disk that will be used for the Ceph storage backend on
managed clusters.
If required, you can customize the default host storage configuration.
For details, see Create MOSK host profiles.
Before deploying a cluster, you may need to erase existing data from hardware
devices to be used for deployment. You can either erase an existing partition
or remove all existing partitions from a physical device. For this purpose,
use the wipeDevice structure that configures cleanup behavior during
configuration of a custom bare metal host profile described in
Create MOSK host profiles.
The wipeDevice structure contains the following options:
When you enable the eraseMetadata option, which is disabled by default,
the Ansible provisioner attempts to clean up the existing metadata from
the target device. Examples of metadata include:
Existing file system
Logical Volume Manager (LVM) or Redundant Array of Independent Disks (RAID)
configuration
The behavior of metadata erasure varies depending on the target device:
If a device is part of other logical devices, for example, a partition,
logical volume, or MD RAID volume, such logical device is disassembled and
its file system metadata is erased. On the final erasure step,
the file system metadata of the target device is erased as well.
If a device is a physical disk, then all its nested partitions along with
their nested logical devices, if any, are erased and disassembled.
On the final erasure step, all partitions and metadata of the target device
are removed.
Caution
None of the eraseMetadata actions include overwriting the
target device with data patterns. For this purpose, use the eraseDevice
option as described in Erase a device.
To enable the eraseMetadata option, use the wipeDevice field in the
spec:devices section of the BareMetalHostProfile object. For a
detailed description of the option, see Container Cloud API Reference:
BareMetalHostProfile.
If you require not only disassembling of existing logical volumes but also
removing of all data ever written to the target device, configure the
eraseDevice option, which is disabled by default. This option is not
applicable to partitions, LVM, or MD RAID logical volumes because such volumes
may use caching that prevents a physical device from being erased properly.
Important
The eraseDevice option does not replace the secure erase.
To configure the eraseDevice option, use the wipeDevice field in the
spec:devices section of the BareMetalHostProfile object. For a
detailed description of the option, see Container Cloud API Reference:
BareMetalHostProfile.
Different types of MOSK nodes require differently
configured host storage. This section describes how to create custom
host profiles for different types of MOSK nodes.
You can create custom profiles for managed clusters using Container
Cloud API.
Note
The procedure below also applies to management clusters.
You can use flexible size units throughout bare metal host profiles.
For example, you can now use either sizeGiB:0.1 or size:100Mi
when specifying a device size.
Mirantis recommends using only one parameter name type and units throughout
the configuration files. If both sizeGiB and size are used,
sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size
without units is counted in bytes. For example, size:120 means 120 bytes.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
To create MOSK bare metal host profiles:
Select from the following options:
For a management cluster, log in to the bare metal seed node that will be
used to bootstrap the management cluster.
For a managed cluster, log in to the local machine where your management
cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created automatically
during the last stage of the management cluster bootstrap.
Select from the following options:
For a management cluster, open
templates/bm/baremetalhostprofiles.yaml.template for editing.
For a managed cluster, create a new bare metal host profile for
MOSK compute nodes in a YAML file under the
templates/bm/ directory.
Edit the host profile using the example template below to meet
your hardware configuration requirements:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  name: <PROFILE_NAME>
  namespace: <PROJECT_NAME>
spec:
  devices:
    # From the HW node, obtain the first device, which size is at least 60Gib
    - device:
        workBy: "by_id,by_wwn,by_path,by_name"
        minSize: 60Gi
        type: ssd
        wipe: true
      partitions:
        - name: bios_grub
          partflags:
            - bios_grub
          size: 4Mi
          wipe: true
        - name: uefi
          partflags:
            - esp
          size: 200Mi
          wipe: true
        - name: config-2
          size: 64Mi
          wipe: true
        # This partition is only required on compute nodes if you plan to
        # use LVM ephemeral storage.
        - name: lvm_nova_part
          wipe: true
          size: 100Gi
        - name: lvm_root_part
          size: 0
          wipe: true
    # From the HW node, obtain the second device, which size is at least 60Gib
    # If a device exists but does not fit the size,
    # the BareMetalHostProfile will not be applied to the node
    - device:
        workBy: "by_id,by_wwn,by_path,by_name"
        minSize: 60Gi
        type: ssd
        wipe: true
    # From the HW node, obtain the disk device with the exact name
    - device:
        workBy: "by_id,by_wwn,by_path,by_name"
        minSize: 60Gi
        wipe: true
      partitions:
        - name: lvm_lvp_part
          size: 0
          wipe: true
    # Example of wiping a device w\o partitioning it.
    # Mandatory for the case when a disk is supposed to be used for Ceph backend
    # later
    - device:
        workBy: "by_id,by_wwn,by_path,by_name"
        wipe: true
  fileSystems:
    - fileSystem: vfat
      partition: config-2
    - fileSystem: vfat
      mountPoint: /boot/efi
      partition: uefi
    - fileSystem: ext4
      logicalVolume: root
      mountPoint: /
    - fileSystem: ext4
      logicalVolume: lvp
      mountPoint: /mnt/local-volumes/
  logicalVolumes:
    - name: root
      size: 0
      vg: lvm_root
    - name: lvp
      size: 0
      vg: lvm_lvp
  postDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
  volumeGroups:
    - devices:
        - partition: lvm_root_part
      name: lvm_root
    - devices:
        - partition: lvm_lvp_part
      name: lvm_lvp
  grubConfig:
    defaultGrubOptions:
      - GRUB_DISABLE_RECOVERY="true"
      - GRUB_PRELOAD_MODULES=lvm
      - GRUB_TIMEOUT=20
  kernelParameters:
    sysctl:
      # For the list of options prohibited to change, refer to
      # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
      kernel.dmesg_restrict: "1"
      kernel.core_uses_pid: "1"
      fs.file-max: "9223372036854775807"
      fs.aio-max-nr: "1048576"
      fs.inotify.max_user_instances: "4096"
      vm.max_map_count: "262144"
If asymmetric traffic is expected on some of the managed cluster
nodes, enable the loose mode for the corresponding interfaces on those
nodes by setting the net.ipv4.conf.<interface-name>.rp_filter
parameter to "2" in the kernelParameters.sysctl section.
For example:
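A minimal sketch, where bond0 stands for the affected interface name:
kernelParameters:
  sysctl:
    net.ipv4.conf.bond0.rp_filter: "2"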
Optional. Configure wiping of the target device or partition to be used
for cluster deployment as described in Wipe a device or partition.
Optional. Configure multiple devices for LVM volume using the example
template extract below for reference.
Caution
The following template extract contains only sections relevant
to LVM configuration with multiple PVs. Expand the main template
described in the previous step with the configuration below if required.
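A sketch of such an extract that builds one volume group from partitions on
two physical devices; the device selectors, sizes, and names are placeholders:
devices:
  - device:
      workBy: "by_id,by_wwn,by_path,by_name"
      minSize: 200Gi
      wipe: true
    partitions:
      - name: lvm_lvp_part1
        size: 0
        wipe: true
  - device:
      workBy: "by_id,by_wwn,by_path,by_name"
      minSize: 200Gi
      wipe: true
    partitions:
      - name: lvm_lvp_part2
        size: 0
        wipe: true
volumeGroups:
  - name: lvm_lvp
    devices:
      - partition: lvm_lvp_part1
      - partition: lvm_lvp_part2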
Optional. Technology Preview. Configure support of the Redundant Array of
Independent Disks (RAID) that allows, for example, installing a cluster
operating system on a RAID device. For details, refer to Configure RAID support.
Optional. Configure the RX/TX buffer size for physical network interfaces
and txqueuelen for any network interfaces.
This configuration can greatly benefit high-load and high-performance
network interfaces. You can configure these parameters using the udev
rules. For example:
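For example, a sketch of udev rules that could be placed under
/etc/udev/rules.d/, for instance by the postDeployScript of the host profile;
the interface match patterns and values are examples only:
# /etc/udev/rules.d/70-net-tuning.rules
# Increase RX/TX ring buffers on physical interfaces
ACTION=="add", SUBSYSTEM=="net", KERNEL=="eno*", RUN+="/usr/sbin/ethtool -G %k rx 4096 tx 4096"
# Increase txqueuelen on bond interfaces
ACTION=="add", SUBSYSTEM=="net", KERNEL=="bond*", RUN+="/usr/sbin/ip link set %k txqueuelen 10000"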
For a management cluster, proceed with the cluster bootstrap procedure
as described in Deploy a management cluster.
For a managed cluster, select from the following options:
Using the Container Cloud web UI
Available since MCC 2.26.0 (17.1.0 and 16.1.0)
Log in to the Container Cloud web UI with the operator
permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
Caution
Do not create a MOSK cluster in the default project
(Kubernetes namespace), which is dedicated for the management cluster only.
If no projects are defined, first create a new mosk project as described
in Create a project for MOSK clusters.
In the left sidebar, navigate to Baremetal and click
the Host Profiles tab.
Click Create Host Profile.
Fill out the Create host profile form:
Name
Name of the bare metal host profile.
Specification
BareMetalHostProfile object specification in the YAML format
that you have previously created. Click Edit to edit
the BareMetalHostProfile object if required.
Note
Before Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0), the field name is YAML file, and you
can upload the required YAML file instead of inserting and
editing it.
Labels
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). Key-value pairs attached to BareMetalHostProfile.
Using the Container Cloud API
Add the bare metal host profile to your management cluster:
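For example, assuming the profile is saved as
templates/bm/compute-hostprofile.yaml (a hypothetical file name):
kubectl --kubeconfig <path-to-management-cluster-kubeconfig> create -f templates/bm/compute-hostprofile.yaml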
Create a volume group on top of the defined partition and create the
required number of logical volumes (LVs) on top of the created volume
group (VG). Add one logical volume per one Ceph OSD on the node.
Example snippet of an LVM configuration for a Ceph metadata disk:
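A sketch with two metadata LVs on a dedicated volume group; the device
selector, sizes, and the bluedb volume group name are placeholders chosen to
match the /dev/bluedb/meta_* paths mentioned below:
devices:
  - device:
      workBy: "by_id,by_wwn,by_path,by_name"
      minSize: 30Gi
      wipe: true
    partitions:
      - name: ceph_meta_part
        size: 0
        wipe: true
volumeGroups:
  - name: bluedb
    devices:
      - partition: ceph_meta_part
logicalVolumes:
  # one LV per Ceph OSD that uses this disk as a metadata device
  - name: meta_1
    size: 4Gi
    vg: bluedb
  - name: meta_2
    size: 4Gi
    vg: bluedb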
Plan LVs of a separate metadata device thoroughly.
Any logical volume misconfiguration causes redeployment of all
Ceph OSDs that use this disk as metadata devices.
Note
The general Ceph recommendation is to size a metadata device at
1% to 4% of the Ceph OSD data size. Mirantis highly
recommends at least 4% of the Ceph OSD data size.
If you plan to use a disk as a separate metadata device for 10 Ceph
OSDs, define the size of an LV for each Ceph OSD as 1% to
4% of the corresponding Ceph OSD data size. If RADOS Gateway is
enabled, the minimum metadata size must be 4%. For details, see
Ceph documentation: Bluestore config reference.
For example, if the total data size of 10 Ceph OSDs equals 1 TB
with 100 GB each, assign a metadata disk of at least 10 GB with
1 GB per each LV. The recommended size is 40 GB with 4 GB
per each LV.
After applying BareMetalHostProfile, the bare metal provider
creates an LVM partitioning for the metadata disk and places
these volumes as /dev paths, for example, /dev/bluedb/meta_1
or /dev/bluedb/meta_3.
Example template of a host profile configuration for Ceph
The BareMetalHostProfile API allows configuring a host to use the
huge pages feature of the Linux kernel on managed clusters. The procedure
included in this section applies to both new and existing cluster
deployments.
Note
Huge pages is a mode of operation of the Linux kernel. With huge
pages enabled, the kernel allocates the RAM in bigger chunks, or pages.
This allows kernel-based virtual machines and virtual machines running
on it to use the host RAM more efficiently and improves the performance
of the virtual machines.
To enable huge pages in a custom bare metal host profile for a managed
cluster:
Log in to the local machine where your management
cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created automatically
during the last stage of the management cluster bootstrap.
Open for editing or create a new bare metal host profile
under the templates/bm/ directory.
Edit the grubConfig section of the host profile spec using
the example below to configure the kernel boot parameters and
enable huge pages:
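A sketch of such a grubConfig extract; the GRUB_CMDLINE_LINUX option is an
assumed place for the kernel boot parameters, and <N> is the number of 1 GB
pages to allocate:
grubConfig:
  defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=20
    - GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX hugepagesz=1G hugepages=<N>"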
The example configuration above allocates <N> huge pages of 1 GB each
at server boot. The last hugepagesz parameter value is used as the default
unless default_hugepagesz is defined. For details about possible values, see
official Linux kernel documentation.
Add the bare metal host profile to your management cluster:
During a management or MOSK cluster creation, you can
configure the support of the software-based Redundant Array of Independent
Disks (RAID) using BareMetalHostProfile to set up an LVM-based RAID level 1
(raid1) or an mdadm-based RAID level 0, 1, or 10 (raid0, raid1, or
raid10).
If required, you can further configure RAID in the same profile, for example,
to install a cluster operating system onto a RAID device.
Caution
RAID configuration on already provisioned bare metal machines
or on an existing cluster is not supported.
To start using any kind of RAID, you must reprovision the machines
with a new BareMetalHostProfile.
Mirantis supports the raid1 type of RAID devices both
for LVM and mdadm.
Mirantis supports the raid0 type for the mdadm RAID
to be on par with the LVM linear type.
Mirantis recommends having at least two physical disks
for raid0 and raid1 devices to prevent unnecessary
complexity.
Mirantis supports the raid10 type for mdadm RAID. At least
four physical disks are required for this type of RAID.
Only an even number of disks can be used for a raid1 or
raid10 device.
The EFI system partition partflags: ['esp'] must be
a physical partition in the main partition table of the disk, not under
LVM or mdadm software RAID.
During configuration of your custom bare metal host profile,
you can create an LVM-based software RAID device raid1 by adding
type: raid1 to the logicalVolume spec in BaremetalHostProfile.
The logicalVolume spec of the raid1 type requires at least
two devices (partitions) in volumeGroup where you build a logical
volume. For an LVM of the linear type, one device is enough.
You can use flexible size units throughout bare metal host profiles.
For example, you can now use either sizeGiB:0.1 or size:100Mi
when specifying a device size.
Mirantis recommends using only one parameter name type and units throughout
the configuration files. If both sizeGiB and size are used,
sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size
without units is counted in bytes. For example, size:120 means 120 bytes.
Note
The LVM raid1 requires additional space to store the raid1
metadata on a volume group, roughly 4 MB for each partition.
Therefore, you cannot create a logical volume of exactly the same
size as the partitions it works on.
For example, if you have two partitions of 10 GiB, the corresponding
raid1 logical volume size will be less than 10 GiB. For that
reason, you can either set size: 0 to use all available
space on the volume group, or set a smaller size than the partition
size. For example, use size: 9.9Gi instead of
size: 10Gi for the logical volume.
The following example illustrates an extract of BaremetalHostProfile
with / on the LVM raid1.
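A sketch of the relevant extract; the two lvm_root_part* partitions are
assumed to be defined on different physical disks in the devices section of
the same profile:
logicalVolumes:
  - name: root
    type: raid1
    size: 0
    vg: lvm_root
volumeGroups:
  - name: lvm_root
    devices:
      - partition: lvm_root_part1
      - partition: lvm_root_part2
fileSystems:
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /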
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
You can configure an LVM volume group on top of mdadm-based RAID devices as
physical volumes using the BareMetalHostProfile resource. List the
required RAID devices in a separate field of the volumeGroups definition
within the storage configuration of BareMetalHostProfile.
You can use flexible size units throughout bare metal host profiles.
For example, you can now use either sizeGiB:0.1 or size:100Mi
when specifying a device size.
Mirantis recommends using only one parameter name type and units throughout
the configuration files. If both sizeGiB and size are used,
sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size
without units is counted in bytes. For example, size:120 means 120 bytes.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The following example illustrates an extract of BaremetalHostProfile with
a volume group named lvm_nova to be created on top of an mdadm-based
RAID device raid1:
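A sketch only; the softRaidDevice reference inside volumeGroups and the
/dev/md0 device name are assumptions, so verify the exact field names against
the Container Cloud API Reference: BareMetalHostProfile:
softRaidDevices:
  - name: /dev/md0
    level: raid1
    devices:
      - partition: md_nova_part1
      - partition: md_nova_part2
volumeGroups:
  - name: lvm_nova
    devices:
      # assumed field name for referencing an mdadm device
      - softRaidDevice: /dev/md0
logicalVolumes:
  - name: nova
    size: 0
    vg: lvm_nova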
Create an mdadm software RAID (raid0, raid1, raid10)¶
TechPreview
Warning
The EFI system partition partflags: ['esp'] must be
a physical partition in the main partition table of the disk, not under
LVM or mdadm software RAID.
During configuration of your custom bare metal host profile as described in
Create a custom bare metal host profile, you can create an mdadm-based software RAID
device of the raid0 or raid1 type by describing the mdadm devices under the
softRaidDevices field in BaremetalHostProfile. For example:
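A minimal sketch; the partition names are placeholders, and the partitions
themselves must be defined on separate disks in the devices section of the
same profile:
softRaidDevices:
  - name: /dev/md0
    level: raid1
    devices:
      - partition: md_root_part1
      - partition: md_root_part2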
You can also use the raid10 type for the mdadm-based software RAID devices.
This type requires at least four storage devices available on your servers,
and the total number of member devices must be even. For example:
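A minimal raid10 sketch with four member partitions, one per physical disk
(names are placeholders):
softRaidDevices:
  - name: /dev/md0
    level: raid10
    devices:
      - partition: md_part1
      - partition: md_part2
      - partition: md_part3
      - partition: md_part4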
The following fields in softRaidDevices describe RAID devices:
name
Name of the RAID device to refer to throughout the baremetalhostprofile.
level
Type or level of RAID used to create a device, defaults to raid1.
Set to raid0 or raid10 to create a device of the corresponding type.
devices
List of physical devices or partitions used to build a software RAID device.
It must include at least two partitions or devices to build raid0 and
raid1 devices, and at least four for raid10.
The mdadm RAID devices cannot be created on top of LVM devices.
You can use flexible size units throughout bare metal host profiles.
For example, you can now use either sizeGiB:0.1 or size:100Mi
when specifying a device size.
Mirantis recommends using only one parameter name type and units throughout
the configuration files. If both sizeGiB and size are used,
sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size
without units is counted in bytes. For example, size:120 means 120 bytes.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The following example illustrates an extract of BaremetalHostProfile
with / on the mdadm raid1 and some data storage on raid0:
Example with / on the mdadm raid1 and data storage on raid0
The EFI system partition partflags: ['esp'] must be
a physical partition in the main partition table of the disk, not under
LVM or mdadm software RAID.
You can deploy MOSK on local software-based Redundant Array
of Independent Disks (RAID) devices to withstand failure of one device at a
time.
Using a custom bare metal host profile, you can configure and create
an mdadm-based software RAID device of type raid10 if you have
an even number of devices available on your servers. At least four
storage devices are required for such RAID device.
During configuration of your custom bare metal host profile as described in
Create a custom bare metal host profile, create an mdadm-based software RAID device
raid10 by describing the mdadm devices under the softRaidDevices
field. For example:
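A sketch with the root file system placed on a raid10 device built from four
partitions, one per physical disk; the names are placeholders, and the
softRaidDevice reference in fileSystems is an assumed field name to verify
against the BareMetalHostProfile API Reference:
softRaidDevices:
  - name: /dev/md0
    level: raid10
    devices:
      - partition: md_root_part1
      - partition: md_root_part2
      - partition: md_root_part3
      - partition: md_root_part4
fileSystems:
  - fileSystem: ext4
    # assumed field name for referencing an mdadm device
    softRaidDevice: /dev/md0
    mountPoint: /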
The following fields in softRaidDevices describe RAID devices:
name
Name of the RAID device to refer to throughout the baremetalhostprofile.
devices
List of physical devices or partitions used to build a software RAID device.
It must include at least four partitions or devices to build a raid10
device.
level
Type or level of RAID used to create a device. Set to raid10 or raid1
to create a device of the corresponding type.
When building the raid10 array on top of device partitions,
make sure that only one partition per device is used for a given array.
Although having two partitions located on the same physical device as array
members is technically possible, it may lead to data loss if
mdadm selects both partitions of the same drive to be mirrored.
In such case, redundancy against entire drive failure is lost.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
With L2 networking templates, you can create MOSK
clusters with advanced host networking configurations. For example,
you can create bond interfaces on top of physical interfaces on the
host or use multiple subnets to separate different types of network
traffic.
You can use several host-specific L2 templates per one cluster
to support different hardware configurations. For example, you can create
L2 templates with a different number and layout of NICs to be applied
to specific machines of one cluster.
You can also use multiple L2 templates to support different roles
for nodes in a MOSK installation. You can create L2
templates with different logical interfaces and assign them to individual
machines based on their roles in a MOSK cluster.
Caution
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
Since MOSK 23.2.2, in the Technology Preview scope, you can
create a MOSK cluster with the multi-rack topology,
where cluster nodes including Kubernetes masters are distributed across
multiple racks without L2 layer extension between them, and use BGP for
announcement of the cluster API load balancer address and external addresses
of Kubernetes load-balanced services.
Implementation of the multi-rack topology implies the use of Rack and
MultiRackCluster objects that support configuration of BGP announcement
of the cluster API load balancer address. For the configuration procedure,
refer to Configure BGP announcement for cluster API LB address. For configuring the BGP announcement of
external addresses of Kubernetes load-balanced services, refer to
Configure MetalLB.
Follow the procedures described in the below subsections to configure initial
settings and advanced network objects for your managed clusters.
This section instructs you on how to configure and deploy a managed cluster
that is based on the baremetal-based management cluster
through the Mirantis Container Cloud web UI.
Note
Due to the known issue 50181, creation of a
compact managed cluster or addition of any labels to the control plane nodes
is not available through the Container Cloud web UI.
To create a managed cluster on bare metal:
Available since the Cluster release 16.1.0 on the management cluster.
If you plan to deploy a large managed cluster, enable dynamic IP allocation
to increase the number of bare metal hosts that can be provisioned in parallel.
For details, see Container Cloud Deployment Guide: Enable dynamic IP
allocation.
Available since Container Cloud 2.24.0 (Cluster release 14.0.0). Optional.
Technology Preview. Enable custom host names for cluster machines.
When enabled, any machine host name in a particular region matches the related
Machine object name. For example, instead of the default
kaas-node-<UID>, a machine host name will be master-0. The custom
naming format is more convenient and easier to operate with.
Skip this step if you enabled this feature during management cluster bootstrap,
because custom host names will be automatically enabled on the related managed
cluster as well.
Log in to the Container Cloud web UI with the writer permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
Caution
Do not create a MOSK cluster in the default project
(Kubernetes namespace), which is dedicated for the management cluster only.
If no projects are defined, first create a new mosk project as described
in Create a project for MOSK clusters.
In the SSH keys tab, click Add SSH Key
to upload the public SSH key that will be used for the SSH access to VMs.
Optional. In the Proxies tab, enable proxy access
to the managed cluster:
Click Add Proxy.
In the Add New Proxy wizard, fill out the form
with the following parameters:
Optional. Technology Preview. Deprecated since Container Cloud 2.29.0
(Cluster releases 17.4.0 and 16.4.0). Available since Container Cloud
2.24.0 (Cluster release 14.0.0). Enable WireGuard for traffic encryption
on the Kubernetes workloads network.
WireGuard configuration
Ensure that the Calico MTU size is at least 60 bytes smaller than
the interface MTU size of the workload network. IPv4 WireGuard uses
a 60-byte header. For details, see Set the MTU size for Calico.
Enable WireGuard by selecting the Enable WireGuard
check box.
Caution
Changing this parameter on a running cluster causes a
downtime that can vary depending on the cluster size.
Note
This parameter was renamed from
Enable Secure Overlay to Enable WireGuard
in Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
Parallel Upgrade Of Worker Machines
Optional. Available since Container Cloud 2.25.0 (Cluster releases
17.0.0 and 16.0.0).
The maximum number of worker nodes to update simultaneously. It serves as
an upper limit on the number of machines that are drained at a given moment
of time. Defaults to 1.
You can also configure this option after deployment before
the cluster update.
Parallel Preparation For Upgrade Of Worker Machines
Optional. Available since Container Cloud 2.25.0 (Cluster releases
17.0.0 and 16.0.0)
The maximum number of worker nodes being prepared at a given moment of time,
which includes downloading of new artifacts. It serves as a limit for the
network load that can occur when downloading the files to the nodes.
Defaults to 50.
You can also configure this option after deployment before
the cluster update.
Provider
LB host IP
The IP address of the load balancer endpoint that will be used to
access the Kubernetes API of the new cluster. This IP address
must be in the LCM network if a separate LCM network is in use and
if L2 (ARP) announcement of cluster API load balancer IP is in use.
LB address range Removed in 24.3
The range of IP addresses that can be assigned to load balancers
for Kubernetes Services by MetalLB. For a more flexible MetalLB
configuration, refer to Configure MetalLB.
Note
Since MOSK 24.3, MetalLB configuration
must be added after cluster creation.
Kubernetes
Services CIDR blocks
The Kubernetes Services CIDR blocks.
For example, 10.233.0.0/18.
Pods CIDR blocks
The Kubernetes pods CIDR blocks.
For example, 10.233.64.0/18.
Note
The network subnet size of Kubernetes pods influences
the number of nodes that can be deployed in the cluster.
The default subnet size /18 is enough to create a cluster with
up to 256 nodes. Each node uses a /26 address block
(64 addresses); at least one address block is allocated per node.
These addresses are used by the Kubernetes pods with
hostNetwork:false. The cluster size may be limited further when
some nodes use more than one address block.
Configure StackLight:
Note
If StackLight is enabled in non-HA mode but Ceph is not
deployed yet, StackLight will not be installed and will be stuck in
the Yellow state waiting for a successful Ceph installation. Once
the Ceph cluster is deployed, the StackLight installation resumes.
To deploy a Ceph cluster, refer to Add a Ceph cluster.
StackLight configuration
Section
Parameter name
Description
StackLight
Enable Monitoring
Selected by default. Deselect to skip StackLight deployment.
Enable Logging
Select to deploy the StackLight logging stack. For details about the
logging components, see Deployment architecture.
Note
The logging mechanism performance depends on the cluster log
load. In case of a high load, you may need to increase the default
resource requests and limits for fluentdLogs. For details, see
StackLight resource limits.
HA Mode
Select to enable StackLight monitoring in High Availability (HA) mode.
For differences between HA and non-HA modes, see Deployment architecture.
If disabled, StackLight requires a Ceph cluster. To deploy a Ceph cluster,
refer to Add a Ceph cluster.
StackLight Default Logs Severity Level
Log severity (verbosity) level for all StackLight components.
The default value for this parameter is Default, which respects
the original log level defaults of each StackLight component.
For details about severity levels, see StackLight log verbosity.
StackLight Component Logs Severity Level
The severity level of logs for a specific StackLight component that
overrides the value of the
StackLight Default Logs Severity Level parameter.
For details about severity levels, see StackLight log verbosity.
Expand the drop-down menu for a specific component to display
its list of available log levels.
OpenSearch
Logstash Retention Time Removed in MOSK 24.1
Available if you select Enable Logging. Specifies the
logstash-* index retention time.
Events Retention Time
Available if you select Enable Logging. Specifies the
kubernetes_events-* index retention time.
Notifications Retention Time
Available if you select Enable Logging. Specifies the
notification-* index retention time.
Persistent Volume Claim Size
Available if you select Enable Logging.
The OpenSearch persistent volume claim size.
Collected Logs Severity Level
Available if you select Enable Logging.
The minimum severity of all Container Cloud components logs
collected in OpenSearch.
For details about severity levels, see StackLight logging.
Prometheus
Retention Time
The Prometheus database retention period.
Retention Size
The Prometheus database retention size.
Persistent Volume Claim Size
The Prometheus persistent volume claim size.
Enable Watchdog Alert
Select to enable the Watchdog alert that fires
as long as the entire alerting pipeline is functional.
Custom Alerts
Specify alerting rules for new custom alerts or upload a YAML file
in the following exemplary format:
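The sketch below assumes a standard Prometheus-style alerting rule; the alert name, expression, threshold, and labels are illustrative only.

- alert: ExampleAlert
  annotations:
    description: Alert description
    summary: Alert summary
  expr: example_metric > 0   # illustrative expression; replace with your own metric query
  for: 5m
  labels:
    severity: warning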
Available since Container Cloud 2.24.0 (Cluster releases 14.0.0 and 15.0.1).
Optional. Technology Preview. Enable the Linux Audit daemon auditd
to monitor activity of cluster processes and prevent potential malicious
activity.
Configuration for auditd
In the Cluster object or cluster.yaml.template, add the auditd
parameters:
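The placement below is a sketch only; the auditd parameters are expected to be nested under the bare metal provider value of the Cluster spec, but verify the exact path against the Container Cloud API Reference for your release.

spec:
  providerSpec:
    value:
      audit:                       # assumed placement; confirm for your release
        auditd:
          enabled: true
          enabledAtBoot: true
          maxLogFile: 30           # illustrative value, in MB
          maxLogFileAction: rotate
          presetRules: docker,identity,logins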
enabled
Boolean, default - false. Enables the auditd role to install the
auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.
enabledAtBoot
Boolean, default - false. Configures grub to audit processes that can
be audited even if they start up prior to auditd startup. CIS rule:
4.1.1.3.
backlogLimit
Integer, default - none. Configures the backlog to hold records. If during
boot audit=1 is configured, the backlog holds 64 records. If more than
64 records are created during boot, auditd records will be lost with a
potential malicious activity being undetected. CIS rule: 4.1.1.4.
maxLogFile
Integer, default - none. Configures the maximum size of the audit log file.
Once the log reaches the maximum size, it is rotated and a new log file is
created. CIS rule: 4.1.2.1.
maxLogFileAction
String, default - none. Defines handling of the audit log file reaching the
maximum file size. Allowed values:
keep_logs - rotate logs but never delete them
rotate - add a cron job to compress rotated log files and keep
maximum 5 compressed files.
compress - compress log files and keep them under the
/var/log/auditd/ directory. Requires
auditd_max_log_file_keep to be enabled.
CIS rule: 4.1.2.2.
maxLogFileKeep
Integer, default - 5. Defines the number of compressed log files to
keep under the /var/log/auditd/ directory. Requires
auditd_max_log_file_action=compress. CIS rules - none.
mayHaltSystem
Boolean, default - false. Halts the system when the audit logs are
full. Applies the following configuration:
space_left_action=email
action_mail_acct=root
admin_space_left_action=halt
CIS rule: 4.1.2.3.
customRules
String, default - none. Base64-encoded content of the 60-custom.rules
file for any architecture. CIS rules - none.
customRulesX32
String, default - none. Base64-encoded content of the 60-custom.rules
file for the i386 architecture. CIS rules - none.
customRulesX64
String, default - none. Base64-encoded content of the 60-custom.rules
file for the x86_64 architecture. CIS rules - none.
presetRules
String, default - none. Comma-separated list of the following built-in
preset rules:
access
actions
delete
docker
identity
immutable
logins
mac-policy
modules
mounts
perm-mod
privileged
scope
session
system-locale
time-change
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0) in the
Technology Preview scope, you can collect some of the preset rules
indicated above as groups and use them in presetRules:
ubuntu-cis-rules - this group contains rules to comply with the
Ubuntu CIS Benchmark recommendations, including the following CIS Ubuntu
20.04 v2.0.1 rules:
scope - 5.2.3.1
actions - same as 5.2.3.2
time-change - 5.2.3.4
system-locale - 5.2.3.5
privileged - 5.2.3.6
access - 5.2.3.7
identity - 5.2.3.8
perm-mod - 5.2.3.9
mounts - 5.2.3.10
session - 5.2.3.11
logins - 5.2.3.12
delete - 5.2.3.13
mac-policy - 5.2.3.14
modules - 5.2.3.19
docker-cis-rules - this group contains rules to comply with
Docker CIS Benchmark recommendations, including the Docker CIS
v1.6.0 rules 1.1.3 - 1.1.18.
You can also use two additional keywords inside presetRules:
none - select no built-in rules.
all - select all built-in rules. When using this keyword, you can add
the ! prefix to a rule name to exclude some rules. You can use the
! prefix for rules only if you add the all keyword as the
first rule. Place a rule with the ! prefix only after
the all keyword.
Example configurations:
presetRules:none - disable all preset rules
presetRules:docker - enable only the docker rules
presetRules:access,actions,logins - enable only the
access, actions, and logins rules
presetRules:ubuntu-cis-rules - enable all rules from the
ubuntu-cis-rules group
presetRules:docker-cis-rules,actions - enable all rules from
the docker-cis-rules group and the actions rule
presetRules:all - enable all preset rules
presetRules:all,!immutable,!session - enable all preset
rules except immutable and session
CIS controls
4.1.3 (time-change)
4.1.4 (identity)
4.1.5 (system-locale)
4.1.6 (mac-policy)
4.1.7 (logins)
4.1.8 (session)
4.1.9 (perm-mod)
4.1.10 (access)
4.1.11 (privileged)
4.1.12 (mounts)
4.1.13 (delete)
4.1.14 (scope)
4.1.15 (actions)
4.1.16 (modules)
4.1.17 (immutable)
Docker CIS controls
1.1.4
1.1.8
1.1.10
1.1.12
1.1.13
1.1.15
1.1.16
1.1.17
1.1.18
1.2.3
1.2.4
1.2.5
1.2.6
1.2.7
1.2.10
1.2.11
Optional. Colocate the OpenStack control plane with the managed cluster
Kubernetes manager nodes by adding the following field to the Cluster
object spec:
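The sketch below shows the expected shape of this setting; the field name is an assumption and may differ between releases, so verify it against the Cluster object reference before use.

spec:
  providerSpec:
    value:
      dedicatedControlPlane: false   # assumption: field name may differ per release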
This feature is available as technical preview. Use such
configuration for testing and evaluation purposes only.
Optional. Customize MetalLB speakers that are deployed on all Kubernetes
nodes except master nodes by default. For details, see
Configure the MetalLB speaker node selector.
Configure the MetalLB parameters related to IP address allocation and
announcement for load-balanced cluster services. For details, see
Configure MetalLB.
This section describes how to set up and verify MetalLB parameters before
configuring subnets for a MOSK cluster.
Caution
This section also applies to the bootstrap procedure of a
management cluster with the following differences:
Instead of the Cluster object, configure
templates/bm/cluster.yaml.template.
Instead of the MetalLBConfig object, configure
templates/bm/metallbconfig.yaml.template.
Instead of creating specific IPAM objects such as Subnet and
L2Template (as well as Rack and MultiRackCluster when using
BGP configuration), add their settings to
templates/bm/ipam-objects.yaml.template.
Configuration rules for the MetalLBConfig object
Caution
The use of the MetalLBConfig object is mandatory after your
management cluster upgrade to the Cluster release 16.0.0.
The following rules and requirements apply to configuration of the
MetalLBConfig object:
Define one MetalLBConfig object per cluster.
Define the following mandatory labels:
cluster.sigs.k8s.io/cluster-name
Specifies the cluster name where the MetalLB address pool is used.
kaas.mirantis.com/region
Specifies the region name of the cluster where the MetalLB address pool is
used.
kaas.mirantis.com/provider
Specifies the provider of the cluster where the MetalLB address pool is used.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Intersection of IP address ranges within any single MetalLB address pool
is not allowed.
At least one MetalLB address pool must have the auto-assign policy enabled
so that unannotated services can have load balancer IP addresses allocated
for them.
When configuring multiple address pools with the auto-assign policy enabled,
keep in mind that it is not determined in advance which pool of those
multiple address pools is used to allocate an IP address for a particular
unannotated service.
Note
You can optimize address announcement for load-balanced services
using the interfaces selector for the l2Advertisements object. This
selector allows for address announcement only on selected host interfaces.
For details, see Container Cloud API Reference: MetalLBConfig spec.
Configuration rules for MetalLBConfigTemplate (obsolete since 24.2)
Caution
The MetalLBConfigTemplate object is deprecated since
MOSK 24.2 and unsupported since
MOSK 24.3. For details, see
Deprecation notes.
All rules described above for MetalLBConfig also apply to
MetalLBConfigTemplate.
Optional. Define one MetalLBConfigTemplate object per cluster.
The use of this object without MetalLBConfig is not allowed.
When using MetalLBConfigTemplate:
MetalLBConfig must reference MetalLBConfigTemplate by name:
spec:
  templateName: <managed-metallb-template>
You can use Subnet objects for defining MetalLB address pools.
Refer to MetalLB configuration guidelines for subnets for guidelines on configuring
MetalLB address pools using Subnet objects.
You can optimize address announcement for load-balanced services using
the interfaces selector for the l2Advertisements object. This
selector allows for address announcement only on selected host
interfaces. For details, see Container Cloud API MetalLBConfigTemplate
spec.
The BGP configuration is not yet supported in the Container Cloud
web UI. In the meantime, use the CLI for this purpose. For details, see
Configure and verify MetalLB using the CLI.
Optional. Configure parameters related to the MetalLB component life cycle,
such as deployment and update, using the metallb Helm chart values in
the Cluster spec section.
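For example, assuming the standard helmReleases layout of the Cluster spec (the speaker values shown are illustrative only):

spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          speaker:
            tolerations:                                   # illustrative values
            - key: node-role.kubernetes.io/control-plane
              effect: NoSchedule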
Log in to the Container Cloud web UI with the writer permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
Caution
Do not create a MOSK cluster in the default project
(Kubernetes namespace), which is dedicated for the management cluster only.
If no projects are defined, first create a new mosk project as described
in Create a project for MOSK clusters.
In the Networks section, click the MetalLB Configs
tab.
Click Create MetalLB Config.
Fill out the Create MetalLB Config form as required:
Name
Name of the MetalLB object being created.
Cluster
Name of the cluster that the MetalLB object is being created
for
IP Address Pools
List of MetalLB IP address pool descriptions that will be used to create
the MetalLB IPAddressPool objects. Click the + button on
the right side of the section to add more objects.
Name
IP address pool name.
Addresses
Comma-separated ranges of the IP addresses included into the address
pool.
Auto Assign
Enable auto-assign policy for unannotated services to have load
balancer IP addresses allocated to them. At least one MetalLB address
pool must have the auto-assign policy enabled.
Service Allocation
IP address pool allocation to services. Click Edit to
insert a service allocation object with required label selectors for
services in the YAML format. For example:
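The sketch below uses the upstream MetalLB serviceAllocation format; the priority value and service selector labels are illustrative only.

serviceAllocation:
  priority: 100                          # illustrative priority
  serviceSelectors:
  - matchLabels:
      app.kubernetes.io/name: example-app   # illustrative label selector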
L2 Advertisements
List of L2 advertisement descriptions that will be used to create the
MetalLB L2Advertisement objects.
The l2Advertisements object allows defining interfaces to optimize
the announcement. When you use the interfaces selector, LB addresses
are announced only on selected host interfaces.
Mirantis recommends using the interfaces selector if nodes use separate
host networks for different types of traffic. The pros of such configuration
are as follows: less spam on other interfaces and networks and limited chances
to reach IP addresses of load-balanced services from irrelevant interfaces and
networks.
Caution
Interface names in the interfaces list must match those
on the corresponding nodes.
Add the following parameters:
Name
Name of the l2Advertisements object.
Interfaces
Optional. Comma-separated list of interface names that must match
the ones on the corresponding nodes. These names are defined in
L2 templates that are linked to the selected cluster.
IP Address Pools
Select the IP address pool to use for the l2Advertisements
object.
Node Selectors
Optional. Match labels and values for the Kubernetes node selector
to limit the nodes announced as next hops for the LoadBalancer
IP. If you do not provide any labels, all nodes are announced as
next hops.
In Networks > MetalLB Configs, verify the status of the created
MetalLB object:
Ready - object is operational.
Error - object is non-operational. Hover over the status
to obtain details of the issue.
Note
To verify the object details, in
Networks > MetalLB Configs, click the More action
icon in the last column of the required object section and select
MetalLB Config info.
Proceed to creating cluster subnets as described in
Create subnets.
Optional. Configure parameters related to the MetalLB component life cycle,
such as deployment and update, using the metallb Helm chart values in
the Cluster spec section, as shown in the example above.
In the Technology Preview scope, you can use BGP for announcement of
external addresses of Kubernetes load-balanced services for a
MOSK cluster. To configure the BGP announcement mode
for MetalLB, use the MetalLBConfig object.
The use of BGP is required to announce IP addresses for load-balanced
services when using MetalLB on nodes that are distributed across
multiple racks. In this case, you must set rack-id labels on nodes;
these labels are used in node selectors for the BGPPeer,
BGPAdvertisement, or both MetalLB objects to properly configure
BGP connections from each node.
Configuration example of the Machine object for the
BGP announcement mode
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-compute-1
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    ipam/RackRef: rack-1                    # reference to the "rack-1" Rack
    kaas.mirantis.com/provider: baremetal
spec:
  providerSpec:
    value:
      ...
      nodeLabels:
      - key: rack-id    # node label can be used in "nodeSelectors" inside
        value: rack-1   # "BGPPeer" and/or "BGPAdvertisement" MetalLB objects
      ...
Configuration example of the MetalLBConfig
object for the BGP announcement mode
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfig
metadata:
  name: test-cluster-metallb-config
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
spec:
  ...
  bgpPeers:
  - name: svc-peer-1
    spec:
      holdTime: 0s
      keepaliveTime: 0s
      peerAddress: 10.77.42.1
      peerASN: 65100
      myASN: 65101
      nodeSelectors:
      - matchLabels:
          rack-id: rack-1   # references the nodes having
                            # the "rack-id=rack-1" label
  bgpAdvertisements:
  - name: services
    spec:
      aggregationLength: 32
      aggregationLengthV6: 128
      ipAddressPools:
      - services
      peers:
      - svc-peer-1
  ...
The bgpPeers and bgpAdvertisements fields are used to
configure BGP announcement instead of l2Advertisements.
The use of BGP for announcement also allows for better balancing
of service traffic between cluster nodes as well as gives more
configuration control and flexibility for infrastructure administrators.
For configuration examples, refer to Examples of MetalLBConfig.
For configuration procedure, refer to Configure BGP announcement for cluster API LB address.
Since MOSK 23.2
Select from the following options:
Deprecated since MOSK 24.2 and unsupported since
MOSK 24.3.
Mandatory after a management cluster upgrade to the Cluster release
16.0.0. Recommended and default since MOSK 23.2 in
the Technology Preview scope.
Create MetalLBConfig and MetalLBConfigTemplate objects.
This method allows using the Subnet object to define MetalLB
address pools.
Since MOSK 23.2.2, in the Technology Preview scope,
you can use BGP for announcement of external addresses of Kubernetes
load-balanced services for a MOSK cluster.
To configure the BGP announcement mode for MetalLB, use
MetalLBConfig and MetalLBConfigTemplate objects.
The use of BGP is required to announce IP addresses for load-balanced
services when using MetalLB on nodes that are distributed across
multiple racks. In this case, you must set rack-id labels on nodes;
these labels are used in node selectors for the BGPPeer,
BGPAdvertisement, or both MetalLB objects to properly configure
BGP connections from each node.
Configuration example of the Machine object for the
BGP announcement mode
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-compute-1
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    ipam/RackRef: rack-1                    # reference to the "rack-1" Rack
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  providerSpec:
    value:
      ...
      nodeLabels:
      - key: rack-id    # node label can be used in "nodeSelectors" inside
        value: rack-1   # "BGPPeer" and/or "BGPAdvertisement" MetalLB objects
      ...
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Configuration example of the MetalLBConfigTemplate
object for the BGP announcement mode
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfigTemplate
metadata:
  name: test-cluster-metallb-config-template
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  templates:
    ...
    bgpPeers: |
      - name: svc-peer-1
        spec:
          peerAddress: 10.77.42.1
          peerASN: 65100
          myASN: 65101
          nodeSelectors:
          - matchLabels:
              rack-id: rack-1   # references the nodes having
                                # the "rack-id=rack-1" label
    bgpAdvertisements: |
      - name: services
        spec:
          ipAddressPools:
          - services
          peers:
          - svc-peer-1
    ...
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
The bgpPeers and bgpAdvertisements fields are used to
configure BGP announcement instead of l2Advertisements.
The use of BGP for announcement also allows for better balancing
of service traffic between cluster nodes as well as gives more
configuration control and flexibility for infrastructure administrators.
For configuration examples, refer to Examples of MetalLBConfigTemplate.
For configuration procedure, refer to Configure BGP announcement for cluster API LB address.
Not recommended. Configure the configInline value in the MetalLB
chart of the Cluster object.
Warning
This option is deprecated since MOSK
23.2 and is removed during the management cluster upgrade to the
Cluster release 16.0.0, which is introduced in Container Cloud 2.25.0.
Therefore, this option becomes unavailable on MOSK
23.2 clusters after the parent management cluster upgrade to 2.25.0.
Not recommended. Configure the Subnet objects without
MetalLBConfigTemplate.
Warning
This option is deprecated since MOSK
23.2 and is removed during the management cluster upgrade to the
Cluster release 16.0.0, which is introduced in Container Cloud 2.25.0.
Therefore, this option becomes unavailable on MOSK
23.2 clusters after the parent management cluster upgrade to 2.25.0.
Caution
If the MetalLBConfig object is not used for MetalLB
configuration related to address allocation and announcement for
load-balanced services, then automated migration applies during
cluster creation or update to MOSK 23.2.
During automated migration, the MetalLBConfig and
MetalLBConfigTemplate objects are created, and the contents of the MetalLB
chart configInline value are converted to the parameters of the
MetalLBConfigTemplate object.
Any change to the configInline value made on a
MOSK 23.2 cluster will be reflected in the
MetalLBConfigTemplate object.
This automated migration is removed during your management cluster
upgrade to the Cluster release 16.0.0, which is introduced in Container
Cloud 2.25.0, together with the possibility to use the configInline
value of the MetalLB chart. After that, any changes in MetalLB
configuration related to address allocation and announcement for
load-balanced services are applied using the MetalLBConfig,
MetalLBConfigTemplate, and Subnet objects only.
Configure the configInline value for the MetalLB chart in the
Cluster object.
Configure both the configInline value for the MetalLB chart and
Subnet objects.
The resulting MetalLB address pools configuration will contain address
ranges from both cluster specification and Subnet objects.
All address ranges for L2 address pools will be aggregated into a single
L2 address pool and sorted as strings.
Changes to be applied since MOSK 23.2
The configuration options above become deprecated since 23.2, and
automated migration of MetalLB parameters applies during cluster creation
or update to MOSK 23.2.
During automated migration, the MetalLBConfig and
MetalLBConfigTemplate objects are created, and the contents of the
MetalLB chart configInline value are converted to the parameters of
the MetalLBConfigTemplate object.
Any change to the configInline value made on a
MOSK 23.2 cluster will be reflected in the
MetalLBConfigTemplate object.
This automated migration is removed during your management cluster
upgrade to Container Cloud 2.25.0 together with the possibility to use
the configInline value of the MetalLB chart. After that, any
changes in MetalLB configuration related to address allocation and
announcement for load-balanced services will be applied using the
MetalLBConfigTemplate and Subnet objects only.
Verify the current MetalLB configuration:
Since MOSK 22.5
Verify the MetalLB configuration that is stored in MetalLB objects:
The auto-assign parameter will be set to false for all address
pools except the default one. So, a particular service will get an
address from such an address pool only if the Service object has a
special metallb.universe.tf/address-pool annotation that points to
the specific address pool name.
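For example, a Service that requests an address from a specific pool carries the annotation as in the following sketch (the service and pool names are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: example-ui                                  # illustrative service name
  namespace: example-ns
  annotations:
    metallb.universe.tf/address-pool: services-pxe  # name of the target address pool
spec:
  type: LoadBalancer
  selector:
    app: example-ui
  ports:
  - port: 443
    targetPort: 8443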
Note
It is expected that every Kubernetes service on a management
cluster will be assigned to one of the address pools. Current
consideration is to have two MetalLB address pools:
services-pxe is a reserved address pool name to use for the
Kubernetes services in the PXE network (Ironic API, HTTP server,
caching server).
default is an address pool to use for all other Kubernetes services
in the management network. No annotation is required on the Service
objects in this case.
Proceed to creating cluster subnets as described in
Create subnets.
By default, MetalLB speakers are deployed on all Kubernetes nodes except
master nodes.
You can configure MetalLB to run its speakers on a particular set of nodes.
This decreases the number of nodes that must be connected to the external
network. In this scenario, only a few nodes are exposed for ingress
traffic from the outside world.
To customize the MetalLB speaker node selector:
Using kubeconfig of the Container Cloud management cluster, open the
MOSK Cluster object for editing:
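After opening the Cluster object (for example, with kubectl edit), set the speaker node selector through the metallb Helm chart values. The layout below is a sketch; the exact values structure may differ between releases.

spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          speaker:
            nodeSelector:
              metallbSpeakerEnabled: "true"   # label expected on nodes that must run MetalLB speakers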
The metallbSpeakerEnabled:"true" parameter in this example is the
label on Kubernetes nodes where MetalLB speakers will be deployed.
It can be an already existing node label or a new one.
You can add user-defined labels to nodes using the nodeLabels field.
This field contains the list of node labels to be attached to a node for the
user to run certain components on separate cluster nodes. The list of allowed
node labels is located in the Cluster object status
providerStatus.releaseRef.current.allowedNodeLabels field.
If the value field is not defined in allowedNodeLabels, a label can
have any value. For example:
Before or after a machine deployment, add the required label from the allowed
node labels list with the corresponding value to
spec.providerSpec.value.nodeLabels in machine.yaml. For example:
nodeLabels:
- key: stacklight
  value: enabled
Adding a node label that is not available in the list of allowed node
labels is restricted.
Configure BGP announcement for cluster API LB address
Available since MOSK 23.2.2. Technology Preview.
When you create a MOSK cluster with the multi-rack topology,
where Kubernetes masters are distributed across multiple racks
without an L2 layer extension between them, you must configure
BGP announcement of the cluster API load balancer address.
For clusters where Kubernetes masters are in the same rack or with an L2 layer
extension between masters, you can configure either BGP or L2 (ARP)
announcement of the cluster API load balancer address.
The L2 (ARP) announcement is used by default and its configuration is covered
in Create a managed bare metal cluster.
Caution
Create Rack and MultiRackCluster objects, which are
described in the procedure below, before initiating the provisioning
of master nodes to ensure that both BGP and netplan configurations
are applied simultaneously during the provisioning process.
To enable the use of BGP announcement for the cluster API LB address:
In the Cluster object, set the useBGPAnnouncement parameter
to true:
spec:
  providerSpec:
    value:
      useBGPAnnouncement: true
Create the MultiRackCluster object that is mandatory when configuring
BGP announcement for the cluster API LB address. This object enables you
to set cluster-wide parameters for configuration of BGP announcement.
In this scenario, the MultiRackCluster object must be bound to the
corresponding Cluster object using the
cluster.sigs.k8s.io/cluster-name label.
Container Cloud uses the bird BGP daemon for announcement of the cluster
API LB address. For this reason, set the corresponding
bgpdConfigFileName and bgpdConfigFilePath parameters in the
MultiRackCluster object, so that bird can locate the configuration
file. For details, see the configuration example below.
The bgpdConfigTemplate object contains the default configuration file
template for the bird BGP daemon, which you can override in Rack
objects.
The defaultPeer parameter contains default parameters of the BGP
connection from master nodes to infrastructure BGP peers, which you can
override in Rack objects.
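Configuration example for MultiRackCluster (a sketch only; the file name, path, and ASN values are illustrative and must be adapted, so verify the field names against the Container Cloud API Reference):

apiVersion: ipam.mirantis.com/v1alpha1
kind: MultiRackCluster
metadata:
  name: multirack-test-cluster
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
spec:
  bgpdConfigFileName: bird.conf   # illustrative: file name that bird reads its configuration from
  bgpdConfigFilePath: /etc/bird   # illustrative: path where the configuration file is placed
  bgpdConfigTemplate: |           # default template, can be overridden in Rack objects
    ...
  defaultPeer:                    # default BGP peer parameters, can be overridden in Rack objects
    localASN: 65101
    neighborASN: 65100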
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Create the Rack object(s). This object is mandatory when configuring
BGP announcement for the cluster API LB address and it allows you
to configure BGP announcement parameters for each rack.
In this scenario, Rack objects must be bound to Machine objects
corresponding to master nodes of the cluster.
Each Rack object describes the configuration for the bird BGP
daemon used to announce the cluster API LB address from a particular
master node or from several master nodes in the same rack.
Set a reference to the Rack object used to configure the bird BGP
daemon for a particular master node to announce the cluster API LB IP:
Since MOSK 25.1
In the Machine objects for all master nodes, set the ipam/RackRef
label with the value equal to the name of the corresponding Rack
object. For example:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  labels:
    ipam/RackRef: rack-master-1   # reference to the "rack-master-1" Rack
...
Before MOSK 25.1 (deprecated)
In the BareMetalHost objects for all cluster nodes, set the
ipam.mirantis.com/rack-ref annotation with the value equal to the name
of the corresponding Rack object. For example:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  annotations:
    ipam.mirantis.com/rack-ref: rack-master-1   # reference to the "rack-master-1" Rack
...
Optional. Using the Machine object, define the rack-id node label
that is not used for BGP announcement of the cluster API LB IP but
can be used for MetalLB.
The rack-id node label is required for MetalLB node selectors when
MetalLB is used to announce LB IP addresses on nodes that are distributed
across multiple racks. In this scenario, the L2 (ARP) announcement mode
cannot be used for MetalLB because master nodes are in different L2
segments. So, the BGP announcement mode must be used for MetalLB, and node
selectors are required to properly configure BGP connections from each node.
See Configure MetalLB for details.
The L2Template object includes the lo interface configuration
to set the IP address for the bird BGP daemon that will be advertised
as the cluster API LB address. The {{ cluster_api_lb_ip }}
function is used in npTemplate to obtain the cluster API LB address
value.
Configuration example for Rack
apiVersion: ipam.mirantis.com/v1alpha1
kind: Rack
metadata:
  name: rack-master-1
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  bgpdConfigTemplate: |   # optional
    ...
  peeringMap:
    lcm-rack-control-1:
      peers:
      - neighborIP: 10.77.31.2   # "localASN" & "neighborASN" are taken from
      - neighborIP: 10.77.31.3   # "MultiRackCluster.spec.defaultPeer" if
                                 # not set here
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Configuration example for Machine
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-master-1
  namespace: mosk-ns
  annotations:
    metal3.io/BareMetalHost: mosk-ns/test-cluster-master-1
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    cluster.sigs.k8s.io/control-plane: controlplane
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    ipam/RackRef: rack-master-1   # reference to the "rack-master-1" Rack
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  providerSpec:
    value:
      kind: BareMetalMachineProviderSpec
      apiVersion: baremetal.k8s.io/v1alpha1
      hostSelector:
        matchLabels:
          kaas.mirantis.com/baremetalhost-id: test-cluster-master-1
      l2TemplateSelector:
        name: test-cluster-master-1
      nodeLabels:               # optional. it is not used for BGP announcement
      - key: rack-id            # of the cluster API LB IP but it can be used
        value: rack-master-1    # for MetalLB if "nodeSelectors" are required
...
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Configuration example for L2Template
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
  name: test-cluster-master-1
  namespace: mosk-ns
spec:
  ...
  l3Layout:
  - subnetName: lcm-rack-control-1   # this network is referenced
    scope: namespace                 # in the "rack-master-1" Rack
  - subnetName: ext-rack-control-1   # optional. this network is used
    scope: namespace                 # for k8s services traffic and
                                     # MetalLB BGP connections
  ...
  npTemplate: |
    ...
    ethernets:
      lo:
        addresses:
        - {{ cluster_api_lb_ip }}   # function for cluster API LB IP
        dhcp4: false
        dhcp6: false
    ...
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
This section also applies to the bootstrap procedure of a
management cluster with the following difference: instead of creating the
Subnet object, add its configuration to
ipam-objects.yaml.template located in kaas-bootstrap/templates/bm/.
The Kubernetes Subnet object is created for a management cluster from
templates during bootstrap.
Each Subnet object can define either a MetalLB address range or MetalLB
address pool. A MetalLB address pool may contain one or several address
ranges. The following rules apply to creation of address ranges or pools:
To designate a subnet as a MetalLB address pool or range, use
the ipam/SVC-MetalLB label key. Set the label value to "1".
The object must contain the cluster.sigs.k8s.io/cluster-name label to
reference the name of the target cluster where the MetalLB address pool
is used.
You may create multiple subnets with the ipam/SVC-MetalLB label to
define multiple IP address ranges or multiple address pools for MetalLB in
the cluster.
The IP addresses of the MetalLB address pool are not assigned to the
interfaces on hosts. This subnet is virtual. Do not include such subnets
in the L2 template definitions for your cluster.
If a Subnet object defines a MetalLB address range, no additional
object properties are required.
You can use any number of Subnet objects that define a single MetalLB
address range. In this case, all address ranges are aggregated into
a single MetalLB L2 address pool named services having the auto-assign
policy enabled.
Intersection of IP address ranges within any single MetalLB address pool
is not allowed.
The bare metal provider verifies intersection of IP address ranges.
If it detects intersection, the MetalLB configuration is blocked and
the provider logs contain corresponding error messages.
Use the following labels to identify the Subnet object as a MetalLB
address pool and configure the name and protocol for that address pool.
All labels below are mandatory for the Subnet object that configures
a MetalLB address pool.
Mandatory Subnet labels for a MetalLB address pool
Label
Description
Labels to link Subnet to the target MOSK
clusters within a management cluster.
cluster.sigs.k8s.io/cluster-name
Specifies the cluster name where the MetalLB address pool is used.
kaas.mirantis.com/region
Specifies the region name of the cluster where the MetalLB address pool is
used.
kaas.mirantis.com/provider
Specifies the provider of the cluster where the MetalLB address pool is used.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
ipam/SVC-MetalLB
Defines that the Subnet object will be used to provide
a new address pool or range for MetalLB.
metallb/address-pool-name
Every address pool must have a distinct name.
The services-pxe address pool is mandatory when configuring
a dedicated PXE network in the management cluster. This name will be
used in annotations for services exposed through the PXE network.
A bootstrap cluster also uses the services-pxe address pool
for its provision services so that management cluster nodes can be
provisioned from the bootstrap cluster. After a management cluster is
deployed, the bootstrap cluster is deleted and that address pool is
solely used by the newly deployed cluster.
metallb/address-pool-auto-assign
Configures the auto-assign policy of an address pool. Boolean.
Caution
For the address pools defined using the MetalLB Helm chart
values in the Cluster spec section, the auto-assign policy is
set to true and is not configurable.
For any service that does not have a specific MetalLB annotation
configured, MetalLB allocates external IPs from arbitrary address
pools that have the auto-assign policy set to true.
MetalLB allocates external IPs from an address pool that has the
auto-assign policy set to false only for the services that have the
specific MetalLB annotation with that address pool name.
metallb/address-pool-protocol
Sets the address pool protocol.
The only supported value is layer2 (default).
Caution
Do not set the same address pool name for two or more
Subnet objects. Otherwise, the corresponding MetalLB address pool
configuration fails with a warning message in the bare metal provider log.
Caution
For the auto-assign policy, the following configuration
rules apply:
At least one MetalLB address pool must have the auto-assign
policy enabled so that unannotated services can have load balancer IPs
allocated for them. To satisfy this requirement, either configure one
of address pools using the Subnet object with
metallb/address-pool-auto-assign:"true" or configure address
range(s) using the Subnet object(s) without
metallb/address-pool-* labels.
When configuring multiple address pools with the auto-assign policy
enabled, keep in mind that it is not determined in advance which pool of
those multiple address pools is used to allocate an IP for a particular
unannotated service.
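For example, a Subnet object that defines a MetalLB address pool with the labels described above may look like the following sketch (the object name and address range are illustrative):

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: metallb-svc-pool                    # illustrative name
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    ipam/SVC-MetalLB: "1"
    metallb/address-pool-name: services
    metallb/address-pool-auto-assign: "true"
    metallb/address-pool-protocol: layer2
spec:
  cidr: 10.0.34.0/24                        # illustrative addresses
  includeRanges:
  - 10.0.34.101-10.0.34.120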
To simplify operations with L2 templates, before you start creating
them, inspect the general workflow of a network interface name gathering
and processing.
Network interface naming workflow:
The Operator creates a BareMetalHostInventory object.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
The BareMetalHostInventory object executes the introspection stage
and becomes ready.
The Operator collects information about NIC count, naming, and so on
for further changes in the mapping logic.
At this stage, the order of NICs in the object may change randomly
during each introspection, but the NIC names always stay the same.
For more details, see Predictable Network Interface Names.
For example:
# Example commands:
# kubectl -n managed-ns get bmh baremetalhost1 -o custom-columns='NAME:.metadata.name,STATUS:.status.provisioning.state'
# NAME            STATE
# baremetalhost1  ready
# kubectl -n managed-ns get bmh baremetalhost1 -o yaml
# Example output:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
...
status:
  ...
  nics:
  - ip: fe80::ec4:7aff:fe6a:fb1f%eno2
    mac: 0c:c4:7a:6a:fb:1f
    model: 0x8086 0x1521
    name: eno2
    pxe: false
  - ip: fe80::ec4:7aff:fe1e:a2fc%ens1f0
    mac: 0c:c4:7a:1e:a2:fc
    model: 0x8086 0x10fb
    name: ens1f0
    pxe: false
  - ip: fe80::ec4:7aff:fe1e:a2fd%ens1f1
    mac: 0c:c4:7a:1e:a2:fd
    model: 0x8086 0x10fb
    name: ens1f1
    pxe: false
  - ip: 192.168.1.151   # Temp. PXE network address
    mac: 0c:c4:7a:6a:fb:1e
    model: 0x8086 0x1521
    name: eno1
    pxe: true
...
The Operator selects from the following options:
Create an l2template object with the ifMapping configuration.
For details, see Create L2 templates.
The baremetal-provider service links the Machine object
to the BareMetalHostInventory object.
The kaas-ipam and baremetal-provider services collect hardware
information from the BareMetalHostInventory object and use it to
configure host networking and services.
The kaas-ipam service:
Spawns the IpamHost object.
Renders the l2template object.
Spawns the ipaddr object.
Updates the IpamHost object status with all rendered
and linked information.
The baremetal-provider service collects the rendered networking
information from the IpamHost object.
The baremetal-provider service proceeds with the IpamHost object
provisioning.
After creating the MetalLB configuration as described in Configure MetalLB
and before creating L2 templates, ensure that you have the required subnets
that can be used in the L2 template to allocate IP addresses for the
MOSK cluster nodes.
Where required, create a number of subnets for a particular project
using the Subnet CR. A subnet has the following logical scopes:
Each subnet used in an L2 template has its logical scope that is set using the
scope parameter in the corresponding L2Template.spec.l3Layout section.
One of the following logical scopes is used for each subnet referenced in an
L2 template:
global - CR uses the default namespace.
A subnet can be used for any cluster located in any project.
namespaced - CR uses the namespace that corresponds to a particular project
where MOSK clusters are located. A subnet can be used
for any cluster located in the same project.
cluster - Unsupported since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). CR uses the namespace where the referenced cluster is located.
A subnet is only accessible to the cluster that
L2Template.metadata.labels:cluster.sigs.k8s.io/cluster-name (mandatory
since MOSK 23.3) or L2Template.spec.clusterRef
(deprecated since MOSK 23.3) refers to. The Subnet
objects with the cluster scope will be created for every new cluster.
Note
The use of the ipam/SVC-MetalLB label in Subnet objects
is unsupported as part of the MetalLBConfigTemplate object deprecation
since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0). No
actions are required for existing objects. A Subnet object containing
this label will be ignored by baremetal-provider after cluster update
to the mentioned Cluster releases.
You can have subnets with the same name in different projects.
In this case, the subnet that is in the same project
as the cluster will be used. One L2 template may reference several
subnets; in this case, those subnets may have different scopes.
The IP address objects (IPaddr CR) that are allocated from subnets
always have the same project as their corresponding IpamHost objects,
regardless of the subnet scope.
You can create subnets using either the Container Cloud web UI or CLI.
Any Subnet object may contain ipam/SVC-<serviceName> labels.
All IP addresses allocated from the Subnet object that has service labels
defined inherit those labels.
When a particular IpamHost uses IP addresses allocated from such labeled
Subnet objects, the ServiceMap field in IpamHost.Status
contains information about which IPs and interfaces correspond to which service
labels (that have been set in the Subnet objects). Using ServiceMap,
you can understand what IPs and interfaces of a particular host are used
for network traffic of a given service.
Container Cloud uses the following service labels that allow using of the
specific subnets for particular Container Cloud services:
ipam/SVC-k8s-lcm
ipam/SVC-ceph-cluster
ipam/SVC-ceph-public
ipam/SVC-dhcp-range
ipam/SVC-MetalLB (unsupported since 24.3)
ipam/SVC-LBhost
Caution
The use of the ipam/SVC-k8s-lcm label is mandatory for every
cluster.
Important
A label value is not mandatory and can be empty, but it must
match the value in the related L2Template object in which the
corresponding subnet is used. Otherwise, the network configuration for
the related hosts will not be rendered because the referenced subnets
cannot be found.
You can also add custom service labels to the Subnet objects the same way
you add Container Cloud service labels. The mapping of IPs and interfaces to
the defined services is displayed in IpamHost.Status.ServiceMap.
You can assign multiple service labels to one network. You can also assign the
ceph-* and dhcp-range services to multiple networks. In the latter
case, the system sorts the IP addresses in the ascending order:
You can add service labels during creation of subnets as described in
Create subnets.
Create subnets for a managed cluster using web UI
After creating the MetalLB configuration as described in Configure MetalLB
and before creating an L2 template, create the required subnets to use in the
L2 template to allocate IP addresses for the managed cluster nodes.
To create subnets for a managed cluster using web UI:
Log in to the Container Cloud web UI with the operator permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
Caution
Do not create a MOSK cluster in the default project
(Kubernetes namespace), which is dedicated for the management cluster only.
If no projects are defined, first create a new mosk project as described
in Create a project for MOSK clusters.
Storage access
Available in the web UI since Container Cloud 2.28.0 (17.3.0 and 16.3.0).
Storage access subnet.
Storage replication
Available in the web UI since Container Cloud 2.28.0 (17.3.0 and 16.3.0).
Storage replication subnet.
Custom
Custom subnet. For example, external or Kubernetes workloads.
MetalLB
Services subnet(s).
Warning
Since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0), disregard this parameter during subnet creation.
Configure MetalLB separately as described in
Configure MetalLB.
This parameter is removed from the Container Cloud web UI in
Container Cloud 2.29.0 (Cluster releases 17.4.0 and 16.4.0).
Cluster
Cluster name that the subnet is being created for. Required for all
subnet types except DHCP.
CIDR
A valid IPv4 address of the subnet in the CIDR notation, for example,
10.11.0.0/24.
Include Ranges (optional)
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
Exclude Ranges (optional)
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
Gateway (optional)
A valid IPv4 gateway address, for example, 10.11.0.9. Does not apply
to the MetalLB subnet.
Nameservers
IP addresses of nameservers separated by a comma. Does not apply
to the DHCP and MetalLB subnet types.
Use whole CIDR
Optional. Select to use the whole IPv4 address range that is set in
the CIDR field. Useful when defining single IP address (/32),
for example, in the Cluster API load balancer (LB) subnet.
If not set, the network address and broadcast address in the IP
subnet are excluded from the address allocation.
Labels
Key-value pairs attached to the selected subnet:
Caution
The values of the created subnet labels must match the
ones in the spec.l3Layout section of the corresponding
L2Template object.
Click Add a label and assign the first custom label
with the required name and value. To assign consecutive labels,
use the + button located in the right side of the
Labels section.
MetalLB:
Warning
Since Container Cloud 2.28.0 (Cluster releases
17.3.0 and 16.3.0), disregard this label during subnet
creation. Configure MetalLB separately as described in
Configure MetalLB.
The label will be removed from the Container Cloud web UI in
one of the following releases.
metallb/address-pool-name
Name of the subnet address pool. Exemplary values:
services, default, external, services-pxe.
In the Networks tab, verify the status of the created
subnet:
Ready - object is operational.
Error - object is non-operational. Hover over the status
to obtain details of the issue.
Note
To verify subnet details, in the Networks tab,
click the More action icon in the last column of the
required subnet and select Subnet info.
Before Container Cloud 2.26.0 (Cluster releases 17.0.0, 16.0.0, or earlier)
In the Clusters tab, click the required cluster and scroll
down to the Subnets section.
Click Add Subnet.
Fill out the Add new subnet form as required:
Subnet Name
Subnet name.
CIDR
A valid IPv4 CIDR, for example, 10.11.0.0/24.
Include Ranges (optional)
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
Exclude Ranges (optional)
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
After creating the MetalLB configuration as described in Configure MetalLB
and before creating an L2 template, create the required subnets to use in the
L2 template to allocate IP addresses for the managed cluster nodes.
To create subnets for a cluster using CLI:
Create a subnet using one of the following options:
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
includeRanges (list)
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
excludeRanges (list)
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
useWholeCidr (boolean)
If set to true, the subnet address (10.11.0.0 in the example
above) and the broadcast address (10.11.0.255 in the example above)
are included into the address allocation for nodes. Otherwise,
(false by default), the subnet address and broadcast address
are excluded from the address allocation.
gateway (singular)
A valid gateway address, for example, 10.11.0.9.
nameservers (list)
A list of the IP addresses of name servers. Each element of the list
is a single address, for example, 172.18.176.6.
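Putting these fields together, a Subnet object may look like the following sketch (the object name, addresses, and the LCM service label are illustrative):

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: lcm-subnet                  # illustrative name
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    ipam/SVC-k8s-lcm: "1"           # marks the subnet as an LCM network subnet
spec:
  cidr: 10.11.0.0/24
  includeRanges:
  - 10.11.0.5-10.11.0.70
  excludeRanges:
  - 10.11.0.17
  gateway: 10.11.0.9
  nameservers:
  - 172.18.176.6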
Note
You may use different subnets to allocate IP addresses to
different Container Cloud components in your cluster. Add a label with
the ipam/SVC- prefix to each subnet that is used to configure a
Container Cloud service. For details, see Service labels and their life cycle and the
optional steps below.
Caution
Use of a dedicated network for Kubernetes pods traffic,
for external connection to the Kubernetes services exposed
by the cluster, and for the Ceph cluster access and replication
traffic is available as Technology Preview. Use such
configurations for testing and evaluation purposes only.
For the Technology Preview feature definition,
refer to Technology Preview features.
Add one or more subnets for the LCM network:
Set the ipam/SVC-k8s-lcm label with the value "1" to create
a subnet that will be used to assign IP addresses in the LCM network.
Optional. Set the cluster.sigs.k8s.io/cluster-name label to the name
of the target cluster during the subnet creation.
Use this subnet in the L2 template for cluster nodes.
Using the L2 template, assign this subnet to the interface connected to
your LCM network.
Precautions for the LCM network usage
Each cluster must use at least one subnet for its LCM network.
Every node must have the address allocated in the LCM network
using such subnet(s).
Each node of every cluster must have one and only one IP address in the LCM
network that is allocated from one of the Subnet
objects having the ipam/SVC-k8s-lcm label defined. Therefore, all
Subnet objects used for LCM networks must have the ipam/SVC-k8s-lcm
label defined.
You can use any interface name for the LCM network traffic.
The Subnet objects for the LCM network must have the
ipam/SVC-k8s-lcm label. For details, see Service labels and their life cycle.
Optional. Technology Preview. Add a subnet for the externally accessible
API endpoint of the MOSK cluster.
Make sure that loadBalancerHost is set to "" (empty string)
in the Cluster spec.
Create a subnet with the ipam/SVC-LBhost label having the "1"
value to make the baremetal-provider use this subnet for allocation
of addresses for cluster API endpoints.
One IP address will be allocated for each cluster to serve its
Kubernetes/MKE API endpoint.
Caution
Make sure that master nodes have host IP addresses
in the same subnet as the cluster API endpoint address. These host IP
addresses will be used for VRRP traffic. The cluster API endpoint address
will be assigned to the same interface on one of the master nodes where
these host IPs are assigned.
Note
Mirantis highly recommends that you assign the cluster API
endpoint address from the LCM or external network. For details on cluster
network types, refer to MOSK cluster networking.
To add an address allocation scope for API endpoints, create a subnet in the
corresponding namespace with a reference to the target cluster using the
cluster.sigs.k8s.io/cluster-name label. For example:
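The sketch below assumes a single /32 address used as the cluster API endpoint; the object name and address are illustrative.

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: test-cluster-api-lb          # illustrative name
  namespace: mosk-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    ipam/SVC-LBhost: "1"             # makes baremetal-provider use this subnet for API endpoint allocation
spec:
  cidr: 10.11.0.110/32               # illustrative address
  useWholeCidr: true                 # allows allocating the single address defined by the /32 CIDR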
Optional. Add a subnet(s) for the storage access network.
Set the ipam/SVC-ceph-public label with the value "1" to create
a subnet that will be used to configure the Ceph public network.
Set the cluster.sigs.k8s.io/cluster-name label to the name of the
target cluster during the subnet creation.
Use this subnet in the L2 template for all cluster nodes except Kubernetes
manager nodes.
Assign this subnet to the interface connected to your Storage access
network.
Ceph will automatically use this subnet for its external connections.
A Ceph OSD will look for and bind to an address from this subnet
when it is started on a machine.
Optional. Add a subnet(s) for the storage replication network.
Set the ipam/SVC-ceph-cluster label with the value "1" to create
a subnet that will be used to configure the Ceph cluster network.
Set the cluster.sigs.k8s.io/cluster-name label to the name
of the target cluster during the subnet creation.
Use this subnet in the L2 template for storage nodes.
Assign this subnet to the interface connected to your storage replication
network.
Ceph will automatically use this subnet for its internal replication
traffic.
Optional. Add a subnet for the Kubernetes Pods traffic.
Use this subnet in the L2 template for all nodes in the cluster.
Assign this subnet to the interface connected to your Kubernetes
workloads network.
Use the npTemplate.bridges.k8s-pods bridge name in the L2 template.
This bridge name is reserved for the Kubernetes workloads network.
When the k8s-pods bridge is defined in an L2 template,
Calico CNI uses that network for routing the Pods traffic between nodes.
Optional. Add a subnet for the MOSK overlay network:
Use this subnet in the L2 template for the compute and gateway
(controller) nodes in the MOSK cluster.
Assign this subnet to the interface connected to your
MOSK overlay network.
This network is used to provide isolated and secure tenant networks with the
help of tunneling mechanisms (VLAN/GRE/VXLAN). If VXLAN or GRE
encapsulation is used, the IP address assignment is required on
interfaces at the node level. On the Tungsten Fabric deployments,
this network is used for MPLS over UDP+GRE traffic.
Optional. Add a subnet for the MOSK live migration
network:
Use this subnet in the L2 template for compute nodes in the
MOSK cluster.
Assign this subnet to the interface connected to your
MOSK live migration network.
This subnet is used by the Compute service (OpenStack Nova) to transfer
data during live migration. Depending on the cloud needs, you can place
it on a dedicated physical network not to affect other networks during
live migration. The IP address assignment is required on interfaces at the
node level.
Contains a short state description and a more detailed one if
applicable. The short status values are as follows:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messages (since MOSK 23.1)
Contains error or warning messages if the object state is ERR.
For example, ERR: Wrong includeRange for CIDR….
statusMessage
Deprecated since MOSK 23.1 and will be removed in
one of the following releases in favor of state and messages.
Since MOSK 23.2, this field is not set for the
objects of newly created clusters.
cidr
Reflects the actual CIDR, has the same meaning as spec.cidr.
gateway
Reflects the actual gateway, has the same meaning as spec.gateway.
nameservers
Reflects the actual name servers, has the same meaning as spec.nameservers.
ranges
Specifies the address ranges that are calculated using the following fields from
spec: cidr, includeRanges, excludeRanges, gateway, useWholeCidr.
These ranges are directly used for node IP allocation.
allocatable
Includes the number of currently available IP addresses that can be allocated
for nodes from the subnet.
allocatedIPs
Specifies the list of IPv4 addresses with the corresponding IPaddr object IDs
that were already allocated from the subnet.
capacity
Contains the total number of IP addresses held by ranges, which equals
the sum of the allocatable and allocatedIPs values.
objCreated
Date, time, and IPAM version of the Subnet CR creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the Subnet CR.
objUpdated
Date, time, and IPAM version of the last Subnet CR update
by kaas-ipam.
Example of a successfully created subnet:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  labels:
    ipam/UID: 6039758f-23ee-40ba-8c0f-61c01b0ac863
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    ipam/SVC-k8s-lcm: "1"
  name: kaas-mgmt
  namespace: default
spec:
  cidr: 10.0.0.0/24
  excludeRanges:
  - 10.0.0.100
  - 10.0.0.101-10.0.0.120
  gateway: 10.0.0.1
  includeRanges:
  - 10.0.0.50-10.0.0.90
  nameservers:
  - 172.18.176.6
status:
  allocatable: 38
  allocatedIPs:
  - 10.0.0.50:0b50774f-ffed-11ea-84c7-0242c0a85b02
  - 10.0.0.51:1422e651-ffed-11ea-84c7-0242c0a85b02
  - 10.0.0.52:1d19912c-ffed-11ea-84c7-0242c0a85b02
  capacity: 41
  cidr: 10.0.0.0/24
  gateway: 10.0.0.1
  objCreated: 2021-10-21T19:09:32Z by v5.1.0-20210930-121522-f5b2af8
  objStatusUpdated: 2021-10-21T19:14:18.748114886Z by v5.1.0-20210930-121522-f5b2af8
  objUpdated: 2021-10-21T19:09:32.606968024Z by v5.1.0-20210930-121522-f5b2af8
  nameservers:
  - 172.18.176.6
  ranges:
  - 10.0.0.50-10.0.0.90
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
According to the MOSK reference architecture,
you should create the following subnets.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
The addresses from this subnet are not allocated to interfaces, but used
as a MetalLB address pool to expose MOSK API endpoints
as Kubernetes cluster services.
The addresses from this subnet are assigned to interfaces connected
to the Kubernetes workloads network and used by Calico CNI as underlay
for traffic between the pods in the Kubernetes cluster.
The network is used by the Compute service (OpenStack Nova) to transfer data
during live migration. Depending on the cloud needs, you can place it on a
dedicated physical network not to affect other networks during live migration.
When planning your installation in advance, you need to prepare a set
of subnets and L2 templates for every rack in your cluster. For details, see
Multi-rack architecture.
Using the Subnet object examples for a multi-rack cluster that are
described in the following sections, create subnets for the target cluster.
Note
Subnet labels such as rack-x-lcm, rack-api-lcm, and so on are
optional. You can use them in L2 templates to select Subnet objects by
label.
Note
Before the Cluster release 16.1.0, the Subnet object contains
the kaas.mirantis.com/region label that specifies the region
where the Subnet object will be applied.
Configure DHCP relay agents on the edges of the broadcast domains in the
provisioning network, as needed.
Make sure to assign the IP address ranges you want to allocate
to the hosts using DHCP for discovery and inspection. Create subnets
using these IP parameters. Specify the IP address of your DHCP relay
as the default gateway in the corresponding Subnet object.
Caution
Support of multiple DHCP ranges has the following limitations:
Using custom DNS server addresses for servers that boot over PXE
is not supported.
The Subnet objects for DHCP ranges cannot be associated with any
specific cluster, as the DHCP server configuration is only applicable to
the management cluster where the DHCP server is running.
The cluster.sigs.k8s.io/cluster-name label will be ignored.
Example mos-racks-dhcp-subnets.yaml
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: rack-1-dhcp
  namespace: default
  labels:
    ipam/SVC-dhcp-range: "1"
    kaas.mirantis.com/provider: baremetal
spec:
  cidr: 10.20.101.0/24
  gateway: 10.20.101.1
  includeRanges:
  - 10.20.101.16-10.20.101.127
---
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: rack-2-dhcp
  namespace: default
  labels:
    ipam/SVC-dhcp-range: "1"
    kaas.mirantis.com/provider: baremetal
spec:
  cidr: 10.20.102.0/24
  gateway: 10.20.102.1
  includeRanges:
  - 10.20.102.16-10.20.102.127
---
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: rack-3-dhcp
  namespace: default
  labels:
    ipam/SVC-dhcp-range: "1"
    kaas.mirantis.com/provider: baremetal
spec:
  cidr: 10.20.103.0/24
  gateway: 10.20.103.1
  includeRanges:
  - 10.20.103.16-10.20.103.127
---
# Add more Subnet object templates as required using the above example
# (one subnet per rack)
This is the IP address space that Container Cloud uses to ensure communication
between the LCM agents and the management API. These addresses are also used
by Kubernetes nodes for communication. The addresses from the subnets
are assigned to all MOSK cluster nodes.
Example mosk-racks-lcm-subnets.yaml
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-lcmnamespace:mosk-namespace-namelabels:ipam/SVC-k8s-lcm:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-lcm:"true"spec:cidr:10.20.111.0/24gateway:10.20.111.1includeRanges:-10.20.111.16-10.20.111.255nameservers:-8.8.8.8---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-lcmnamespace:mosk-namespace-namelabels:ipam/SVC-k8s-lcm:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-lcm:"true"spec:cidr:10.20.112.0/24gateway:10.20.112.1includeRanges:-10.20.112.16-10.20.112.255nameservers:-8.8.8.8---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-lcmnamespace:mosk-namespace-namelabels:ipam/SVC-k8s-lcm:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-lcm:"true"spec:cidr:10.20.113.0/24gateway:10.20.113.1includeRanges:-10.20.113.16-10.20.113.255nameservers:-8.8.8.8---# Add more subnet object templates as required using the above example# (one subnet per rack)
If BGP announcement is configured for the MOSK cluster API LB address, the
API/LCM network is not required. Announcement of the cluster API LB address
is done using the LCM network.
If you configure ARP announcement of the load-balancer IP address for the
MOSK cluster API, the API/LCM network must be configured on the Kubernetes
manager nodes of the cluster. This network contains the Kubernetes API
endpoint with the VRRP virtual IP address.
This is the IP address space that Container Cloud uses to ensure communication
between the LCM agents and the management API. These addresses are also used by
Kubernetes nodes for communication. The addresses from the subnet are assigned
to all Kubernetes manager nodes of the MOSK cluster.
If you configure BGP announcement for IP addresses of load-balanced services
of a MOSK cluster, the external network can consist of multiple VLAN segments
connected to all nodes of a MOSK cluster where MetalLB speaker components are
configured to announce IP addresses for Kubernetes load-balanced services.
Mirantis recommends that you use OpenStack controller nodes for this purpose.
If you configure ARP announcement for IP addresses of load-balanced services
of a MOSK cluster, the external network must consist of a single VLAN
stretched to the ToR switches of all the racks where MOSK nodes connected to
the external network are located. Those are the nodes where MetalLB speaker
components are configured to announce IP addresses for Kubernetes load-balanced
services. Mirantis recommends that you use OpenStack controller nodes for this
purpose.
The subnets are used to assign addresses to the external interfaces of the
MOSK controller nodes and will be used to assign the default
gateway to these hosts. The default gateway for other hosts of the
MOSK cluster is assigned using the LCM and optionally
API/LCM subnets.
Example mosk-racks-external-subnets.yaml
Example of a subnet where a single VLAN segment is stretched to all
MOSK controller nodes:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: k8s-external
  namespace: mosk-namespace-name
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: mosk-cluster-name
    k8s-external: "true"
spec:
  cidr: 10.20.120.0/24
  gateway: 10.20.120.1  # This will be the default gateway on hosts
  includeRanges:
  - 10.20.120.16-10.20.120.20
  nameservers:
  - 8.8.8.8
Example of subnets where separate VLAN segments per rack are used:
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-k8s-extnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-k8s-ext:truespec:cidr:10.20.121.0/24gateway:10.20.121.1# This will be the default gateway on hostsincludeRanges:-10.20.121.16-10.20.121.20nameservers:-8.8.8.8---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-k8s-extnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-k8s-ext:truespec:cidr:10.20.122.0/24gateway:10.20.122.1# This will be the default gateway on hostsincludeRanges:-10.20.122.16-10.20.122.20nameservers:-8.8.8.8---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-k8s-extnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-k8s-ext:truespec:cidr:10.20.123.0/24gateway:10.20.123.1# This will be the default gateway on hostsincludeRanges:-10.20.123.16-10.20.123.20nameservers:-8.8.8.8
This network may have per-rack VLANs and IP subnets. The addresses from the
subnets are assigned to all MOSK cluster nodes except
Kubernetes manager nodes.
Example mosk-racks-ceph-public-subnets.yaml
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-ceph-publicnamespace:mosk-namespace-namelabels:ipam/SVC-ceph-public:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-ceph-public:truespec:cidr:10.20.131.0/24gateway:10.20.131.1includeRanges:-10.20.131.16-10.20.131.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-ceph-publicnamespace:mosk-namespace-namelabels:ipam/SVC-ceph-public:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-ceph-public:truespec:cidr:10.20.132.0/24gateway:10.20.132.1includeRanges:-10.20.132.16-10.20.132.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-ceph-publicnamespace:mosk-namespace-namelabels:ipam/SVC-ceph-public:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-ceph-public:truespec:cidr:10.20.133.0/24gateway:10.20.133.1includeRanges:-10.20.133.16-10.20.133.255---# Add more Subnet object templates as required using the above example# (one subnet per rack)
This network may have per-rack VLANs and IP subnets. The addresses from the
subnets are assigned to storage nodes in the MOSK cluster.
Example mosk-racks-ceph-cluster-subnets.yaml
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-ceph-clusternamespace:mosk-namespace-namelabels:ipam/SVC-ceph-cluster:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-ceph-cluster:truespec:cidr:10.20.141.0/24gateway:10.20.141.1includeRanges:-10.20.141.16-10.20.141.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-ceph-clusternamespace:mosk-namespace-namelabels:ipam/SVC-ceph-cluster:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-ceph-cluster:truespec:cidr:10.20.142.0/24gateway:10.20.142.1includeRanges:-10.20.142.16-10.20.142.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-ceph-clusternamespace:mosk-namespace-namelabels:ipam/SVC-ceph-cluster:"1"kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-ceph-cluster:truespec:cidr:10.20.143.0/24gateway:10.20.143.1includeRanges:-10.20.143.16-10.20.143.255---# Add more Subnet object templates as required using the above example# (one subnet per rack)
This network may include multiple per-rack VLANs and IP subnets. The addresses
from the subnets are assigned to all MOSK cluster nodes.
For details, see Network types.
Example mosk-racks-k8s-pods.yaml
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-k8s-podsnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-k8s-pods:truespec:cidr:10.20.151.0/24gateway:10.20.151.1includeRanges:-10.20.151.16-10.20.151.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-k8s-podsnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-k8s-pods:truespec:cidr:10.20.152.0/24gateway:10.20.152.1includeRanges:-10.20.152.16-10.20.152.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-k8s-podsnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-k8s-pods:truespec:cidr:10.20.153.0/24gateway:10.20.153.1includeRanges:-10.20.153.16-10.20.153.255---# Add more Subnet object templates as required using the above example# (one subnet per rack)
The underlay network for VXLAN tunnels for the MOSK tenants
traffic. If deployed with Tungsten Fabric, it is used for MPLS over UDP+GRE
traffic.
Example mosk-racks-tenant-tunnel.yaml
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-tenant-tunnelnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-tenant-tunnel:truespec:cidr:10.20.161.0/24gateway:10.20.161.1includeRanges:-10.20.161.16-10.20.161.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-tenant-tunnelnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-tenant-tunnel:truespec:cidr:10.20.162.0/24gateway:10.20.162.1includeRanges:-10.20.162.16-10.20.162.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-tenant-tunnelnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-tenant-tunnel:truespec:cidr:10.20.163.0/24gateway:10.20.163.1includeRanges:-10.20.163.16-10.20.163.255---# Add more Subnet object templates as required using the above example# (one subnet per rack)
The network is used by the Compute service (OpenStack Nova) to transfer data
during live migration. Depending on the cloud needs, it can be placed on a
dedicated physical network not to affect other networks during live migration.
Example mosk-racks-live-migration.yaml
apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-1-live-migrationnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-1-live-migration:truespec:cidr:10.20.171.0/24gateway:10.20.171.1includeRanges:-10.20.171.16-10.20.171.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-2-live-migrationnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-2-live-migration:truespec:cidr:10.20.172.0/24gateway:10.20.172.1includeRanges:-10.20.172.16-10.20.172.255---apiVersion:ipam.mirantis.com/v1alpha1kind:Subnetmetadata:name:rack-3-live-migrationnamespace:mosk-namespace-namelabels:kaas.mirantis.com/provider:baremetalcluster.sigs.k8s.io/cluster-name:mosk-cluster-namerack-3-live-migration:truespec:cidr:10.20.173.0/24gateway:10.20.173.1includeRanges:-10.20.173.16-10.20.173.255---# Add more Subnet object templates as required using the above example# (one subnet per rack)
After you create subnets for the MOSK cluster
as described in Create subnets, follow the procedure below to create L2
templates for different types of OpenStack nodes in the cluster.
See the following subsections for templates that implement the
MOSK Reference Architecture: Networking.
You may adjust the templates according to the requirements of
your architecture using the last two subsections of this section.
They explain mandatory parameters of the templates and supported
configuration options.
After you create subnets for one or more MOSK clusters or
projects as described in Create subnets, follow the procedure below to
create L2 templates for a MOSK cluster.
L2 templates are used directly during provisioning. This way, a hardware node
obtains and applies a complete network configuration during the first system
boot.
Create L2 templates before adding any machines to your new
MOSK cluster.
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl is installed.
Note
The management cluster kubeconfig is created during the last
stage of the management cluster bootstrap.
Create a set of L2Template YAML files specific to your deployment using
exemplary templates provided in Create L2 templates.
Note
You can create several L2 templates with different configurations
to be applied to different nodes of the same cluster. See
Assign L2 templates to machines for details.
Add or edit the mandatory labels and parameters in the new L2 template.
For description of mandatory labels and parameters, see
Container Cloud API Reference: L2Template.
Optional. To designate an L2 template as default, assign the
ipam/DefaultForCluster label to it. Only one L2 template in a cluster
can have this label. It will be used for machines that do not have an L2
template explicitly assigned to them.
Note
You may skip this step and add the default label along with
other custom labels using the Container Cloud web UI, as described below
in this procedure.
To assign the default template to the cluster:
Since MCC 2.25.0 (17.0.0 and 16.0.0)
Use the mandatory cluster.sigs.k8s.io/cluster-name label in the L2
template metadata section.
Before MCC 2.25.0 (15.x, 14.x, or earlier)
Use the cluster.sigs.k8s.io/cluster-name label or the clusterRef
parameter in the L2 template spec section. During cluster update to
2.25.0, this deprecated parameter is automatically migrated to the
cluster.sigs.k8s.io/cluster-name label.
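For illustration, the metadata of an L2 template that is designated as default
for a cluster could look as follows. The object and cluster names are
placeholders, and the "1" value for ipam/DefaultForCluster follows the same
convention as the other ipam/ labels in this guide:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: default-l2template           # placeholder name
  namespace: mosk-namespace-name     # placeholder project (namespace)
  labels:
    ipam/DefaultForCluster: "1"
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: mosk-cluster-name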
Optional. Add custom labels to the L2 template. You can refer to these
labels to assign the L2 template to machines.
Add the L2 template to your management cluster. Select one of the following
options:
Using the Container Cloud web UI
Since MCC 2.26.0 (17.1.0 and 16.1.0)
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
Caution
Do not create a MOSK cluster in the default project
(Kubernetes namespace), which is dedicated for the management cluster only.
If no projects are defined, first create a new mosk project as described
in Create a project for MOSK clusters.
In the left sidebar, navigate to Networks and click
the L2 Templates tab.
Click Create L2 Template.
Fill out the Create L2 Template form as required:
Name
L2 template name.
Cluster
Cluster name that the L2 template is being added for. To set the
L2 template as default for all machines, also select
Set default for the cluster.
Specification
L2 specification in the YAML format that you have previously
created. Click Edit to edit the L2 template if
required.
Note
Before Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0), the field name is YAML file, and you
can upload the required YAML file instead of inserting and
editing it.
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
Proceed with Add a machine. The resulting L2 template will be used to
render the netplan configuration for the MOSK cluster
machines.
Workflow of the netplan configuration using an L2 template
The kaas-ipam service uses the data from BareMetalHost,
L2Template, and Subnet objects to generate the netplan
configuration for every cluster machine.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
The generated netplan configuration is saved in the
status.netconfigFiles section of the IpamHost object. If the
status.netconfigFilesState field of the IpamHost object is OK,
the configuration was rendered in the IpamHost object successfully.
Otherwise, the status contains an error message.
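For example, you can review the rendered configuration and its status fields by
inspecting the IpamHost object with kubectl; the namespace and object name below
are placeholders:
kubectl -n mosk-namespace-name get ipamhost example-host -o yaml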
Caution
The following fields of the ipamHost status are renamed since
MOSK 23.1 in the scope of the L2Template and IpamHost objects
refactoring:
netconfigV2 to netconfigCandidate
netconfigV2state to netconfigCandidateState
netconfigFilesState to netconfigFilesStates (per file)
No user actions are required after renaming.
The format of netconfigFilesState changed after renaming. The
netconfigFilesStates field contains a dictionary of statuses of network
configuration files stored in netconfigFiles. The dictionary contains
the keys that are file paths and values that have the same meaning for each
file that netconfigFilesState had:
For a successfully rendered configuration file:
OK:<timestamp><sha256-hash-of-rendered-file>, where a timestamp
is in the RFC 3339 format.
For a failed rendering: ERR:<error-message>.
The baremetal-provider service copies data from
status.netconfigFiles of the IpamHost object to the
Spec.StateItemsOverwrites['deploy']['bm_ipam_netconfigv2'] parameter
of LCMMachine.
The lcm-agent service on every host synchronizes the LCMMachine
data to its host. The lcm-agent service runs a playbook to update the
netplan configuration on the host during the pre-download and deploy
phases.
Create L2 templates for a multi-rack MOSK cluster
For a multi-rack MOSK cluster, you need to create
one L2 template for each type of server in each rack. This may result
in a large number of L2 templates in your configuration.
For example, if you have a three-rack deployment of MOSK
with 4 types of nodes evenly distributed across three racks, you have to create
at least the following L2 templates:
rack-1-k8s-manager, rack-2-k8s-manager, rack-3-k8s-manager
for Kubernetes control plane nodes, unless you use the compact control
plane option.
rack-1-mosk-control, rack-2-mosk-control, rack-3-mosk-control
for OpenStack controller nodes in each rack.
rack-1-mosk-compute, rack-2-mosk-compute, rack-3-mosk-compute
for OpenStack compute nodes in each rack.
rack-1-mosk-storage, rack-2-mosk-storage, rack-3-mosk-storage
for OpenStack storage nodes in each rack.
In total, twelve L2 templates are required for this relatively simple cluster.
In the following sections, the examples cover only one rack, but can be
easily expanded to more racks.
Note
Three servers are required for Kubernetes control plane and for the
OpenStack control plane. So, you might not need more L2 templates for these
roles when expanding beyond three racks.
Create an L2 template for a Kubernetes manager node
Caution
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
To create L2 templates for Kubernetes manager nodes:
Create or open the mosk-l2templates.yml file that contains
the L2 templates you are preparing.
Add L2 templates using the following example. Adjust the values of
specific parameters according to the specifications of your environment,
specifically the name of your project (namespace) and cluster,
the IP address ranges and networks, and the subnet names.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
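The following is a simplified sketch of such an L2 template for a Kubernetes
manager node in rack 1, not the full product example: it covers only the
provisioning NIC and the LCM bridge, assumes the mgmt-lcm and rack-1-lcm subnets
created earlier, and omits the external and Kubernetes workloads networks that a
manager node typically also uses. All names are placeholders.
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: rack-1-k8s-manager           # placeholder name
  namespace: mosk-namespace-name     # placeholder project (namespace)
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: mosk-cluster-name
    rack-1-k8s-manager: "true"       # optional custom label for machine assignment
spec:
  autoIfMappingPrio:
  - provision
  - eno
  - ens
  - enp
  l3Layout:
  - subnetName: mgmt-lcm
    scope: global
  - subnetName: rack-1-lcm
    scope: namespace
  npTemplate: |-
    version: 2
    ethernets:
      {{nic 0}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}
    bridges:
      k8s-lcm:
        interfaces: [{{nic 0}}]
        addresses:
        - {{ ip "k8s-lcm:rack-1-lcm" }}
        gateway4: {{ gateway_from_subnet "rack-1-lcm" }}
        nameservers:
          addresses: {{ nameservers_from_subnet "rack-1-lcm" }}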
To create L2 templates for other racks, change the rack
identifier in the names and labels above.
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
According to the reference architecture, MOSK controller
nodes must be connected to the following networks:
PXE network
LCM network
Kubernetes workloads network
Storage access network (if deploying with Ceph as a backend for ephemeral
storage)
Floating IP and provider networks. Not required for deployment with
Tungsten Fabric.
Tenant underlay networks, if deploying with VXLAN networking or with
Tungsten Fabric. In the latter case, the BGP service is configured over
this network.
To create L2 templates for MOSK controller nodes:
Create or open the mosk-l2template.yml file that contains
the L2 templates.
Add L2 templates using the following example. Adjust the values of
specific parameters according to the specification of your environment.
Example of an L2 template for a MOSK
controller node
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    cluster.sigs.k8s.io/cluster-name: mosk-cluster-name
    rack1-mosk-controller: "true"
  name: rack1-mosk-controller
  namespace: mosk-namespace-name
spec:
  autoIfMappingPrio:
  - provision
  - eno
  - ens
  - enp
  l3Layout:
  - subnetName: mgmt-lcm
    scope: global
  - subnetName: rack1-k8s-lcm
    scope: namespace
  - subnetName: k8s-external
    scope: namespace
  - subnetName: rack1-k8s-pods
    scope: namespace
  - subnetName: rack1-ceph-public
    scope: namespace
  - subnetName: rack1-tenant-tunnel
    scope: namespace
  npTemplate: |-
    version: 2
    ethernets:
      {{nic 0}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}
        mtu: 9000
      {{nic 1}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 1}}
        set-name: {{nic 1}}
        mtu: 9000
      {{nic 2}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 2}}
        set-name: {{nic 2}}
        mtu: 9000
      {{nic 3}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 3}}
        set-name: {{nic 3}}
        mtu: 9000
    bonds:
      bond0:
        mtu: 9000
        parameters:
          mode: 802.3ad
          mii-monitor-interval: 100
        interfaces:
        - {{nic 0}}
        - {{nic 1}}
      bond1:
        mtu: 9000
        parameters:
          mode: 802.3ad
          mii-monitor-interval: 100
        interfaces:
        - {{nic 2}}
        - {{nic 3}}
    vlans:
      k8s-lcm-v:
        id: 403
        link: bond0
        mtu: 9000
      k8s-ext-v:
        id: 409
        link: bond0
        mtu: 9000
      k8s-pods-v:
        id: 408
        link: bond0
        mtu: 9000
      pr-floating:
        id: 407
        link: bond1
        mtu: 9000
      stor-frontend:
        id: 404
        link: bond0
        addresses:
        - {{ip "stor-frontend:rack1-ceph-public"}}
        mtu: 9000
        routes:
        - to: 10.199.16.0/22 # aggregated address space for Ceph public network
          via: {{ gateway_from_subnet "rack1-ceph-public" }}
      tenant-tunnel:
        id: 406
        link: bond1
        addresses:
        - {{ip "tenant-tunnel:rack1-tenant-tunnel"}}
        mtu: 9000
        routes:
        - to: 10.195.0.0/22 # aggregated address space for tenant networks
          via: {{ gateway_from_subnet "rack1-tenant-tunnel" }}
    bridges:
      k8s-lcm:
        interfaces: [k8s-lcm-v]
        addresses:
        - {{ ip "k8s-lcm:rack1-k8s-lcm" }}
        nameservers:
          addresses: {{nameservers_from_subnet "rack1-k8s-lcm"}}
        routes:
        - to: 10.197.0.0/21 # aggregated address space for LCM and API/LCM networks
          via: {{ gateway_from_subnet "rack1-k8s-lcm" }}
        - to: {{ cidr_from_subnet "mgmt-lcm" }}
          via: {{ gateway_from_subnet "rack1-k8s-lcm" }}
      k8s-ext:
        interfaces: [k8s-ext-v]
        addresses:
        - {{ip "k8s-ext:k8s-external"}}
        nameservers:
          addresses: {{nameservers_from_subnet "k8s-external"}}
        gateway4: {{ gateway_from_subnet "k8s-external" }}
        mtu: 9000
      k8s-pods:
        interfaces: [k8s-pods-v]
        addresses:
        - {{ip "k8s-pods:rack1-k8s-pods"}}
        mtu: 9000
        routes:
        - to: 10.199.0.0/22 # aggregated address space for Kubernetes workloads
          via: {{ gateway_from_subnet "rack1-k8s-pods" }}
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
Caution
If you plan to deploy a MOSK cluster with
the compact control plane option and configure ARP announcement of the
load-balancer IP address for the MOSK cluster API, the
API/LCM network will be used for MOSK controller nodes.
Therefore, change the rack1-k8s-lcm subnet to the api-lcm one in
the corresponding L2Template object:
spec:
  ...
  l3Layout:
    ...
    - subnetName: api-lcm
      scope: namespace
    ...
  npTemplate: |-
    ...
    bridges:
      k8s-lcm:
        interfaces: [k8s-lcm-v]
        addresses:
        - {{ ip "k8s-lcm:api-lcm" }}
        nameservers:
          addresses: {{ nameservers_from_subnet "api-lcm" }}
        routes:
        - to: 10.197.0.0/21 # aggregated address space for LCM and API/LCM networks
          via: {{ gateway_from_subnet "api-lcm" }}
        - to: {{ cidr_from_subnet "mgmt-lcm" }}
          via: {{ gateway_from_subnet "api-lcm" }}
    ...
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
According to the reference architecture, MOSK compute nodes
must be connected to the following networks:
PXE network
LCM network
Kubernetes workloads network
Storage access network (if deploying with Ceph as a backend for ephemeral
storage)
Floating IP and provider networks (if deploying OpenStack with DVR)
Tenant underlay networks
To create L2 templates for MOSK compute nodes:
Add L2 templates to the mosk-l2templates.yml file using the following
example. Adjust the values of parameters according to the specification
of your environment.
Example of an L2 template for a MOSK
compute node
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
According to the reference architecture, MOSK storage nodes
in the MOSK cluster must be connected to the
following networks:
PXE network
LCM network
Kubernetes workloads network
Storage access network
Storage replication network
To create L2 templates for MOSK storage nodes:
Add L2 templates to the mosk-l2templates.yml file using the following
example. Adjust the values of parameters according to the specification
of your environment.
Example of an L2 template for a MOSK storage node
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
This section contains an exemplary L2 template that demonstrates how to
set up bonds and bridges on hosts for your managed clusters.
Caution
Use of a dedicated network for Kubernetes pods traffic,
for external connection to the Kubernetes services exposed
by the cluster, and for the Ceph cluster access and replication
traffic is available as Technology Preview. Use such
configurations for testing and evaluation purposes only.
For the Technology Preview feature definition,
refer to Technology Preview features.
Configure bonding options using the parameters field. The only
mandatory option is mode. See the example below for details.
Note
You can set any mode supported by
netplan
and your hardware.
Important
Bond monitoring is disabled in Ubuntu by default. However,
Mirantis highly recommends enabling it using Media Independent Interface
(MII) monitoring by setting the mii-monitor-interval parameter to a
non-zero value. For details, see Linux documentation: bond monitoring.
The Kubernetes LCM network connects LCM Agents running on nodes to the LCM API
of the management cluster. It is also used for communication between
kubelet and Kubernetes API server inside a Kubernetes cluster.
The MKE components use this network for communication inside a swarm
cluster.
To configure each node with an IP address that will be used for LCM traffic,
use the npTemplate.bridges.k8s-lcm bridge in the L2 template, as
demonstrated in the example below.
Each node of every cluster must have one and only one IP address in the LCM
network that is allocated from one of the Subnet
objects having the ipam/SVC-k8s-lcm label defined. Therefore, all
Subnet objects used for LCM networks must have the ipam/SVC-k8s-lcm
label defined.
You can use any interface name for the LCM network traffic.
The Subnet objects for the LCM network must have the
ipam/SVC-k8s-lcm label. For details, see Service labels and their life cycle.
Dedicated network for the Kubernetes pods traffic
If you want to use a dedicated network for Kubernetes pods traffic,
configure each node with an IPv4
address that will be used to route the pods traffic between nodes.
To accomplish that, use the npTemplate.bridges.k8s-pods bridge
in the L2 template, as demonstrated in the example below.
As defined in Container Cloud Reference Architecture: Host networking,
this bridge name is reserved for the Kubernetes pods network. When the
k8s-pods bridge is defined in an L2 template, Calico CNI uses that network
for routing the pods traffic between nodes.
Dedicated network for the Kubernetes services traffic (MetalLB)
You can use a dedicated network for external connection to the Kubernetes
services exposed by the cluster.
If enabled, MetalLB will listen and respond on the dedicated virtual bridge.
To accomplish that, configure each node where metallb-speaker is deployed
with an IPv4 address. For details on selecting nodes for metallb-speaker,
see Configure the MetalLB speaker node selector. Both the MetalLB IP address ranges and the
IP addresses configured on those nodes must fit in the same CIDR.
The default route on the MOSK nodes that are connected to
the external network must be configured with the default gateway in the
external network.
Caution
The IP address ranges of the corresponding subnet used in
L2Template for the dedicated virtual bridge must be excluded from the
MetalLB address ranges.
Dedicated networks for the Ceph distributed storage traffic
You can configure dedicated networks for the Ceph cluster access and
replication traffic. Set labels on the Subnet CRs for the corresponding
networks, as described in Create subnets.
Container Cloud automatically configures Ceph to use the addresses from these
subnets. Ensure that the addresses are assigned to the storage nodes.
The Subnet objects used to assign IP addresses to these networks
must have corresponding labels ipam/SVC-ceph-public for the
Ceph public (storage access) network and ipam/SVC-ceph-cluster for the
Ceph cluster (storage replication) network.
Example of an L2 template with interfaces bonding
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
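Since the full example is maintained in the product documentation, the snippet
below is only a sketch of the npTemplate part that such a template typically
contains: two NICs aggregated into an 802.3ad bond with MII monitoring enabled
and an LCM bridge on top. It assumes an lcm-subnet Subnet object; all names are
placeholders.
  npTemplate: |-
    version: 2
    ethernets:
      {{nic 0}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}
      {{nic 1}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 1}}
        set-name: {{nic 1}}
    bonds:
      bond0:
        parameters:
          mode: 802.3ad              # the only mandatory bonding option
          mii-monitor-interval: 100  # enable MII bond monitoring
        interfaces:
        - {{nic 0}}
        - {{nic 1}}
    bridges:
      k8s-lcm:
        interfaces: [bond0]
        addresses:
        - {{ ip "k8s-lcm:lcm-subnet" }}
        gateway4: {{ gateway_from_subnet "lcm-subnet" }}
        nameservers:
          addresses: {{ nameservers_from_subnet "lcm-subnet" }}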
L2 template example for automatic multiple subnet creation
This section contains an exemplary L2 template for automatic multiple
subnet creation as described in Automate multiple subnet creation using SubnetPool. This template
also contains the L3Layout section that allows defining the Subnet
scopes and enables auto-creation of the Subnet objects from the
SubnetPool objects. For details about auto-creation of the Subnet
objects see Automate multiple subnet creation using SubnetPool.
Do not assign an IP address to the PXE nic0 NIC explicitly
to prevent the IP duplication during updates. The IP address is
automatically assigned by the bootstrapping engine.
Example of an L2 template for multiple subnets:
apiVersion:ipam.mirantis.com/v1alpha1kind:L2Templatemetadata:name:test-managednamespace:managed-nslabels:kaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-onecluster.sigs.k8s.io/cluster-name:my-clusterspec:autoIfMappingPrio:-provision-eno-ens-enpl3Layout:-subnetName:lcm-subnetscope:namespace-subnetName:subnet-1subnetPool:kaas-mgmtscope:namespace-subnetName:subnet-2subnetPool:kaas-mgmtscope:clusternpTemplate:|version: 2ethernets:onboard1gbe0:dhcp4: falsedhcp6: falsematch:macaddress: {{mac 0}}set-name: {{nic 0}}# IMPORTANT: do not assign an IP address here explicitly# to prevent IP duplication issues. The IP will be assigned# automatically by the bootstrapping engine.# addresses: []onboard1gbe1:dhcp4: falsedhcp6: falsematch:macaddress: {{mac 1}}set-name: {{nic 1}}ten10gbe0s0:dhcp4: falsedhcp6: falsematch:macaddress: {{mac 2}}set-name: {{nic 2}}addresses:- {{ip "2:subnet-1"}}ten10gbe0s1:dhcp4: falsedhcp6: falsematch:macaddress: {{mac 3}}set-name: {{nic 3}}addresses:- {{ip "3:subnet-2"}}bridges:k8s-lcm:interfaces: [onboard1gbe0]addresses:- {{ip "k8s-lcm:lcm-subnet"}}gateway4: {{gateway_from_subnet "lcm-subnet"}}nameservers:addresses: {{nameservers_from_subnet "lcm-subnet"}}
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
In the template above, the following networks are defined
in the l3Layout section:
lcm-subnet - the subnet name to use for the LCM network in
npTemplate. This subnet is shared between the project clusters because
it has the namespaced scope.
Since a subnet pool is not in use, manually create the corresponding Subnet
object before machines are attached to the cluster. For details, see
Create subnets for a managed cluster using CLI.
Mark this Subnet with the ipam/SVC-k8s-lcm label.
The L2 template must contain the definition of the virtual Linux bridge
(k8s-lcm in the L2 template example) that is used to set up the LCM
network interface. IP addresses for the defined bridge must be assigned
from the LCM subnet, which is marked with the ipam/SVC-k8s-lcm label.
Each node of every cluster must have one and only one IP address in the LCM
network that is allocated from one of the Subnet
objects having the ipam/SVC-k8s-lcm label defined. Therefore, all
Subnet objects used for LCM networks must have the ipam/SVC-k8s-lcm
label defined.
You can use any interface name for the LCM network traffic.
The Subnet objects for the LCM network must have the
ipam/SVC-k8s-lcm label. For details, see Service labels and their life cycle.
subnet-1 - unless already created, this subnet will be created
from the kaas-mgmt subnet pool. The subnet name must be unique within
the project. This subnet is shared between the project clusters.
subnet-2 - will be created from the kaas-mgmt subnet pool.
This subnet has the cluster scope. Therefore, the real name of the
Subnet CR object consists of the subnet name defined in l3Layout
and the cluster UID.
But the npTemplate section of the L2 template must contain only
the subnet name defined in l3Layout.
The subnets of the cluster scope are not shared between clusters.
Caution
Using the l3Layout section, define all subnets that are used
in the npTemplate section.
Defining only a subset of the subnets is not allowed.
If labelSelector is used in l3Layout, use any custom
label name that differs from system names. This allows for easier
cluster scaling in case of adding new subnets as described in
Expand IP addresses capacity in an existing cluster.
Mirantis recommends using a unique label prefix such as
user-defined/.
Example of a complete template configuration for cluster creation
The following example contains all required objects of an advanced network
and host configuration for a managed cluster.
The procedure below contains:
Various .yaml objects to be applied with a managed cluster
kubeconfig
Useful comments inside the .yaml example files
Example hardware and configuration data, such as network, disk, and
auth, that must be updated to fit your cluster configuration
Example templates, such as l2template and baremetalhostprofile,
that illustrate how to implement a specific configuration
Caution
The exemplary configuration described below is not production
ready and is provided for illustration purposes only.
For illustration purposes, all files provided in this exemplary procedure
are named by the Kubernetes object types:
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
Create an empty .yaml file with the namespace object:
apiVersion: v1
kind: Namespace
metadata:
  name: managed-ns
Select from the following options:
Since MCC 2.21.0 (11.5.0, 7.11.0)
Create the required number of .yaml files with the
BareMetalHostCredential objects for each bmh node with the
unique name and authentication data. The following example
contains one BareMetalHostCredential object:
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Before MCC 2.21.0 (11.4.0, 8.10.0, 7.10.0, or earlier)
Create the required number of .yaml files with the Secret
objects for each bmh node with the unique name and
authentication data. The following example contains one Secret
object:
apiVersion:kaas.mirantis.com/v1alpha1kind:BareMetalHostInventorymetadata:annotations:inspect.metal3.io/hardwaredetails-storage-sort-term:hctl ASC, wwn ASC, by_id ASC, name ASClabels:cluster.sigs.k8s.io/cluster-name:managed-cluster# we will use those label, to link machine to exact bmh nodekaas.mirantis.com/baremetalhost-id:cz7700kaas.mirantis.com/provider:baremetalname:cz7700-managed-cluster-control-noefinamespace:managed-nsspec:bmc:address:192.168.1.12bmhCredentialsName:'cz7740-cred'bootMACAddress:0c:c4:7a:34:52:04bootMode:legacyonline:true
apiVersion:metal3.io/v1alpha1kind:BareMetalHostmetadata:labels:cluster.sigs.k8s.io/cluster-name:managed-clusterhostlabel.bm.kaas.mirantis.com/controlplane:controlplane# we will use those label, to link machine to exact bmh nodekaas.mirantis.com/baremetalhost-id:cz7700kaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-oneannotations:kaas.mirantis.com/baremetalhost-credentials-name:cz7700-credname:cz7700-managed-cluster-control-noefinamespace:managed-nsspec:bmc:address:192.168.1.12# credentialsName is updated automatically during cluster deploymentcredentialsName:''bootMACAddress:0c:c4:7a:34:52:04bootMode:legacyonline:true
apiVersion:metal3.io/v1alpha1kind:BareMetalHostmetadata:labels:cluster.sigs.k8s.io/cluster-name:managed-clusterhostlabel.bm.kaas.mirantis.com/controlplane:controlplane# we will use those label, to link machine to exact bmh nodekaas.mirantis.com/baremetalhost-id:cz7700kaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-onename:cz7700-managed-cluster-control-noefinamespace:managed-nsspec:bmc:address:192.168.1.12# The secret for credentials requires the username and password# keys in the Base64 encoding.credentialsName:cz7700-credbootMACAddress:0c:c4:7a:34:52:04bootMode:legacyonline:true
apiVersion:metal3.io/v1alpha1kind:BareMetalHostProfilemetadata:labels:cluster.sigs.k8s.io/cluster-name:managed-cluster# This label indicates that this profile will be default in# namespaces, so machines w\o exact profile selecting will use# this templatekaas.mirantis.com/defaultBMHProfile:'true'kaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-onename:bmhp-cluster-defaultnamespace:managed-nsspec:devices:-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-1minSize:120Giwipe:truepartitions:-name:bios_grubpartflags:-bios_grubsize:4Miwipe:true-name:uefipartflags:-espsize:200Miwipe:true-name:config-2size:64Miwipe:true-name:lvm_dummy_partsize:1Giwipe:true-name:lvm_root_partsize:0wipe:true-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-2minSize:30Giwipe:true-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-3minSize:30Giwipe:truepartitions:-name:lvm_lvp_partsize:0wipe:true-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-4wipe:truefileSystems:-fileSystem:vfatpartition:config-2-fileSystem:vfatmountPoint:/boot/efipartition:uefi-fileSystem:ext4logicalVolume:rootmountPoint:/-fileSystem:ext4logicalVolume:lvpmountPoint:/mnt/local-volumes/grubConfig:defaultGrubOptions:-GRUB_DISABLE_RECOVERY="true"-GRUB_PRELOAD_MODULES=lvm-GRUB_TIMEOUT=30kernelParameters:modules:-content:'optionskvm_intelnested=1'filename:kvm_intel.confsysctl:# For the list of options prohibited to change, refer to# https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.htmlfs.aio-max-nr:'1048576'fs.file-max:'9223372036854775807'fs.inotify.max_user_instances:'4096'kernel.core_uses_pid:'1'kernel.dmesg_restrict:'1'net.ipv4.conf.all.rp_filter:'0'net.ipv4.conf.default.rp_filter:'0'net.ipv4.conf.k8s-ext.rp_filter:'0'net.ipv4.conf.k8s-ext.rp_filter:'0'net.ipv4.conf.m-pub.rp_filter:'0'vm.max_map_count:'262144'logicalVolumes:-name:rootsize:0vg:lvm_root-name:lvpsize:0vg:lvm_lvppostDeployScript:|#!/bin/bash -ex# used for test-debug only!echo "root:r00tme" | sudo chpasswdecho 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rulesecho $(date) 'post_deploy_script done' >> /root/post_deploy_donepreDeployScript:|#!/bin/bash -execho "$(date) pre_deploy_script done" >> /root/pre_deploy_donevolumeGroups:-devices:-partition:lvm_root_partname:lvm_root-devices:-partition:lvm_lvp_partname:lvm_lvp-devices:-partition:lvm_dummy_part# here we can create lvm, but do not mount or format it somewherename:lvm_forawesomeapp
apiVersion:metal3.io/v1alpha1kind:BareMetalHostProfilemetadata:labels:cluster.sigs.k8s.io/cluster-name:managed-clusterkaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-onename:worker-storage1namespace:managed-nsspec:devices:-device:minSize:120Giwipe:truepartitions:-name:bios_grubpartflags:-bios_grubsize:4Miwipe:true-name:uefipartflags:-espsize:200Miwipe:true-name:config-2size:64Miwipe:true# Create dummy partition w\o mounting-name:lvm_dummy_partsize:1Giwipe:true-name:lvm_root_partsize:0wipe:true-device:# Will be used for Ceph, so required to be wipedbyPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-1minSize:30Giwipe:true-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-2minSize:30Giwipe:truepartitions:-name:lvm_lvp_partsize:0wipe:true-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-3wipe:true-device:byPath:/dev/disk/by-path/pci-0000:00:1f.2-ata-4minSize:30Giwipe:truepartitions:-name:lvm_lvp_part_sdfwipe:truesize:0fileSystems:-fileSystem:vfatpartition:config-2-fileSystem:vfatmountPoint:/boot/efipartition:uefi-fileSystem:ext4logicalVolume:rootmountPoint:/-fileSystem:ext4logicalVolume:lvpmountPoint:/mnt/local-volumes/grubConfig:defaultGrubOptions:-GRUB_DISABLE_RECOVERY="true"-GRUB_PRELOAD_MODULES=lvm-GRUB_TIMEOUT=30kernelParameters:modules:-content:'optionskvm_intelnested=1'filename:kvm_intel.confsysctl:# For the list of options prohibited to change, refer to# https://docs.mirantis.com/mke/3.6/install/predeployment/set-up-kernel-default-protections.htmlfs.aio-max-nr:'1048576'fs.file-max:'9223372036854775807'fs.inotify.max_user_instances:'4096'kernel.core_uses_pid:'1'kernel.dmesg_restrict:'1'net.ipv4.conf.all.rp_filter:'0'net.ipv4.conf.default.rp_filter:'0'net.ipv4.conf.k8s-ext.rp_filter:'0'net.ipv4.conf.k8s-ext.rp_filter:'0'net.ipv4.conf.m-pub.rp_filter:'0'vm.max_map_count:'262144'logicalVolumes:-name:rootsize:0vg:lvm_root-name:lvpsize:0vg:lvm_lvppostDeployScript:|#!/bin/bash -ex# used for test-debug only! That would allow operator to logic via TTY.echo "root:r00tme" | sudo chpasswd# Just an example for enforcing "ssd" disks to be switched to use "deadline" i\o scheduler.echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/ rules.d/60-ssd-scheduler.rulesecho $(date) 'post_deploy_script done' >> /root/post_deploy_donepreDeployScript:|#!/bin/bash -execho "$(date) pre_deploy_script done" >> /root/pre_deploy_donevolumeGroups:-devices:-partition:lvm_root_partname:lvm_root-devices:-partition:lvm_lvp_part-partition:lvm_lvp_part_sdfname:lvm_lvp-devices:-partition:lvm_dummy_partname:lvm_forawesomeapp
Applies since Container Cloud 2.21.0 and 2.21.1 for
MOSK as TechPreview and since 2.24.0 as GA for
management clusters. For managed clusters, it is generally available
since Container Cloud 2.25.0.
The MetalLBConfigTemplate object is available as
Technology Preview since Container Cloud 2.24.0 (Cluster release
14.0.0) and is generally available since Container Cloud 2.25.0
(Cluster releases 17.0.0 and 16.0.0).
apiVersion:kaas.mirantis.com/v1alpha1kind:KaaSCephClustermetadata:name:ceph-cluster-managed-clusternamespace:managed-nsspec:cephClusterSpec:nodes:# Add the exact ``nodes`` names.# Obtain the name from "get bmh -o wide" ``consumer`` field.cz812-managed-cluster-storage-worker-noefi-58spl:roles:-mgr-mon# All disk configuration must be reflected in ``baremetalhostprofile``storageDevices:-config:deviceClass:ssdfullPath:/dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231434939cz813-managed-cluster-storage-worker-noefi-lr4k4:roles:-mgr-monstorageDevices:-config:deviceClass:ssdfullPath:/dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912cz814-managed-cluster-storage-worker-noefi-z2m67:roles:-mgr-monstorageDevices:-config:deviceClass:ssdfullPath:/dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231443409pools:-default:truedeviceClass:ssdname:kubernetesreplicated:size:3role:kubernetesk8sCluster:name:managed-clusternamespace:managed-ns
Note
The storageDevices[].fullPath field is available since
Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0). For the
clusters running earlier product versions, define the /dev/disk/by-id
symlinks using storageDevices[].name instead.
Obtain kubeconfig of the newly created managed cluster:
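One way to do this from the management cluster is sketched below, under the assumption that the kubeconfig is stored in a <clusterName>-kubeconfig secret with an admin.conf key in the cluster project; the secret name and key may differ in your environment, and you can also download the kubeconfig from the Container Cloud web UI:
kubectl -n <managedClusterProject> get secret <clusterName>-kubeconfig \
  -o jsonpath='{.data.admin\.conf}' | base64 -d > kubeconfig-<clusterName>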
Operators of Mirantis Container Cloud for on-demand self-service
Kubernetes deployments will want their users to create networks without
extensive knowledge about network topology or IP addresses. For
that purpose, the Operator can prepare L2 network templates in advance for
users to assign these templates to machines in their clusters.
The Operator can ensure that the users’ clusters have separate
IP address spaces using the SubnetPool resource.
SubnetPool allows for automatic creation of Subnet objects
that will consume blocks from the parent SubnetPool CIDR IP address
range. The SubnetPool blockSize setting defines the IP address
block size to allocate to each child Subnet. SubnetPool has a global
scope, so any SubnetPool can be used to create the Subnet objects
for any namespace and for any cluster.
You can use the SubnetPool resource in the L2Template resources to
automatically allocate IP addresses from an appropriate IP range that
corresponds to a specific cluster, or create a Subnet resource
if it does not exist yet. This way, every cluster will use subnets
that do not overlap with other clusters.
To automate multiple subnet creation using SubnetPool:
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl is installed.
Note
The management cluster kubeconfig is created
during the last stage of the management cluster bootstrap.
Create the subnetpool.yaml file with a number of subnet pools:
Note
You can define either or both subnets and subnet pools,
depending on the use case. A single L2 template can use
either or both subnets and subnet pools.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
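A minimal sketch of a SubnetPool definition is shown below. The API group ipam.mirantis.com/v1alpha1, the object name, and the CIDR and block size values are assumptions to adjust to your environment:
apiVersion: ipam.mirantis.com/v1alpha1
kind: SubnetPool
metadata:
  name: subnet-pool-1
  labels:
    kaas.mirantis.com/provider: baremetal
spec:
  # Parent IP address range from which child Subnet objects consume blocks
  cidr: 10.10.0.0/16
  # IP address block size allocated to each child Subnet
  blockSize: /25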
Verify that the subnet pool is successfully created:
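For example, using the pool name from the sketch above (add -n <namespace> if the object was created in a project namespace):
kubectl get subnetpool subnet-pool-1 -o yaml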
Proceed to creating an L2 template for one or multiple managed clusters
as described in Create L2 templates. In this procedure, select
the exemplary L2 template for multiple subnets.
Caution
Using the l3Layout section, define all subnets that are used
in the npTemplate section.
Defining only a subset of the subnets is not allowed.
If labelSelector is used in l3Layout, use any custom
label name that differs from system names. This allows for easier
cluster scaling in case of adding new subnets as described in
Expand IP addresses capacity in an existing cluster.
Mirantis recommends using a unique label prefix such as
user-defined/.
Log in to the host where your management cluster kubeconfig is
located and where kubectl is installed.
Create a new text file mosk-cluster-machines.yaml and create the
YAML definitions of the Machine resources. Use this as an example,
and see the descriptions of the fields below:
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Add the top level fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1.
kind
Object type that is Machine.
metadata
This section will contain the metadata of the object.
spec
This section will contain the configuration of the object.
Add mandatory fields to the metadata section of the Machine
object definition.
name
The name of the Machine object.
namespace
The name of the Project where the Machine will be created.
labels
This section contains additional metadata of the machine. Set
the following mandatory labels for the Machine object.
kaas.mirantis.com/provider
Set to "baremetal".
kaas.mirantis.com/region
Region name that matches the region name in the Cluster
object.
cluster.sigs.k8s.io/cluster-name
The name of the cluster to add the machine to.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Configure the mandatory parameters of the Machine object in
spec field. Add providerSpec field that contains parameters
for deployment on bare metal in a form of Kubernetes subresource.
In the providerSpec section, add the following mandatory configuration
parameters:
apiVersion
API version of the subresource that is baremetal.k8s.io/v1alpha1.
kind
Object type that is BareMetalMachineProviderSpec.
bareMetalHostProfile
Reference to a configuration profile of a bare metal host. It
helps to pick bare metal host with suitable configuration for
the machine. This section includes two parameters:
name
Name of a bare metal host profile
namespace
Project in which the bare metal host profile is created.
l2TemplateSelector
If specified, contains the name (first priority) or label
of the L2 template that will be applied during a machine creation.
Note that changing this field after Machine object is created
will not affect the host network configuration of the machine.
You should assign one of the templates you defined in
Create L2 templates to the machine. If there are no suitable
templates, create one as described in Create L2 templates.
hostSelector
This parameter defines matching criteria for picking a bare metal
host for the machine by label.
Any custom label that is assigned to one or more bare metal hosts
using API can be used as a host selector. If the bare metal host
objects with the specified label are missing, the Machine object
will not be deployed until at least one bare metal host with the
specified label is available.
l2TemplateIfMappingOverride
This parameter contains a list of names of network interfaces of
the host. It allows you to override the default naming and ordering
of network interfaces defined in the L2 template referenced by
the l2TemplateSelector. This ordering informs the L2
templates how to generate the host network configuration.
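A minimal sketch of a Machine definition that combines the fields described above; the object name, namespace, host profile reference, selector labels, and their values are placeholders to replace with the ones used in your cluster:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: worker-0
  namespace: managed-ns
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: managed-cluster
spec:
  providerSpec:
    value:
      apiVersion: baremetal.k8s.io/v1alpha1
      kind: BareMetalMachineProviderSpec
      bareMetalHostProfile:
        name: worker-storage1
        namespace: managed-ns
      # Name (first priority) or label of the L2 template to apply
      l2TemplateSelector:
        label: l2template-compute
      # Custom label that must be present on the target bare metal host
      hostSelector:
        matchLabels:
          kaas.mirantis.com/baremetalhost-id: host-worker-0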
Depending on the role of the machine in the MOSK cluster,
add labels to the nodeLabels field.
This field contains the list of node labels to be attached to a node for the
user to run certain components on separate cluster nodes. The list of allowed
node labels is located in the Cluster object status
providerStatus.releaseRef.current.allowedNodeLabels field.
If the value field is not defined in allowedNodeLabels, a label can
have any value. For example:
Before or after a machine deployment, add the required label from the allowed
node labels list with the corresponding value to
spec.providerSpec.value.nodeLabels in machine.yaml. For example:
nodeLabels:
- key: stacklight
  value: enabled
Adding a node label that is not available in the list of allowed node
labels is restricted.
If you are NOT deploying MOSK with the compact
control plane, add 3 dedicated Kubernetes manager nodes.
Add 3 Machine objects for Kubernetes manager nodes using
the following label:
If you are deploying MOSK with compact control plane,
add Machine objects for 3 combined control plane nodes using the
following labels and parameters to the nodeLabels field:
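A hedged sketch of the two variants follows. The label names below are assumptions to verify against the product release notes for your version:
# Dedicated Kubernetes manager node (no compact control plane)
metadata:
  labels:
    cluster.sigs.k8s.io/control-plane: "true"   # assumed manager-node label

# Combined control plane node (compact control plane)
metadata:
  labels:
    cluster.sigs.k8s.io/control-plane: "true"   # assumed manager-node label
spec:
  providerSpec:
    value:
      nodeLabels:
      - key: openstack-control-plane             # assumed MOSK control plane label
        value: enabled
      - key: openstack-gateway                   # assumed
        value: enabled
      - key: openvswitch                         # assumed
        value: enabled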
After you add bare metal hosts and create a managed cluster as described in
Create a MOSK cluster, proceed with associating Kubernetes machines
of your cluster with the previously added bare metal hosts
using the Container Cloud web UI.
To add a Kubernetes machine to a MOSK cluster:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name.
The cluster page with the Machines list opens.
Click the Create Machine button.
Fill out the Create New Machine form as required:
Since MCC 2.28.0 (Cluster releases 17.3.0 and 16.3.0)
Name
New machine name. If empty, a name is automatically generated in the
<clusterName>-<machineType>-<uniqueSuffix> format.
Type
Machine type. Select Manager or Worker to
create a Kubernetes manager or worker node.
Caution
The required minimum number of machines:
3 manager nodes for HA
3 worker storage nodes for a minimal Ceph cluster
L2 Template
From the drop-down list, select the previously created L2 template,
if any. For details, see Create L2 templates.
Otherwise, leave the default selection to use the default L2 template
of the cluster.
Note
Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0), if you leave the default selection in the drop-down list,
a preinstalled L2 template is used. Preinstalled templates are
removed in the above-mentioned releases.
Distribution
Operating system to provision the machine. From the drop-down list,
select Ubuntu 22.04 Jammy as the machine distribution.
Warning
Do not use obsolete Ubuntu 20.04 distribution on
greenfield deployments but only on existing clusters based on
Ubuntu 20.04, which reaches end-of-life in April 2025. MOSK 24.3
release series is the last one to support Ubuntu 20.04 as the
host operating system.
Update of management or MOSK clusters running
Ubuntu 20.04 to the following major product release, where Ubuntu
22.04 is the only supported version, is not possible.
Upgrade Index
Optional. A positive numeral value that defines the order of machine
upgrade during a cluster update.
The first machine to upgrade is always one of the control plane machines
with the lowest upgradeIndex. Other control plane machines are upgraded
one by one according to their upgrade indexes.
If the Cluster spec dedicatedControlPlane field is false, worker
machines are upgraded only after the upgrade of all control plane machines
finishes. Otherwise, they are upgraded after the first control plane
machine, concurrently with other control plane machines.
If several machines have the same upgrade index, they have the same priority
during upgrade.
If the value is not set, the machine is automatically assigned a value
of the upgrade index.
Host Configuration
Configuration settings of the bare metal host to be used for the
machine:
Host
From the drop-down list, select the previously created custom bare
metal host to be used for the new machine.
Host Profile
From the drop-down list, select the previously created custom bare
metal host profile, if any. For details, see
Create a custom bare metal host profile. Otherwise, leave the default
selection.
Labels
Add the required node labels for the worker machine to run certain
components on a specific node. For example, for the StackLight nodes
that run OpenSearch and require more resources than a standard node,
add the StackLight label. The list of available node
labels is obtained from allowedNodeLabels of your current
Cluster release.
If the value field is not defined in allowedNodeLabels, from
the drop-down list, select the required label and define an
appropriate custom value for this label to be set to the node. For
example, the node-type label can have the storage-ssd value
to meet the service scheduling logic on a particular machine.
Note
Due to the known issue 23002
fixed in Container Cloud 2.21.0 (Cluster releases 7.11.0 and
11.5.0), a custom value for a predefined node label cannot be set
using the Container Cloud web UI. For a workaround, refer to the
issue description.
Caution
If you deploy StackLight in the HA mode (recommended):
Add the StackLight label to minimum three worker
nodes. Otherwise, StackLight will not be deployed until
the required number of worker nodes is configured with
the StackLight label.
Removal of the StackLight label from worker nodes
along with removal of worker nodes with StackLight
label can cause the StackLight components to become
inaccessible. It is important to correctly maintain the worker
nodes where the StackLight local volumes were provisioned.
For details, see Delete a cluster machine.
If you move the StackLight label to a new worker machine
on an existing cluster, manually deschedule all StackLight components
from the old worker machine, which you remove the StackLight
label from. For details, see Deschedule StackLight Pods from a worker machine.
Note
To add node labels after deploying a worker machine,
navigate to the Machines page, click the
More action icon in the last column of the required
machine field, and select Configure machine.
Before MCC 2.28.0 (Cluster releases 17.2.0, 16.2.0, or earlier)
Count
Specify the number of machines to create. If you create a machine
pool, specify the replicas count of the pool.
Manager
Select Manager or Worker to create a Kubernetes
manager or worker node.
Caution
The required minimum number of machines:
3 manager nodes for HA
3 worker storage nodes for a minimal Ceph cluster
BareMetal Host Label
Assign the role to the new machine(s) to link the machine
to a previously created bare metal host with the corresponding label.
You can assign one role type per machine. The supported labels include:
Manager
This node hosts the manager services of a managed cluster.
For reliability reasons, Container Cloud does not permit
running end user workloads on the manager nodes or using them
as storage nodes.
Worker
The default role for any node in a managed cluster.
Only the kubelet service is running on the machines of this type.
Upgrade Index
Optional. A positive numeral value that defines the order of machine upgrade
during a cluster update.
The first machine to upgrade is always one of the control plane machines
with the lowest upgradeIndex. Other control plane machines are upgraded
one by one according to their upgrade indexes.
If the Cluster spec dedicatedControlPlane field is false, worker
machines are upgraded only after the upgrade of all control plane machines
finishes. Otherwise, they are upgraded after the first control plane
machine, concurrently with other control plane machines.
If several machines have the same upgrade index, they have the same priority
during upgrade.
If the value is not set, the machine is automatically assigned a value
of the upgrade index.
Distribution
Operating system to provision the machine. From the drop-down list,
select the required Ubuntu distribution.
L2 Template
From the drop-down list, select the previously created L2 template,
if any. For details, see Create L2 templates.
Otherwise, leave the default selection to use the default L2 template
of the cluster.
Note
Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0), if you leave the default selection in the drop-down list,
a preinstalled L2 template is used. Preinstalled templates are
removed in the above-mentioned releases.
BM Host Profile
From the drop-down list, select the previously created custom bare metal
host profile, if any. For details, see Create a custom bare metal host profile.
Otherwise, leave the default selection.
Node Labels
Add the required node labels for the worker machine to run certain
components on a specific node. For example, for the StackLight nodes
that run OpenSearch and require more resources than a standard node,
add the StackLight label. The list of available node
labels is obtained from allowedNodeLabels of your current
Cluster release.
If the value field is not defined in allowedNodeLabels, from
the drop-down list, select the required label and define an
appropriate custom value for this label to be set to the node. For
example, the node-type label can have the storage-ssd value
to meet the service scheduling logic on a particular machine.
Note
Due to the known issue 23002
fixed in Container Cloud 2.21.0 (Cluster releases 7.11.0 and
11.5.0), a custom value for a predefined node label cannot be set
using the Container Cloud web UI. For a workaround, refer to the
issue description.
Caution
If you deploy StackLight in the HA mode (recommended):
Add the StackLight label to minimum three worker
nodes. Otherwise, StackLight will not be deployed until
the required number of worker nodes is configured with
the StackLight label.
Removal of the StackLight label from worker nodes
along with removal of worker nodes with StackLight
label can cause the StackLight components to become
inaccessible. It is important to correctly maintain the worker
nodes where the StackLight local volumes were provisioned.
For details, see Delete a cluster machine.
If you move the StackLight label to a new worker machine
on an existing cluster, manually deschedule all StackLight components
from the old worker machine, which you remove the StackLight
label from. For details, see Deschedule StackLight Pods from a worker machine.
Note
To add node labels after deploying a worker machine,
navigate to the Machines page, click the
More action icon in the last column of the required
machine field, and select Configure machine.
Click Create.
At this point, Container Cloud adds the new Machine object to the specified
cluster, and the Bare Metal Operator Controller creates the relation to the
bare metal host with the labels matching the roles.
Provisioning of the newly created machine starts when the machine object is
created and includes the following stages:
Creation of partitions on the local disks as required by the operating
system and the Container Cloud architecture.
Configuration of the network interfaces on the host as required by the
operating system and the Container Cloud architecture.
Installation and configuration of the Container Cloud LCM Agent.
Repeat the steps above for the remaining machines.
To install MOSK on bare metal with Container Cloud,
you must create L2 templates for each node type in the MOSK
cluster. Additionally, you may have to create separate templates for nodes
of the same type when they have different configuration.
To assign specific L2 templates to machines in a cluster:
Select from the following options to assign the templates to the cluster:
Since MOSK 23.3, use the
cluster.sigs.k8s.io/cluster-name label in the labels section.
Before MOSK 23.3, use the clusterRef parameter in
the spec section.
Add a unique identifier label to every L2 template.
Typically, that would be the name of the MOSK node role,
for example l2template-compute, or l2template-compute-5nics.
Assign an L2 template to a machine. Set the l2TemplateSelector
field in the machine spec to the name of the label added in the previous
step. IPAM Controller uses this field to select a specific L2 template
for the corresponding machine.
Alternatively, you may set the l2TemplateSelector field to
the name of the L2 template.
Consider the following examples of an L2 template assignment to a machine.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
Example of a Machine resource with the label-based L2 template selector
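A minimal sketch of such a Machine resource; the names and the l2template-compute label value are illustrative and must match the label you assigned to the L2 template:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: compute-0
  namespace: managed-ns
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: managed-cluster
spec:
  providerSpec:
    value:
      apiVersion: baremetal.k8s.io/v1alpha1
      kind: BareMetalMachineProviderSpec
      # Label-based selector: matches the unique identifier label of the L2 template
      l2TemplateSelector:
        label: l2template-compute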
A machine in a MOSK cluster requires a dedicated bare metal
host for deployment. In the Mirantis Container Cloud management API, bare metal
hosts are represented by the BareMetalHost objects that are automatically
generated by the related BareMetalHostInventory objects.
Note
The BareMetalHostInventory resource is available since the update
of the management cluster to the Cluster release 16.4.0 (Container Cloud
2.29.0). Before this release, the BareMetalHost object is used.
Since the above-mentioned release, BareMetalHost is only used for
internal purposes of the Container Cloud private API. All configuration
changes must be applied using the BareMetalHostInventory objects.
For any existing BareMetalHost object, a BareMetalHostInventory
object is created automatically during cluster update.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
All BareMetalHostInventory objects must be labeled upon creation with a
label that allows identifying the host and assigning it to a machine.
The labels may be unique or applied to a group of hosts, based on
similarities in their capacity, capabilities, and hardware configuration,
on their location, suitable role, or a combination thereof.
In some cases, you may need to deploy a machine to a specific
bare metal host. This is especially useful when some of your bare metal
hosts have different hardware configuration than the rest.
To deploy a machine to a specific bare metal host:
Log in to the host where your management cluster kubeconfig is located
and where kubectl is installed.
Identify the bare metal host that you want to associate with the specific
machine. For example, host host-1.
Since the management cluster update to 16.4.0 (MCC 2.29.0)
kubectl get baremetalhostinventory host-1 -o yaml
Before the management cluster update to 16.4.0 (MCC 2.29.0)
kubectl get baremetalhost host-1 -o yaml
Add a label that will uniquely identify this host, for example, by the
name of the host and machine that you want to deploy on it.
Since the management cluster update to 16.4.0 (MCC 2.29.0)
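For example, a sketch using kubectl label; the project namespace, the label key, and its value are illustrative and only need to uniquely link this host to the target machine:
kubectl -n managed-ns label baremetalhostinventory host-1 kaas.mirantis.com/baremetalhost-id=host-1-machine-1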
An L2 template contains the ifMapping field that allows you to
identify Ethernet interfaces for the template. The Machine object
API enables the Operator to override the mapping from the L2 template
by enforcing a specific order of names of the interfaces when applied
to the template.
The l2TemplateIfMappingOverride field in the spec of the Machine
object contains a list of interface names. The order of the interface
names in the list is important because the L2Template object will
be rendered with NICs ordered as per this list.
Note
Changes in the l2TemplateIfMappingOverride field will apply
only once when the Machine and corresponding IpamHost objects
are created. Further changes to l2TemplateIfMappingOverride
will not reset the interfaces assignment and configuration.
Caution
The l2TemplateIfMappingOverride field must contain the names
of all interfaces of the bare metal host.
The following example illustrates how to include the override field in the
Machine object. In this example, we configure the interface eno1,
which is the second on-board interface of the server, to precede the first
on-board interface eno0.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
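A minimal sketch of the override, assuming the field resides under the providerSpec value of the Machine object as the other provider-specific parameters do:
spec:
  providerSpec:
    value:
      l2TemplateIfMappingOverride:
      - eno1   # second on-board interface, mapped first
      - eno0   # first on-board interface, mapped second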
As a result of the configuration above, when used with the example
L2 template for bonds and bridges described in Create L2 templates,
the enp0s1 and enp0s2 interfaces will be in predictable
ordered state. This state will be used to create subinterfaces for
Kubernetes networks (k8s-pods) and for Kubernetes external
network (k8s-ext).
Also, you can use the non-case-sensitive list of NIC MAC addresses
instead of the list of NIC names. For example:
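A sketch with hypothetical MAC addresses substituted for the interface names:
spec:
  providerSpec:
    value:
      l2TemplateIfMappingOverride:
      - "0c:c4:7a:aa:bb:01"   # hypothetical MAC of the interface to map first
      - "0c:c4:7a:aa:bb:02"   # hypothetical MAC of the interface to map second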
Manually allocate IP addresses for bare metal hosts
Available since MCC 2.26.0 (16.1.0 and 17.1.0)
You can force the DHCP server to assign a particular IP address for a bare
metal host during PXE provisioning by adding the
host.dnsmasqs.metal3.io/address annotation with the desired IP address
value to the required bare metal host.
If you have a limited amount of free and unused IP addresses for a server
provisioning, you can manually create bare metal hosts one by one and
provision servers in small, manually managed batches.
For batching in small chunks, you can use the
host.dnsmasqs.metal3.io/address annotation to manually allocate IP
addresses along with the baremetalhost.metal3.io/detached annotation to
pause automatic host management by the bare metal Operator.
To pause bare metal hosts for a manual IP allocation during provisioning:
Set the baremetalhost.metal3.io/detached annotation for all
bare metal hosts to pause host management.
Note
If the host provisioning has already started or completed, addition
of this annotation deletes the information about the host from Ironic without
triggering deprovisioning. The bare metal Operator recreates the host
in Ironic once you remove the annotation. For details, see
Metal3 documentation.
Add the host.dnsmasqs.metal3.io/address annotation with corresponding
IP address values to a batch of bare metal hosts.
Remove the baremetalhost.metal3.io/detached annotation from the batch
used in the previous step.
Repeat steps 2 and 3 until all hosts are provisioned.
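The following sketch illustrates the procedure above with kubectl annotate; the host name, namespace, and IP address are illustrative:
# Pause host management (step 1)
kubectl -n managed-ns annotate baremetalhost host-1 baremetalhost.metal3.io/detached=""
# Assign a fixed IP address for PXE provisioning (step 2)
kubectl -n managed-ns annotate baremetalhost host-1 host.dnsmasqs.metal3.io/address=10.0.0.101
# Resume host management for the batch (step 3)
kubectl -n managed-ns annotate baremetalhost host-1 baremetalhost.metal3.io/detached-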
The procedure below enables you to create a Ceph cluster with minimum three
Ceph nodes that provides persistent volumes to the Kubernetes workloads
in the managed cluster.
Substitute <managedClusterProject> and <clusterName> with
the corresponding managed cluster namespace and name.
Example of system response:
status:
  providerStatus:
    ready: true
    conditions:
    - message: Helm charts are successfully installed(upgraded).
      ready: true
      type: Helm
    - message: Kubernetes objects are fully up.
      ready: true
      type: Kubernetes
    - message: All requested nodes are ready.
      ready: true
      type: Nodes
    - message: Maintenance state of the cluster is false
      ready: true
      type: Maintenance
    - message: TLS configuration settings are applied
      ready: true
      type: TLS
    - message: Kubelet is Ready on all nodes belonging to the cluster
      ready: true
      type: Kubelet
    - message: Swarm is Ready on all nodes belonging to the cluster
      ready: true
      type: Swarm
    - message: All provider instances of the cluster are Ready
      ready: true
      type: ProviderInstance
    - message: LCM agents have the latest version
      ready: true
      type: LCMAgent
    - message: StackLight is fully up.
      ready: true
      type: StackLight
    - message: OIDC configuration has been applied.
      ready: true
      type: OIDC
    - message: Load balancer 10.100.91.150 for kubernetes API has status HEALTHY
      ready: true
      type: LoadBalancer
Create a YAML file with the Ceph cluster specification:
<publicNet> is a CIDR definition or comma-separated list of
CIDR definitions (if the managed cluster uses multiple networks) of
public network for the Ceph data. The values should match the
corresponding values of the cluster Subnet object.
<clusterNet> is a CIDR definition or comma-separated list of
CIDR definitions (if the managed cluster uses multiple networks) of
replication network for the Ceph data. The values should match
the corresponding values of the cluster Subnet object.
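For example, a sketch of the corresponding fragment of the Ceph cluster specification, assuming the network section of the KaaSCephCluster spec and illustrative CIDR values:
spec:
  cephClusterSpec:
    network:
      publicNet: 10.10.0.0/24    # <publicNet>
      clusterNet: 10.11.0.0/24   # <clusterNet>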
Configure Subnet objects for the Storage access network by setting
ipam/SVC-ceph-public:"1" and ipam/SVC-ceph-cluster:"1" labels
to the corresponding Subnet objects. For more details, refer to
Create subnets for a managed cluster using CLI, Step 5.
Configure Ceph Manager and Ceph Monitor roles to select nodes that must
place Ceph Monitor and Ceph Manager daemons:
Obtain the names of machines to place Ceph Monitor and Ceph Manager
daemons at:
kubectl -n <managedClusterProject> get machine
Add the nodes section with mon and mgr roles defined:
Substitute <mgr-node-X> with the corresponding Machine object
names and <role-X> with the corresponding roles of daemon placement,
for example, mon or mgr.
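For example, a sketch of the nodes section using the placeholders above:
spec:
  cephClusterSpec:
    nodes:
      <mgr-node-1>:
        roles:
        - <role-1>
        - <role-2>
      <mgr-node-2>:
        roles:
        - <role-1>
        - <role-2>
      <mgr-node-3>:
        roles:
        - <role-1>
        - <role-2>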
Configure Ceph OSD daemons for Ceph cluster data storage:
Note
This step involves the deployment of Ceph Monitor and Ceph Manager
daemons on nodes that are different from the ones hosting Ceph cluster
OSDs. However, you can also colocate Ceph OSDs, Ceph Monitor, and
Ceph Manager daemons on the same nodes by configuring the roles and
storageDevices sections accordingly. This kind of configuration
flexibility is particularly useful in scenarios such as hyper-converged
clusters.
Warning
The minimal production cluster requires at least three nodes
for Ceph Monitor daemons and three nodes for Ceph OSDs.
Obtain the names of machines with disks intended for storing Ceph data:
kubectl -n <managedClusterProject> get machine
For each machine, use status.providerStatus.hardware.storage
to obtain information about node disks:
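For example, one way to filter the output (the jsonpath expression is just one possible option):
kubectl -n <managedClusterProject> get machine <machineName> -o jsonpath='{.status.providerStatus.hardware.storage}'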
Select by-id symlinks on the disks to be used in the Ceph cluster.
The symlinks must meet the following requirements:
A by-id symlink must contain
status.providerStatus.hardware.storage.serialNumber
A by-id symlink must not contain wwn
For the example above, to use the sdc disk to store Ceph data on it,
select the /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc
symlink. It is persistent and will not be affected by node reboot.
Specify the selected by-id symlinks in the
spec.cephClusterSpec.nodes.storageDevices.fullPath field
along with the
spec.cephClusterSpec.nodes.storageDevices.config.deviceClass
field:
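For example, a sketch of the fullPath variant; the machine name and the by-id symlink are illustrative:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc
          config:
            deviceClass: ssd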
Specify the selected by-id symlinks in the
spec.cephClusterSpec.nodes.storageDevices.name field
along with the
spec.cephClusterSpec.nodes.storageDevices.config.deviceClass
field:
Each Ceph pool, depending on its role, has the default targetSizeRatio
value that defines the expected consumption of the total Ceph cluster
capacity. The default ratio values for MOSK pools are
as follows:
20.0% for a Ceph pool with the role volumes
40.0% for a Ceph pool with the role vms
10.0% for a Ceph pool with the role images
10.0% for a Ceph pool with the role backup
Optional. Configure Ceph Block Pools to use RBD. For the detailed
configuration, refer to Pool parameters.
Once all pools are created, verify that an appropriate secret required for
a successful deployment of the OpenStack services that rely on Ceph is
created in the openstack-ceph-shared namespace:
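For example (the shared secret is typically named openstack-ceph-keys; verify the actual name in your environment):
kubectl -n openstack-ceph-shared get secrets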
Mirantis highly recommends adding a Ceph cluster using the CLI
instead of the web UI.
The web UI capabilities for adding a Ceph cluster are limited and lack
flexibility in defining Ceph cluster specifications.
For example, if an error occurs while adding a Ceph cluster using the
web UI, usually you can address it only through the CLI.
The web UI functionality for managing Ceph clusters will be
deprecated in one of the following releases.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name.
The Cluster page with the Machines and
Ceph clusters lists opens.
In the Ceph Clusters block, click Create Cluster.
Configure the Ceph cluster in the Create New Ceph Cluster
wizard that opens:
Cluster Network
Replication network for Ceph OSDs. Must contain the CIDR definition
and match the corresponding values of the cluster L2Template
object or the environment network values.
Public Network
Public network for Ceph data. Must contain the CIDR definition and
match the corresponding values of the cluster L2Template object
or the environment network values.
Enable OSDs LCM
Select to enable LCM for Ceph OSDs.
Machines / Machine #1-3
Select machine
Select the name of the Kubernetes machine that will host
the corresponding Ceph node in the Ceph cluster.
Manager, Monitor
Select the required Ceph services to install on the Ceph node.
Devices
Select the disk that Ceph will use.
Warning
Do not select the device for system services,
for example, sda.
Warning
A Ceph cluster does not support removable devices, that is, devices
with the hotplug functionality enabled. To use devices as Ceph OSD
data devices, make them non-removable or disable the hotplug
functionality in the BIOS settings for the disks that are configured
to be used as Ceph OSD data devices.
Enable Object Storage
Select to enable the single-instance RGW Object Storage.
To add more Ceph nodes to the new Ceph cluster, click +
next to any Ceph Machine title in the Machines tab.
Configure a Ceph node as required.
Warning
Do not add more than 3 Manager and/or Monitor
services to the Ceph cluster.
After you add and configure all nodes in your Ceph cluster, click
Create.
Each Ceph pool, depending on its role, has a default targetSizeRatio
value that defines the expected consumption of the total Ceph cluster
capacity. The default ratio values for MOSK pools are
as follows:
20.0% for a Ceph pool with role volumes
40.0% for a Ceph pool with role vms
10.0% for a Ceph pool with role images
10.0% for a Ceph pool with role backup
Once all pools are created, verify that an appropriate secret required for
a successful deployment of the OpenStack services that rely on Ceph is
created in the openstack-ceph-shared namespace:
This section instructs you on how to deploy OpenStack on top of Kubernetes
as well as how to troubleshoot the deployment and access your OpenStack
environment after deployment.
This section instructs you on how to deploy OpenStack on top of Kubernetes
using the OpenStack Controller (Rockoon) and
openstackdeployments.lcm.mirantis.com (OsDpl) CR.
To deploy an OpenStack cluster:
Verify that you have pre-configured the networking according to
Networking.
Verify that the TLS certificates that will be required for the OpenStack
cluster deployment have been pre-generated.
Note
The Transport Layer Security (TLS) protocol is mandatory on public
endpoints.
Caution
To avoid certificates renewal with subsequent OpenStack
updates during which additional services with new public endpoints may
appear, we recommend using wildcard SSL certificates for public
endpoints. For example, *.it.just.works, where it.just.works is
a cluster public domain.
The sample code block below illustrates how to generate a self-signed
certificate for the it.just.works domain. The procedure presumes
the cfssl and cfssljson tools are installed on the
machine.
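A sketch of such a procedure with cfssl and cfssljson; the file names and CSR fields are illustrative, so adjust the subject and key parameters to your security requirements:
# CA certificate signing request (illustrative subject)
cat > ca-csr.json <<EOF
{"CN": "it.just.works CA", "key": {"algo": "rsa", "size": 2048}}
EOF
cfssl gencert -initca ca-csr.json | cfssljson -bare ca

# Wildcard server certificate for *.it.just.works
cat > server-csr.json <<EOF
{"CN": "*.it.just.works", "hosts": ["*.it.just.works"], "key": {"algo": "rsa", "size": 2048}}
EOF
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem server-csr.json | cfssljson -bare server
# Results: ca.pem, server.pem, server-key.pem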
Create the openstackdeployment.yaml file that will include the
OpenStack cluster deployment configuration. For the configuration details,
refer to OpenStack configuration and API Reference.
Note
The resource of kind OpenStackDeployment (OsDpl) is a custom
resource defined by a resource of kind
CustomResourceDefinition. The resource is validated with
the help of the OpenAPI v3 schema.
Configure the OsDpl resource depending on the needs of your deployment.
For the configuration details, refer to OpenStack configuration.
Example of an OpenStackDeployment CR of minimum configuration
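A minimal sketch is shown below; the object name, OpenStack version, preset, size, and domain values are assumptions to replace with the values valid for your deployment:
apiVersion: lcm.mirantis.com/v1alpha1
kind: OpenStackDeployment
metadata:
  name: osh-dev
  namespace: openstack
spec:
  openstack_version: yoga          # illustrative; use the version supported by your release
  preset: compute
  size: tiny
  internal_domain_name: cluster.local
  public_domain_name: it.just.works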
To the openstackdeployment object, add information about the TLS
certificates:
ssl:public_endpoints:ca_cert - CA certificate content (ca.pem)
ssl:public_endpoints:api_cert - server certificate content
(server.pem)
ssl:public_endpoints:api_key - server private key (server-key.pem)
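For example, a sketch of the corresponding fragment, assuming the parameters reside under spec:features:ssl; the certificate bodies are truncated placeholders:
spec:
  features:
    ssl:
      public_endpoints:
        ca_cert: |
          -----BEGIN CERTIFICATE-----
          ...
          -----END CERTIFICATE-----
        api_cert: |
          -----BEGIN CERTIFICATE-----
          ...
          -----END CERTIFICATE-----
        api_key: |
          -----BEGIN RSA PRIVATE KEY-----
          ...
          -----END RSA PRIVATE KEY-----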
Verify that the Load Balancer network does not overlap your corporate
or internal Kubernetes networks, for example, Calico IP pools. Also,
verify that the pool of Load Balancer network is big enough to provide
IP addresses for all Amphora VMs (loadbalancers).
If required, reconfigure the Octavia network settings using the
following sample structure:
If you are using the default backend to store OpenStack database backups,
which is Ceph, you may want to increase the default size of the allocated
storage since there is no automatic way to resize the backup volume once
the cloud is deployed.
This section includes configuration information for available advanced
Mirantis OpenStack for Kubernetes features that include DPDK with the Neutron
OVS backend, huge pages, CPU pinning, and other Enhanced Platform Awareness
(EPA) capabilities.
This section instructs you on how to configure LVM as backend for the VM disks
and ephemeral storage.
You can use flexible size units throughout bare metal host profiles.
For example, you can now use either sizeGiB: 0.1 or size: 100Mi
when specifying a device size.
Mirantis recommends using only one parameter name type and units throughout
the configuration files. If both sizeGiB and size are used,
sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size
without units is counted in bytes. For example, size: 120 means 120 bytes.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
Warning
Usage of more than one nonvolatile memory express (NVMe) drive per
node may cause update issues and is therefore not supported.
To enable LVM ephemeral storage:
In BareMetalHostProfile in the spec:volumeGroups section, add
the following configuration for the OpenStack compute nodes:
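A hedged sketch of such a configuration; the partition and volume group names below are hypothetical and must match the names your Compute service configuration expects:
spec:
  ...
  devices:
  ...
  - device:
      byName: /dev/sdb             # hypothetical data disk for ephemeral storage
      wipe: true
    partitions:
    - name: lvm_nova_part          # hypothetical partition name
      size: 0
      wipe: true
  volumeGroups:
  ...
  - devices:
    - partition: lvm_nova_part
    name: nova-vol                 # hypothetical volume group name for ephemeral storage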
This section instructs you on how to configure LVM as a backend for the
OpenStack Block Storage service.
You can use flexible size units throughout bare metal host profiles.
For example, you can now use either sizeGiB: 0.1 or size: 100Mi
when specifying a device size.
Mirantis recommends using only one parameter name type and units throughout
the configuration files. If both sizeGiB and size are used,
sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size
without units is counted in bytes. For example, size: 120 means 120 bytes.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
To enable LVM block storage:
Open BareMetalHostProfile for editing.
In the spec:volumeGroups section, specify the following data
for the OpenStack compute nodes. In the following example, we deploy
a Cinder volume with LVM on compute nodes.
However, you can use dedicated nodes for this purpose.
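A hedged sketch of such a configuration; the partition and volume group names are hypothetical and must match the names expected by your Block Storage configuration:
spec:
  ...
  devices:
  ...
  - device:
      byName: /dev/sdc              # hypothetical data disk for Cinder LVM volumes
      wipe: true
    partitions:
    - name: lvm_cinder_part         # hypothetical partition name
      size: 0
      wipe: true
  volumeGroups:
  ...
  - devices:
    - partition: lvm_cinder_part
    name: cinder-vol                # hypothetical volume group name for Cinder LVM volumes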
Since MOSK 23.1, the open-iscsi
package is not installed by default on bare metal hosts. Install
it manually during cluster deployment in BareMetalHostProfile
in the spec:postDeployScript section:
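A minimal sketch of such a script; it assumes the host has access to an APT repository providing the package:
spec:
  ...
  postDeployScript: |
    #!/bin/bash -ex
    apt-get update
    apt-get install -y open-iscsi
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done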
This section instructs you on how to enable DPDK with the Neutron OVS back
end.
Warning
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users are
fully responsible for ensuring the compatibility of such custom components
with the latest supported Ubuntu version.
To enable DPDK with OVS:
Verify that your deployment meets the following requirements:
The required drivers have been installed on the host operating system.
This section instructs you on how to enable SR-IOV with the Neutron OVS back
end.
To enable SR-IOV with OVS:
Verify that your deployment meets the following requirements:
NICs with the SR-IOV support are installed
SR-IOV and VT-d are enabled in BIOS
Enable IOMMU in the kernel by configuring intel_iommu=on in the GRUB
configuration file. Specify the parameter for compute nodes in
BareMetalHostProfile in the grubConfig section:
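For example, a sketch that appends the kernel parameter to the default GRUB options:
spec:
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=30
    - GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX intel_iommu=on"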
The BGP VPN service is an extra OpenStack Neutron plugin that enables
connection of OpenStack Virtual Private Networks with external VPN
sites through either BGP/MPLS IP VPNs or E-VPN.
To enable the BGP VPN service:
Enable BGP VPN in the OsDpl custom resource through the
node-specific overrides settings. For example:
spec:
  features:
    neutron:
      bgpvpn:
        enabled: true
        route_reflector:
          # Enable deploying the FRR route reflector
          enabled: true
          # Local AS number
          as_number: 64512
          # List of subnets allowed to connect to
          # the route reflector BGP
          neighbor_subnets:
          - 10.0.0.0/8
          - 172.16.0.0/16
  nodes:
    rockoon-openstack-compute-node::enabled:
      features:
        neutron:
          bgpvpn:
            enabled: true
When the service is enabled, a route reflector is scheduled on nodes with
the openstack-frrouting:enabled label. Mirantis recommends collocating
the route reflector nodes with the OpenStack controller nodes. By default, two
replicas are deployed.
MOSK allows configuring Internet Protocol Security
(IPSec) encryption for the east-west tenant traffic between the OpenStack
compute nodes and gateways. The feature uses the
strongSwan open source IPSec solution.
Authentication is accomplished through a pre-shared key (PSK). However, other
authentication methods are upcoming.
To encrypt the east-west tenant traffic, enable ipsec in the
spec:features:neutron settings of the OpenStackDeployment CR:
spec:
  features:
    neutron:
      ipsec:
        enabled: true
Caution
Enabling IPSec adds extra headers to the tenant traffic. The
header size varies depending on IPSec configuration.
This section instructs you on how to configure the Cinder backend
for images (Glance) through the OpenStackDeployment CR.
Note
This feature depends heavily on Cinder multi-attach, which
enables you to simultaneously attach volumes to multiple instances.
Therefore, only the block storage backends that support multi-attach
can be used.
To configure a Cinder backend for Glance, define the backend identity
in the OpenStackDeployment CR. This identity will be used as a name for
the backend section in the Glance configuration file.
When defining the backend:
Configure one of the backends as default.
Configure each backend to use a specific Cinder volume type.
Note
You can use the cinder_volume_type parameter instead of
backend_name. If so, you have to create this volume type beforehand
and take into account that the bootstrap script does not manage such
volume types.
The blockstore identity definition example:
spec:
  features:
    glance:
      backends:
        cinder:
          blockstore:
            default: true
            backend_name: <volume_type:volume_name>   # for example, backend_name: lvm:lvm_store

spec:
  features:
    glance:
      backends:
        cinder:
          blockstore:
            default: true
            cinder_volume_type: netapp
This section instructs you on how to enable Cinder volume encryption
through the OpenStackDeployment CR using Linux Unified Key Setup (LUKS)
and store the encryption keys in Barbican. For details, see
Volume encryption.
To enable Cinder volume encryption:
In the OpenStackDeployment CR, specify the LUKS volume type and
configure the required encryption parameters for the storage system to
encrypt or decrypt the volume.
To create an encrypted volume as a non-admin user and store keys in the
Barbican storage, assign the creator role to the user since the default
Barbican policy allows only the admin or creator role:
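For example, using the OpenStack CLI; the user and project names are placeholders:
openstack role add --user <user-name> --project <project-name> creator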
This section describes how to perform advanced configuration for the OpenStack
compute nodes. Such configuration can be required in some specific use cases,
such as the usage of DPDK, SR-IOV, or huge pages.
Configuration recommendations for compute node types
This section contains recommendations for configuration of an
OpenStackDeployment custom resource for the compute nodes
of the following types:
Compute nodes with the default configuration, without local NVMe storage and
SR-IOV network interface cards (NICs)
Compute nodes with the NVMe local storage
Compute nodes with the SR-IOV NICs
Compute nodes with both the NVMe local storage and SR-IOV NICs
Note
If the local NVMe storage is enabled, Mirantis recommends using it
and enabling SR-IOV if possible.
Caution
Before using the NVMe local storage and mount point, define them
in BareMetalHostProfile. For example:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
...
spec:
  devices:
  ...
  - device:
      byName: /dev/nvme0n1
      minSizeGiB: 30
      wipe: true
    partitions:
    - name: local-volumes-partition
      sizeGiB: 0
      wipe: true
  ...
  fileSystems:
  ...
  - fileSystem: ext4
    partition: local-volumes-partition
    # mountpoint for Nova images
    mountPoint: /var/lib/nova
To control the storage type (local NVMe or Ceph) for virtual
machines, place a node into the OpenStack aggregate. For details, see
OpenStack documentation: Host aggregates.
As defined in Node-specific configuration, each node with a non-default configuration
must be configured separately. Each Machine object must have a
configuration-specific label. For example, for a compute node with the local
NVMe storage:
In the examples above, compute-sriov, compute-nvme-sriov, and
compute-nvme are human-readable string identifiers. You can use any unique
string value for each type of compute node.
In the OpenStackDeployment object of each node group, define its own
section that starts with <NODE-LABEL>::<NODE-LABEL-VALUE>::
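For example, a hedged sketch for a group of nodes labeled for local NVMe storage; the label name, its value, and the nova images backend option are assumptions to verify against the OpenStack configuration reference for your release:
spec:
  nodes:
    compute-nvme::enabled:          # <NODE-LABEL>::<NODE-LABEL-VALUE>, illustrative
      features:
        nova:
          images:
            backend: local          # assumed option for local ephemeral storage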
Mirantis OpenStack for Kubernetes (MOSK) enables you to configure the
vCPU model for all instances managed by the OpenStack Compute service (Nova)
using the following osdpl definition:
spec:
  features:
    nova:
      vcpu_type: host-model
For the supported values and configuration examples, see Virtual CPU.
The instruction provided in this section applies to both
OpenStack with OVS and OpenStack with Tungsten Fabric topologies.
The huge pages OpenStack feature provides essential performance improvements
for applications that are highly memory IO-bound. Huge pages should be enabled
on a per compute node basis. By default, NUMATopologyFilter is enabled.
To activate the feature, you need to enable huge pages on the dedicated
bare metal host as described in Enable huge pages in a host profile during
the predeployment bare metal configuration.
Note
The multi-size huge pages are not fully supported by Kubernetes
versions before 1.19. Therefore, define only one size in kernel parameters.
The below procedure applies only to deployments based on
deprecated Ubuntu 20.04. For Ubuntu 22.04 that supports cgroup v2,
use the cpushield module.
For the procedure details, see Day-2 operations.
CPU isolation is a way to force the system scheduler to use only some logical
CPU cores for processes. For compute hosts, you should typically isolate system
processes and virtual guests on different cores through the cpusets
mechanism in Linux kernel.
The Linux kernel and cpuset provide a mechanism to run tasks by limiting the
resources defined by a cpuset. The tasks can be moved from one cpuset to
another to use the resources defined in other cpusets. The cset Python tool
is a command-line interface to work with cpusets.
To configure CPU isolation using cpusets:
Configure core isolation:
Note
You can also automate this step during deployment by using the
postDeploy script as described in Create MOSK host profiles.
cat<<-"EOF">/usr/bin/setup-cgroups.sh
#!/bin/bashset-x
UNSHIELDED_CPUS=${UNSHIELDED_CPUS:-"0-3"}UNSHIELDED_MEM_NUMAS=${UNSHIELDED_MEM_NUMAS:-0}SHIELD_CPUS=${SHIELD_CPUS:-"4-15"}SHIELD_MODE=${SHIELD_MODE:-"cpuset"}# One of isolcpu or cpusetDOCKER_CPUS=${DOCKER_CPUS:-$UNSHIELDED_CPUS}DOCKER_MEM_NUMAS=${DOCKER_MEM_NUMAS:-$UNSHIELDED_MEM_NUMAS}KUBERNETES_CPUS=${KUBERNETES_CPUS:-$UNSHIELDED_CPUS}KUBERNETES_MEM_NUMAS=${KUBERNETES_MEM_NUMAS:-$UNSHIELDED_MEM_NUMAS}CSET_CMD=${CSET_CMD:-"python3 /usr/bin/cset"}if[[${SHIELD_MODE}=="cpuset"]];then${CSET_CMD}set-c${UNSHIELDED_CPUS}-m${UNSHIELDED_MEM_NUMAS}-ssystem
${CSET_CMD}proc-m-froot-tsystem
${CSET_CMD}proc-k-froot-tsystem
fi${CSET_CMD}set--cpu=${DOCKER_CPUS}--mem=${DOCKER_MEM_NUMAS}--set=docker
${CSET_CMD}set--cpu=${KUBERNETES_CPUS}--mem=${KUBERNETES_MEM_NUMAS}--set=kubepods
${CSET_CMD}set--cpu=${DOCKER_CPUS}--mem=${DOCKER_MEM_NUMAS}--set=com.docker.ucp
EOF
chmod+x/usr/bin/setup-cgroups.sh
cat<<-"EOF">/etc/systemd/system/shield-cpus.service
[Unit]Description=ShieldCPUs
DefaultDependencies=no
After=systemd-udev-settle.service
Before=lvm2-activation-early.service
Wants=systemd-udev-settle.service
[Service]ExecStart=/usr/bin/setup-cgroups.sh
RemainAfterExit=trueType=oneshot
Restart=on-failure#Service should restart on failureRestartSec=5s#Restart each five seconds until success[Install]WantedBy=basic.target
EOF
systemctlenableshield-cpus
reboot
As root user, verify that isolation has been applied:
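For example, list the configured cpusets and the CPUs assigned to each of them, using the same cset invocation as in the script above:
python3 /usr/bin/cset set -l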
The majority of CPU topologies features are activated by NUMATopologyFilter
that is enabled by default. Such features do not require any further service
configuration and can be used directly on a vanilla MOSK
deployment. The list of the CPU topologies features includes, for example:
NUMA placement policies
CPU pinning policies
CPU thread pinning policies
CPU topologies
To enable libvirt CPU pinning through the node-specific overrides in the
OpenStackDeployment custom resource, use the following sample
configuration structure:
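A hedged sketch follows; the nested Helm values path (conf.nova) and the CPU ranges are assumptions, and the dedicated set matches the cores shielded in the CPU isolation example above:
spec:
  nodes:
    <NODE-LABEL>::<NODE-LABEL-VALUE>:
      services:
        compute:
          nova:
            values:
              conf:
                nova:
                  compute:
                    cpu_dedicated_set: 4-15   # cores reserved for pinned instance vCPUs
                    cpu_shared_set: 0-3       # cores shared by unpinned instances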
The Peripheral Component Interconnect (PCI) passthrough feature in OpenStack
allows full access and direct control over physical PCI devices in guests.
The mechanism applies to any kind of PCI devices including a Network
Interface Card (NIC), Graphics Processing Unit (GPU), and any other device
that can be attached to a PCI bus. The only requirement for the guest
to properly use the device is to correctly install the driver.
To enable PCI passthrough in a MOSK deployment:
For Linux X86 compute nodes, verify that the following
features are enabled on the host:
Configure the nova-api service that is scheduled
on OpenStack controllers nodes. To generate the alias for PCI
in nova.conf, add the alias configuration through the
OpenStackDeployment CR.
Note
When configuring PCI with SR-IOV on the same host, the values
specified in alias take precedence. Therefore, add the SR-IOV
devices to passthrough_whitelist explicitly.
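For example, a sketch of the alias configuration; the services override path and the vendor and product IDs are assumptions to replace with the values of your PCI device:
spec:
  services:
    compute:
      nova:
        values:
          conf:
            nova:
              pci:
                alias: '{"vendor_id": "8086", "product_id": "154d", "device_type": "type-PF", "name": "a1"}'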
Configure the nova-compute service that is scheduled
on OpenStack compute nodes. To enable Nova to pass PCI devices
to virtual machines, configure the passthrough_whitelist
section in nova.conf through the node-specific overrides in the
OpenStackDeployment CR. For example:
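A sketch of the node-specific override; the label, its value, and the device IDs are illustrative:
spec:
  nodes:
    <NODE-LABEL>::<NODE-LABEL-VALUE>:
      services:
        compute:
          nova:
            values:
              conf:
                nova:
                  pci:
                    passthrough_whitelist: '{"vendor_id": "8086", "product_id": "154d"}'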
MOSK enables you to configure initial oversubscription
through the OpenStackDeployment custom resource. For configuration details
and oversubscription considerations, refer to
Configuring initial resource oversubscription.
By default, the following values are applied:
8.0 for the number of CPUs
1.6 for the disk space
1.0 for the amount of RAM
Note
In MOSK 22.5 and earlier, the effective default
value of RAM allocation ratio is 1.1.
Changing oversubscription configuration after deployment will only affect the
newly added compute nodes and will not change oversubscription for already
existing compute nodes. You can change oversubscription for existing compute
nodes through the placement API as described in Change oversubscription settings for existing compute nodes.
Hyperconverged architecture combines OpenStack compute nodes along with Ceph
nodes. To avoid nodes overloading, which can cause Ceph performance degradation
and outage, limit the hardware resources consumption by the OpenStack compute
services.
cpu_allocation_ratio - in case of a hyperconverged architecture, the
value depends on the number of vCPU used for non-workload related operations,
total number of vCPUs of a hyperconverged node, and on workload vCPU
consumption:
In this case, if there are 40 vCPUs per hyperconverged node, 28 vCPUs are
required for non-workload related calculations, and a workload consumes 50%
of the allocated CPU time:
cpu_allocation_ratio=(40-28)/40/0.5=0.6.
reserved_host_memory_mb - a dedicated variable in the OpenStack Nova
configuration, to reserve memory for non-OpenStack related VM activities:
For example for 10 Ceph OSDs per hyperconverged node:
reserved_host_memory_mb=13GB*10=130GB=133120
ram_allocation_ratio - the allocation ratio of virtual RAM to physical
RAM. To completely exclude the possibility of memory overcommitting, set to
1.
To limit HW resources for hyperconverged OpenStack compute nodes:
In the OpenStackDeployment CR, specify the cpu_allocation_ratio,
ram_allocation_ratio, and reserved_host_memory_mb parameters as
required using the calculations described above.
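For example, a sketch using the values calculated above; the services override path is an assumption to verify against the OpenStack configuration reference for your release:
spec:
  services:
    compute:
      nova:
        values:
          conf:
            nova:
              DEFAULT:
                cpu_allocation_ratio: 0.6
                ram_allocation_ratio: 1.0
                reserved_host_memory_mb: 133120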
This section delves into virtual GPU configuration. It is specifically
tailored for NVIDIA physical GPUs, with a focus on the A100 40GB GPU and NVIDIA
AIE 4.1 drivers.
While setup procedures may vary among different cards and vendors,
MOSK can generally ensure compatibility between the
MOSK Compute service (Nova) and vGPU functionality,
as long as the drivers for the physical GPU expose a VFIO mdev-compatible
interface to the Linux host.
For configuration specifics of other physical GPUs, refer to the official
documentation provided by the vendor.
To install the acquired drivers within your cluster, add a custom
postDeployScript script to the custom BareMetalHostProfile object
used for the compute nodes with GPUs.
Virtual GPU types are similar to compute flavors as they determine the
resources allocated to each virtual GPU. This allows for efficient allocation
and optimization of GPU resources in virtualized environments.
Each physical GPU has a maximum number of virtual GPUs of a specific type that
can be created on it, with no possibility for overallocation. In the
time-sliced vGPU configuration, each particular physical GPU can only
instantiate vGPUs of the same selected type. In the Multi-Instance GPU (MIG),
a single physical GPU may be partitioned into several differently sized
virtual GPUs.
Either way, prior to accepting workloads, Mirantis recommends determining the
virtual GPU types that each of your physical GPU will provide. Altering
these settings afterward necessitates terminating every virtual machine
currently running on the physical GPU intended for reconfiguration or
repurposing for another virtual GPU type.
This section outlines the process for partitioning physical GPUs into
Multi-Instance GPUs (MIG) using the nvidia-smi tool provided
by the NVIDIA Host GPU driver.
To create seven MIG vGPUs of the smallest size, which is the maximum
possible number of instances according to the system response above:
nvidia-smi mig -cgi 19,19,19,19,19,19,19
To create three differently sized vGPUs of 4g.20gb, 2g.10gb, and
1g.5gb sizes:
nvidia-smi mig -cgi 5,14,19
Caution
Keep in mind that not all combinations of differently sized vGPU
instances are supported. Additionally, the order in which you create vGPUs
is important.
To determine the mdev class supported by a specific virtual GPU type
listed by a PCI device address, verify the output of the
mdevctl types command executed on the compute node that has a
physical GPU available on it:
The Name field from the example system output above corresponds to the
supported virtual GPU type, linking the GPU instance profile with the mdev class
supported by your physical GPU.
In the example above, the MIG1g.5gb GPU instance profile corresponds
to the GRIDA100-1-5C vGPU type as per NVIDIA documentation, and according
to the mdevctl types output, it corresponds to the nvidia-474
mdev class.
Note
Notice that Available instances is zero for vGPU types that are
not actually supported by the given card and configuration. For MIGs,
Available instances will be non-zero only for the virtual GPU types
for which the MIG virtual GPU instances have already been created. See
Partition to Multi-Instance GPUs.
The parameters you need to define for the nova-compute service on each
compute node with physical GPUs you want to expose as virtual GPUs include:
[devices]enabled_mdev_types
Required. List of the mdev classes, see the previous step for details.
[devices]cleanup_mdev_devices
Optional. By default, the Compute service does not delete created mdev
devices but reuses them instead. While this speeds up processes, it may
pose challenges when reconfiguring the enabled_mdev_types parameter.
Set cleanup_mdev_devices to True for the Compute service to
auto-delete created mdev devices upon instance deletion.
If you plan to use only time-sliced vGPUs and provide a single virtual GPU type
across the entire cloud, you only need to configure the options mentioned
above once globally for all compute nodes through the spec.services section
of the OpenStackDeployment custom resource.
With the configuration below, the Compute service will auto-detect all PCI
devices that provide this mdev type and automatically create required resource
providers in the placement service with the resource class VGPU.
Example configuration for the nvidia-474 mdev type:
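A hedged sketch assuming the standard services override structure of the OpenStackDeployment object; the nested Helm values path (conf.nova) is an assumption to verify against your product version:
spec:
  services:
    compute:
      nova:
        values:
          conf:
            nova:
              devices:
                enabled_mdev_types: nvidia-474
                cleanup_mdev_devices: true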
If you plan to provide multiple time-sliced vGPU types, simplify the
configuration by grouping the nodes based on a node label (not necessarily
aggregates). Ensure that each group exposes only one mdev type using the
Node-specific configuration settings. Additionally, use custom resource classes to
facilitate flavor creation, ensuring consistent use of the CUSTOM_ prefix
for custom mdev_class.
For example, if you want to provide the nvidia-474 and nvidia-475 mdev
types, label your nodes with the vgpu-type=nvidia-474 and
vgpu-type=nvidia-475 labels and use the following node-specific settings:
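As an illustration of what such node-specific settings should produce, the
effective nova-compute options for each node group would resemble the
following sketch; the custom resource class names repeat the examples used
throughout this section:

# Nodes labeled vgpu-type=nvidia-474
[devices]
enabled_mdev_types = nvidia-474

[mdev_nvidia-474]
mdev_class = CUSTOM_VGPU_A100_1_5C

# Nodes labeled vgpu-type=nvidia-475
[devices]
enabled_mdev_types = nvidia-475

[mdev_nvidia-475]
mdev_class = CUSTOM_VGPU_A100_2_10C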
The configuration above creates corresponding resource providers in the
placement service that provide CUSTOM_VGPU_A100_1_5C or
CUSTOM_VGPU_A100_2_10C resources.
You can use these resources during the definition of flavors for instances
with corresponding vGPU types.
In some cases, you may need to provide different vGPU types from a single
compute node, for example, if the compute node has 2 physical GPUs and
you want to create two different types of vGPU on them. For such scenarios,
you should provide explicit PCI device addresses of these physical GPUs in
the settings. This makes such configuration verbose in heterogeneous hardware
environments where physical GPUs have different PCI addresses on each node.
For example, when targeting node-specific settings by node name:
In the SR-IOV mode, the driver typically creates more virtual functions than
the maximum capacity of the physical GPU, even for the smallest virtual GPU
type. Each virtual function can hold only one single virtual GPU. This leads
to resource over-reporting to the placement service.
Therefore, to ensure efficient resource allocation and utilization within a
homogeneous hardware environment, assuming that each compute node in it has
the same PCI address for the physical GPU and the physical GPU has been
partitioned to the MIG GPU instances identically:
Identify the number of instances created of each MIG profile.
Select random but not overlapping sets of PCI addresses from the list of
virtual functions of the physical GPU. The amount of addresses in each set
must correspond to the number of instances created of each MIG profile.
Assign the mdev type to the selected devices.
For example, for the environment with the following configuration:
3 MIG instances of MIG 1g.5gb and 2 MIG instances of MIG 2g.10gb
16 virtual functions created for the physical GPU with the PCI address range
from 0000:42:00.0 to 0000:42:01.7
Pick 3 and 2 random PCI addresses from that pool and assign them to
CUSTOM_VGPU_A100_1_5C and CUSTOM_VGPU_A100_2_10C mdev classes
respectively:
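For illustration, the resulting nova-compute options could look as follows;
the selected virtual function addresses are arbitrary picks from the example
range above and must be adjusted to your environment:

[devices]
enabled_mdev_types = nvidia-474,nvidia-475

[mdev_nvidia-474]
device_addresses = 0000:42:00.0,0000:42:00.1,0000:42:00.2
mdev_class = CUSTOM_VGPU_A100_1_5C

[mdev_nvidia-475]
device_addresses = 0000:42:00.3,0000:42:00.4
mdev_class = CUSTOM_VGPU_A100_2_10C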
In a heterogeneous hardware environment, use node-specific settings to group
nodes with the same PCI addresses and intended vGPU configuration, or target
the node-specific settings explicitly at each individual node, one node at a
time if needed.
This section provides guidelines for verifying that virtual GPUs are correctly
accounted for in the OpenStack Placement service, ensuring proper scheduling
of instances that utilize virtual GPUs.
Firstly, verify that resource providers have been created with accurate
inventories. For each PCI device associated with a virtual GPU, including
virtual instances in the case of MIG/SR-IOV, there should be a nested resource
provider under the resource provider of the corresponding compute node.
The name of this nested resource provider should follow the format
<node-name>_pci_<pci-address-with-underscores>:
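For example, you can list the nested resource providers with the OpenStack
client, assuming the osc-placement plugin is installed:

openstack resource provider list | grep _pci_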
Also, examine the inventory of each resource provider. It should exclusively
consist of the VGPU resource or any custom resource name configured in
the Compute service settings. The total capacity of the resource should match
the capacity reported by the mdevctl types output, reflecting the
capabilities of the PCI device for the specified mdev class. In the case
of MIG, this total capacity will always be 1.
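For example, to display the inventory of a particular nested resource
provider, pass its UUID obtained from the previous listing; the command
requires the osc-placement plugin for the OpenStack client:

openstack resource provider inventory list <RESOURCE-PROVIDER-UUID>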
This section provides instructions for creating a flavor that requests
a specific virtual GPU resource, using the mdev classes configured in the
Compute service and registered in the placement service.
To create the flavor, use the openstack flavor create command.
Ensure that the flavor properties match the configured mdev classes in the
Compute service. For example, to request one vGPU of type nvidia-474 using
the resource class from the previous examples:
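The command below is an illustrative sketch; the flavor name, RAM, disk, and
vCPU values are placeholders, while the custom resource class repeats the
earlier examples:

openstack flavor create --ram 4096 --disk 20 --vcpus 2 \
  --property resources:CUSTOM_VGPU_A100_1_5C=1 vgpu-a100-1-5c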
Mirantis OpenStack for Kubernetes (MOSK) enables you to perform image
signature verification when booting an OpenStack instance, uploading
a Glance image with signature metadata fields set, and creating a volume
from an image.
To enable signature verification, use the following osdpl definition:
spec:
  features:
    glance:
      signature:
        enabled: true
When enabled during initial deployment, all internal images such as
Amphora, Ironic, and test (CirrOS, Fedora, Ubuntu) images, will be signed by
a self-signed certificate.
Mirantis OpenStack for Kubernetes (MOSK) allows configuring
LoadBalancer for the Designate PowerDNS backend. For example, you can expose a
TCP port for zone transferring using the following exemplary osdpl
definition:
DNS is a mandatory component for a MOSK deployment, and all
records must be created on the customer DNS server. The OpenStack services
are exposed through the Ingress NGINX controller.
Warning
This document describes how to temporarily configure DNS.
The workflow contains non-permanent changes that will be rolled back
during a managed cluster update or reconciliation loop.
Therefore, proceed at your own risk.
To configure DNS to access your OpenStack environment:
Obtain the external IP address of the Ingress service:
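For example, assuming the Ingress service is named ingress and resides in
the openstack namespace:

kubectl -n openstack get services ingress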
Deploy a standalone CoreDNS server by including the following
configuration into coredns.yaml:
apiVersion: lcm.mirantis.com/v1alpha1
kind: HelmBundle
metadata:
  name: coredns
  namespace: osh-system
spec:
  repositories:
  - name: hub_stable
    url: https://charts.helm.sh/stable
  releases:
  - name: coredns
    chart: hub_stable/coredns
    version: 1.8.1
    namespace: coredns
    values:
      image:
        repository: mirantis.azurecr.io/openstack/extra/coredns
        tag: "1.6.9"
      isClusterService: false
      servers:
      - zones:
        - zone: .
          scheme: dns://
          use_tcp: false
        port: 53
        plugins:
        - name: cache
          parameters: 30
        - name: errors
        # Serves a /health endpoint on :8080, required for livenessProbe
        - name: health
        # Serves a /ready endpoint on :8181, required for readinessProbe
        - name: ready
        # Required to query kubernetes API for data
        - name: kubernetes
          parameters: cluster.local
        - name: loadbalance
          parameters: round_robin
        # Serves a /metrics endpoint on :9153, required for serviceMonitor
        - name: prometheus
          parameters: 0.0.0.0:9153
        - name: forward
          parameters: . /etc/resolv.conf
        - name: file
          parameters: /etc/coredns/it.just.works.db it.just.works
      serviceType: LoadBalancer
      zoneFiles:
      - filename: it.just.works.db
        domain: it.just.works
        contents: |
          it.just.works.      IN  SOA  sns.dns.icann.org. noc.dns.icann.org. 2015082541 7200 3600 1209600 3600
          it.just.works.      IN  NS   b.iana-servers.net.
          it.just.works.      IN  NS   a.iana-servers.net.
          it.just.works.      IN  A    1.2.3.4
          *.it.just.works.    IN  A    1.2.3.4
Update the public IP address of the Ingress service:
sed -i 's/1.2.3.4/10.172.1.101/' coredns.yaml
kubectl apply -f coredns.yaml
Point your machine to use the correct DNS. It is 10.172.1.102
in the example system response above.
If you plan to launch Tempest tests or use the OpenStack client from
a keystone-client-XXX pod, verify that the Kubernetes built-in
DNS service is configured to resolve your public FQDN records by
adding your public domain to Corefile. For example,
to add the it.just.works domain:
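The following is a sketch of such an extra server block appended to the
Corefile of the cluster CoreDNS configuration; it assumes 10.172.1.102 is
the address of the standalone CoreDNS service deployed above:

it.just.works:53 {
    errors
    cache 30
    forward . 10.172.1.102
}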
This section explains how to access your OpenStack environment as admin user.
Before you proceed, make sure that you can access the Kubernetes API and have
privileges to read secrets from the openstack-external namespace in
Kubernetes or you are able to exec to the pods in the openstack
namespace.
Access OpenStack using the Kubernetes built-in admin CLI¶
You can use the built-in admin CLI client and execute the openstack
commands from a dedicated pod deployed in the openstack namespace:
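For example, assuming the client pod is managed by the keystone-client
deployment:

kubectl -n openstack exec -it deployment/keystone-client -- openstack server list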
This pod has python-openstackclient and all required plugins already
installed. The python-openstackclient command-line client is configured
to use the admin user credentials. You can view the detailed configuration
for the openstack command in /etc/openstack/clouds.yaml file in
the pod.
Access Horizon through your browser using its public service.
For example, https://horizon.it.just.works.
To log in, specify the user name and domain name obtained in the previous step
from the <ADMIN_USER_NAME> and <ADMIN_USER_DOMAIN> values.
If the OpenStack Identity service has been deployed with the OpenID Connect
integration:
From the Authenticate using drop-down menu, select
OpenID Connect.
Click Connect. You will be redirected to your identity
provider to proceed with the authentication.
Note
If OpenStack has been deployed with self-signed TLS certificates
for public endpoints, you may get a warning about an untrusted
certificate. To proceed, allow the connection.
Access OpenStack through CLI from your local machine¶
To be able to access your OpenStack environment through the CLI,
you need to configure the openstack client environment
using either an openstackrc environment file or clouds.yaml file.
If OpenStack was deployed with self-signed TLS certificates
for public endpoints, you may need to use the openstack
command-line client with certificate validation disabled. For example:
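A sketch using the --insecure option of the OpenStack client; the subcommand
is illustrative:

openstack --insecure endpoint list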
This section provides the general debugging instructions for your OpenStack on
Kubernetes deployment. Start your troubleshooting with the determination of
the failing component that can include the OpenStack Controller (Rockoon),
Helm, a particular pod, or service.
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, all openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality; it only serves as a reminder to use the new
naming for pods and other related artifacts.
The OpenStack Controller (Rockoon) is running in several containers in the
rockoon-xxxx pod in the osh-system namespace. For the full list
of containers and their roles, refer to OpenStack Controller (Rockoon).
To verify the status of the OpenStack Controller, run:
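For example, the pod names can be filtered directly; the exact pod labels
may vary between releases, so a simple name filter is used here:

kubectl -n osh-system get pods | grep rockoon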
This section includes the ways to mitigate the most common issues with the
OsDpl CR. We assume that you have already debugged the Helm releases and
OpenStack Controller to rule out possible failures with these components
as described in Debugging the Helm releases and Debugging the OpenStack Controller.
Possible root cause: MOSK uses the Kubernetes entrypoint
init container to resolve dependencies between objects. If the pod is stuck
in Init:0/X, this pod may be waiting for its dependencies.
Possible root cause: some OpenStack services depend on Ceph. These services
include OpenStack Image, OpenStack Compute, and OpenStack Block Storage.
If the Helm releases for these services are not present, the
openstack-ceph-keys secret may be missing in the openstack-ceph-shared
namespace.
To debug the issue:
Verify that the Ceph Controller has created the openstack-ceph-keys
secret in the openstack-ceph-shared namespace:
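For example:

kubectl -n openstack-ceph-shared get secret openstack-ceph-keys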
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, all openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality; it only serves as a reminder to use the new
naming for pods and other related artifacts.
Support dump described in this section specifically targets OpenStack
components, providing valuable insights for troubleshooting
OpenStack-related problems.
To generate a support dump for your MOSK environment,
use the osctl sos report tool present within the rockoon
image.
This section focuses only on the essential capabilities of the tool.
For all available parameters, consult osctl sos report --help.
The support dump is modular. Each module is responsible for specific
functionality. To enable or disable specific modules during support
dump creation, use the --collector option. If not specified,
all collectors are used.
elastic
Collects logs from StackLight by connecting to the OpenSearch API.
k8s
Collects data about objects from Kubernetes.
nova
Collects metadata associated with the Compute service (OpenStack Nova)
from the OpenStack nodes. This encompasses a wide range of data,
including instance details, general libvirt information, and so on.
neutron
Collects metadata associated with the Networking service (OpenStack
Neutron) from the OpenStack nodes. This encompasses a wide range of
data, including Open vSwitch statistics, list of namespaces, IP address
statistics in namespaces, Open vSwitch flows, and so on.
Given the substantial amount of information, you can manage the components
included in a support dump using the mutually exclusive --component
or --all-components options. Within the elastic collector component,
you can specify which loggers to gather logs for. For example,
the --component nova option restricts log collection to pods related
to Nova, whose names start with nova-* and libvirt-*.
Another filtering criterion involves specifying the host for which you intend
to collect support information. This can be accomplished through the use of
mutually exclusive --host or --all-hosts options. This feature is
particularly valuable for limiting the volume of data included in
the support dump.
For older MOSK versions, to start generating support
dumps, execute the osctl sos commands from a manually started
Docker container on any node of your cluster. For example, to create a generic
report for the Neutron component:
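A sketch of such an invocation; the image reference is a placeholder, and
additional options or volume mounts may be required in your environment to
reach host data and to persist the report outside the container:

docker run --rm -it <ROCKOON-IMAGE> osctl sos report --component neutron --all-hosts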
Before you proceed with the actual Tungsten Fabric (TF) deployment, verify
that your deployment meets the following prerequisites:
Your MOSK OpenStack cluster is deployed as described in
Deploy an OpenStack cluster with the Tungsten Fabric backend
enabled for Neutron using the following structure:
spec:
  features:
    neutron:
      backend: tungstenfabric
Your MOSK OpenStack cluster uses the correct value of
features:neutron:tunnel_interface in the openstackdeployment object.
The TF Operator will consume this value through the shared secret and use
it as a network interface from the underlay network to create encapsulated
tunnels with the tenant networks.
Considerations for tunnel_interface
Plan this interface as a dedicated physical interface for TF overlay
networks. TF uses features:neutron:tunnel_interface to create the
vhost0 virtual interface and transfers the IP configuration from
the tunnel_interface to the virtual one.
Do not use bridges from L2 templates as tunnel_interface.
Such usage might lead to networking performance degradation and data
plane downtime.
The Kubernetes nodes are labeled according to the TF node roles:
Deployment of Tungsten Fabric is managed by the tungstenfabric-operator
Helm resource in a respective ClusterRelease.
To deploy Tungsten Fabric:
Optional. Configure the ASN and encapsulation settings if you need custom
values for these parameters. For configuration details, see Autonomous System Number (ASN).
Configure the TFOperator custom resource according to the needs of your
deployment. For the configuration details, refer to TFOperator custom resource
and API Reference.
Trigger the Tungsten Fabric deployment:
kubectl apply -f tungstenfabric.yaml
Verify that Tungsten Fabric has been successfully deployed:
kubectl get pods -n tf
The successfully deployed TF services should appear in the Running status
in the system response.
If you have enabled StackLight, enable Tungsten Fabric monitoring by setting
tungstenFabricMonitoring.enabled to true as described in
StackLight configuration procedure.
Since MOSK 23.1,
tungstenFabricMonitoring.enabled is enabled by default during
the Tungsten Fabric deployment. Therefore, skip this step.
This section includes configuration information for available advanced
Mirantis OpenStack for Kubernetes features that include SR-IOV and DPDK
with the Neutron Tungsten Fabric backend.
Enable huge pages for OpenStack with Tungsten Fabric¶
Note
The instruction provided in this section applies to both
OpenStack with OVS and OpenStack with Tungsten Fabric topologies.
The huge pages OpenStack feature provides essential performance improvements
for applications that are highly memory IO-bound. Huge pages should be enabled
on a per compute node basis. By default, NUMATopologyFilter is enabled.
To activate the feature, you need to enable huge pages on the dedicated
bare metal host as described in Enable huge pages in a host profile during
the predeployment bare metal configuration.
Note
The multi-size huge pages are not fully supported by Kubernetes
versions before 1.19. Therefore, define only one size in kernel parameters.
Verify that DPDK NICs are not used on the host operating system.
Note
For use in the Linux user space, DPDK NICs are bound to
specific Linux drivers required by PMDs. As a result,
the bound NICs are not available for use by standard Linux network
utilities. Therefore, allocate a dedicated NIC(s) for the vRouter
deployment in DPDK mode.
This section instructs you on how to enable SR-IOV with the Neutron Tungsten
Fabric (TF) backend.
To enable SR-IOV for TF:
Verify that your deployment meets the following requirements:
NICs with the SR-IOV support are installed
SR-IOV and VT-d are enabled in BIOS
Enable IOMMU in the kernel by configuring intel_iommu=on in the GRUB
configuration file. Specify the parameter for compute nodes in
BareMetalHostProfile in the grubConfig section:
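A sketch of the relevant BareMetalHostProfile fragment; the surrounding
profile fields are omitted, and the exact grubConfig layout should be
checked against the Container Cloud documentation for your version:

spec:
  grubConfig:
    defaultGrubOptions:
      - GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT intel_iommu=on"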
After the OpenStackDeployment CR modification,
the TF Operator generates a separate vRouter DaemonSet with specified
settings. The tf-vrouter-agent-<XXXXX> pods will be automatically
restarted on the affected nodes causing the network services
interruption on virtual machines running on these hosts.
Optional. To modify a vRouter DaemonSet according to
the SR-IOV definition in the OpenStackDeployment
CR, add vRouter custom specs to the TF Operator CR with
the node label specified in the OpenStackDeployment CR.
For example:
Tungsten Fabric MOSK deployments use six workers of the
contrail-api service by default. This section instructs you on how to
change the default configuration if needed.
To configure the number of Contrail API workers on a TF deployment:
Specify the required number of workers in the TFOperator custom
resource:
Verify that the ps output lists one API process with PID "1"
and the number of workers set in the TFOperator custom resource.
In /etc/contrail/, verify that the number of configuration
files contrail-api-X.conf matches the number of workers set in the
TFOperator custom resource.
By default, analytics services are part of basic
setups for Tungsten Fabric deployments. To obtain a more lightweight setup,
you can disable these services through the custom resource of the Tungsten
Fabric Operator.
Warning
Disabling of the Tungsten Fabric analytics services requires
restart of the data plane services for existing environments and must be
planned in advance. While calculating the maintenance window for this
operation, take into account the deletion of the analytics DaemonSets
and automatic restart of the tf-config, tf-control, and
tf-webui pods.
Clean up the Kubernetes resources. To free up the space that has been used
by Cassandra, ZooKeeper, and Kafka analytics storage, manually delete the
related PVC:
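For example, list the PVCs in the Tungsten Fabric namespace and delete the
analytics-related ones; the PVC names are environment-specific placeholders:

kubectl -n tf get pvc
kubectl -n tf delete pvc <ANALYTICS-PVC-NAME>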
Delete terminated nodes from the Tungsten Fabric configuration through
the Tungsten Fabric web UI:
Caution
With disabled Tungsten Fabric analytics, the Tungsten Fabric
web UI may not work properly.
Log in to the Tungsten Fabric web UI.
On Configure > Infrastructure > Nodes > Analytics Nodes,
delete all terminated analytics nodes.
On Configure > Infrastructure > Nodes > Database Analytics
Nodes, delete all terminated database analytics nodes.
Depending on the MOSK version, proceed accordingly:
MOSK 24.1
Disable monitoring of the Tungsten Fabric analytics services in
StackLight by setting the following parameter in StackLight values
of the Cluster object to false:
tungstenFabricMonitoring:
  analyticsEnabled: false
When done, the monitoring of the Tungsten Fabric analytics components
will become disabled and Kafka alerts along with the Kafka dashboard
will disappear from StackLight.
Since MOSK 24.2
The tungstenFabricMonitoring.analyticsEnabled setting is
automatically configured based on the state of the Tungsten Fabric
analytics services, which are enabled or disabled.
However, you can still override this setting. If set manually,
the configuration overrides the default behavior and does not reflect
the actual state of Tungsten Fabric analytics.
Now, with the Tungsten Fabric analytics services successfully disabled, you
have optimized resource utilization and system performance. While these
services are deactivated, related alerts may still be present in StackLight.
However, do not consider such alerts as indicative of the actual status
of the analytics services.
After the TF custom resource modification, the pods
related to the affected services will be restarted. This rule does not
apply to the tf-vrouter-agent-<XXXXX> pods as their update strategy
differs. Therefore, if you enable the debug logging for the services in
a tf-vrouter-agent-<XXXXX> pod, restart this pod manually after you
modify the custom resource.
Troubleshoot access to the Tungsten Fabric web UI¶
If you cannot access the Tungsten Fabric (TF) web UI service, verify that
the FQDN of the TF web UI is resolvable on your PC by running one of the
following commands:
host tf-webui.it.just.works
# or
ping tf-webui.it.just.works
# or
dig host tf-webui.it.just.works
All commands above should resolve the web UI domain name to the IP address
that should match the EXTERNAL-IPs subnet dedicated to Kubernetes.
If the TF web UI domain name has not been resolved to the IP address, your PC
is using a different DNS or the DNS does not contain the record for the TF web
UI service. To resolve the issue, define the IP address of the Ingress service
from the openstack namespace of Kubernetes in the hosts file of your
machine. To obtain the Ingress IP address:
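For example, assuming the service is named ingress:

kubectl -n openstack get svc ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}'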
In the following cases, a TCP-based service may not work on VMs:
If the setup has nested VMs.
If VMs are running in the ESXi hypervisor.
If the Network Interface Cards (NICs) do not support the IP checksum
calculation and generate an incorrect checksum. For example, the Broadcom
Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe NIC cards.
To resolve the issue, disable the transmit (TX) offloading on all OpenStack
compute nodes for the affected NIC used by the vRouter as described below.
To identify the issue:
Verify whether ping is working between VMs on different
hypervisor hosts and the TCP services are working.
Run the following command for the vRouter Agent and verify whether the
output includes the number of Checksum errors:
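A sketch of such a check; it assumes the dropstats utility is available
inside the vRouter agent container of your release:

kubectl -n tf exec -it tf-vrouter-agent-<XXXXX> -- dropstats | grep -i checksum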
Once you modify the TFOperator CR, the
tf-vrouter-agent-<XXXXX> pods will not restart automatically because
they use the OnDelete update strategy. Restart such pods manually,
considering that the vRouter pods restart causes network services
interruption for the VMs hosted on the affected nodes.
To disable TX offloading on a specific subset of nodes, use custom
vRouter settings. For details, see Custom vRouter settings.
Warning
Once you add a new CustomSpec, a new daemon set will be
generated and the tf-vrouter-agent-<XXXXX> pods will be automatically
restarted. The vRouter pods restart causes network services interruption
for VMs hosted on the affected node. Therefore, plan this procedure
accordingly.
This guide outlines the post-deployment Day-2 operations for a Mirantis
OpenStack for Kubernetes environment. It describes how to configure and manage
the MOSK components, perform different types of cloud
verification, and enable additional features depending on your cloud needs.
The guide also contains day-to-day maintenance procedures such as how to back
up and restore, update and upgrade, or troubleshoot your
MOSK cluster.
Updating a MOSK cluster ensures that the system remains
secure, efficient, and up-to-date with the latest features and performance
improvements, as well as receives fixes for the known CVEs. This section
provides comprehensive details and step-by-step procedures to guide you
through the process of updating your cluster.
This section describes the workflow you as a cloud operator need to follow
to correctly update your Mirantis OpenStack for Kubernetes (MOSK)
cluster to a major release version.
Note
This guide applies to the clusters running
MOSK 23.1 and above. If you have an
older version and are looking to update, contact Mirantis support
to get instructions valid for your cluster.
The instructions below are generic and apply to any MOSK
cluster regardless of its configuration specifics. However, every major
release may have its own update peculiarities. Therefore, to accurately plan
and successfully perform an update, in addition to this document,
read the update-related section in the
Release Notes of the target MOSK version.
Depending on the payload of a target release, the update mechanism
can perform the changes on different levels of the stack, from the
configuration of the host operating system to the code of OpenStack itself.
The update mechanism is designed to avoid the impact on the workloads and cloud
users as much as possible. The life-cycle management logic minimizes
the downtime for the cloud API by means of smart management of the
cluster components under the hood and only requests your involvement when a
human decision is required to proceed.
Though the update mechanism may change the internal components of the cluster,
it will always preserve the major versions of OpenStack, that is, the APIs that
cloud users and workloads deal with. After the cluster is successfully updated,
you can initiate a separate upgrade procedure to obtain the latest supported
OpenStack version.
Before starting an update, we recommend that you closely peruse the Release
Compatibility Matrix document and Release notes of the target release, as well
as thoroughly plan maintenance windows for each update phase depending on the
configuration of your cluster.
Current Mirantis Container Cloud software version and the need to first
update to the latest cluster release version
Update notes provided in the Release notes for the target
MOSK version
New product features that will get enabled in your cloud by default
New product features that may have already been configured in your cloud
as customizations and now need to be properly re-enabled to be eligible for
further support
Any changes in the behavior of the product features enabled in your cloud
List of the addressed and known issues in the target MOSK
version
Warning
If your cloud configuration is known to have any custom
configuration that was not explicitly approved by Mirantis,
make sure to bring this up with your dedicated Mirantis
representative before proceeding with the update. Mirantis
cannot guarantee the safe updating of a customized cloud.
Depending on the payload brought by a particular target release, a generic
cluster update includes from three to six major phases.
The first three phases are present in any update. They focus on the
containerized components of the software stack and have minimal impact on the
cloud users and workloads.
The remaining phases are only present if any changes need to be made to the
foundation layers: the underlay Kubernetes cluster and host
operating system. For the changes to take effect, you may need to reboot the
cluster nodes. This procedure imposes a severe impact on
cloud workloads and, therefore, needs to be thoroughly planned across
several sequential maintenance windows.
Important
To effectively plan a cluster update, keep in mind the
architecture of your specific cloud. Depending on the selected design, the
components of a MOSK cluster may have different
distribution across the nodes (physical servers) comprising the underlay
bare metal Kubernetes cluster. The more components are collocated on a
single node, the harder the impact on the functions of the cloud when the
changes are applied.
The tables below will help you to plan your cluster update and include the
following information for each mandatory and additional update phase:
What happens during the phase
Includes the phase milestones. The nature of changes that are going to be
applied is important to understand in order to estimate the exact impact
the update is going to have on your cluster.
Consult the Update notes section of the target MOSK
release for the detailed information about the changes it brings and the
impact these changes are going to imply when getting applied to your
cluster.
Impact
Describes possible impact on cloud users and workloads.
The provided information about the impact represents the worst-case scenario
in the cluster architectures that imply a combination of several roles on
the same physical servers, such as hyper-converged compute nodes and
clusters with a compact control plane.
The impact estimation presumes that your cluster uses one of the standard
architectures provided by the product and follows Mirantis design
guidelines.
Time to complete
Provides a rough estimation of the time required to complete the phase.
The estimates for a phase timeline presume that your cluster uses one
of the standard architectures provided by the product and follows
Mirantis design guidelines.
Warning
During the update, try to prevent users from performing write
operations on the cloud resources. Any intensive manipulations may lead to
workload corruption.
Phase 1: Life-cycle management modules update
Important
This phase is mandatory. It is always present in the update flow
regardless of the contents of the target release.
New versions of OpenStack and Tungsten Fabric container images
downloaded, services restarted sequentially.
Impact
Some of the running cloud operations may fail over
the course of the phase due to minor unavailability of the cloud
API.
Workloads may experience temporary loss of the North-South
connectivity in the clusters with Open vSwitch networking backend.
The downtime depends on the type of virtual routers in use.
Time to complete
20 minutes per network gateway node (Open vSwitch)
5 minutes for a Tungsten Fabric cluster
15 minutes per compute node
Phase 3: Ceph cluster update and upgrade
Important
This phase is mandatory. It is always present in the update flow
regardless of the contents of the target release.
New versions of Ceph components downloaded, services restarted.
If applicable, Ceph switched to the latest major version.
Impact
Workloads may experience IO performance degradation for the virtual
storage devices backed by Ceph.
Time to complete
The update of a Ceph cluster with 30 storage nodes can take up to 35
minutes. Additionally, 15 minutes are required for the major Ceph
version upgrade, if any.
Phase 4a: Host operating system update on Kubernetes master nodes
Important
This phase is optional. The presence of this phase
in the update flow depends on the contents of
the target release.
Host operating system update on Kubernetes master nodes¶
What happens during the phase
New system packages downloaded and installed on the host operating
system, other major changes get applied.
Impact
None
Time to complete
The nodes are updated sequentially. Up to 15 minutes per node.
Phase 4b: Kubernetes components update on Kubernetes master nodes
Important
This phase is optional. The presence of this phase
in the update flow depends on the contents of
the target release.
Kubernetes cluster update on Kubernetes master nodes¶
What happens during the phase
New versions of Kubernetes control plane components downloaded and
installed.
Impact
For clusters with the compact control plane, some of the running
cloud operations may fail over the course of the phase due to minor
unavailability of the cloud API.
For the compact control plane with gateway nodes collocated
(Open vSwitch networking backend), workloads can experience
temporary loss of the North-South connectivity. The downtime
depends on the type of virtual routers in use.
Time to complete
Up to 40 minutes total
Phases 5a and 5b: Host operating system and Kubernetes cluster
update on Kubernetes worker nodes
Important
Both phases, 5a and 5b, are applied together, either node
by node (default) or to several nodes in parallel. The parallel updating
is available since 23.1.
Take this into consideration when estimating the impact and planning the
maintenance window.
Loss of connectivity to the volumes for the nodes hosting LVM
with iSCSI volumes.
For dedicated control plane nodes, some of the
running cloud operations may fail over the course of the phase
due to minor unavailability of the cloud API.
For dedicated gateway nodes (Open vSwitch), workloads can
experience minor loss of the North-South connectivity.
For compute nodes, there can be up to 5 minute downtime on the
network connectivity for the workloads running on the node,
due to the restart of the containers hosting the components
of the cloud data plane.
For clusters running MOSK 24.1.2 and above, the
downtime is up to 2 minutes per node.
Time to complete
By default, the nodes are updated sequentially as follows:
For the host operating system update, up to 15 minutes per node.
For the Kubernetes cluster update, up to 40 minutes per node.
For MOSK 23.1 to 23.2 and newer releases, you can
reduce update time by enabling parallel node update. The procedure
is described further in the Enable parallel update of Kubernetes worker nodes
subsection.
Phase 6: Cluster nodes reboot
Important
This phase is optional. The presence of this phase
in the update flow depends on the contents of
the target release.
Important
An update to a newer MOSK version may require reboot
of the cluster nodes for changes to take effect. Although you can decide
when to restart each particular node, an update cannot be
considered complete until all of the nodes get restarted.
Optional. You configure an instance migration policy.
You initiate the node reboot.
The node is gracefully restarted with automatic or manual
migration of cloud workloads running on it.
Impact
For the storage nodes:
No impact on the nodes hosting the Ceph cluster data
Loss of connectivity to the volumes for the nodes hosting LVM
with iSCSI volumes
For the control plane nodes, some of the
running cloud operations may fail over the course of the phase
due to minor unavailability of the cloud API.
For the network gateway nodes (Open vSwitch), workloads can
experience minor loss of the North-South connectivity depending on
the type of virtual routers in use.
Step 1. Verify that the Container Cloud management cluster is up-to-date¶
MOSK relies on Mirantis Container Cloud to manage the
underlying software stack for a cluster, as well as to deliver updates
for all the components.
Since every MOSK release is tightly coupled with a
Container Cloud release, a MOSK cluster update becomes
possible once the management cluster is known to run the latest Container Cloud
version. The management cluster periodically verifies public Mirantis
repositories and updates itself automatically when a newer version becomes
available. Having any of the managed clusters, including
MOSK, running outdated Container Cloud version will
prevent the management cluster from automatic self-update.
To identify the current version of the Container Cloud software your management
cluster is running, refer to the Container Cloud web UI. You can also verify
your management cluster status using CLI as described in
Verify the management cluster status before MOSK update.
During an update of a MOSK cluster, numerous alerts may be
seen in StackLight. This is expected behavior. Therefore, ignore or temporarily
mute the alerts as described in Silence alerts.
Caution
During update, the false positive CalicoDataplaneFailuresHigh
alert may be firing. Disregard this alert, which will disappear once the
update succeeds.
The observed behavior is typical for calico-node during upgrades,
as workload changes occur frequently. Consequently, there is a possibility
of temporary desynchronization in the Calico dataplane. This can
occasionally result in throttling when applying workload changes to the
Calico dataplane.
If you update MOSK to 23.1, verify that the
KaaSCephCluster custom resource does not contain the following entries.
If they exist, remove them.
In the spec.cephClusterSpec section, the external section.
In the spec.cephClusterSpec.rookConfig section, the ms_crc_data or
ms_crc_header configuration key. After you remove the key, wait for
rook-ceph-mon pods to restart on the MOSK
cluster.
Enable parallel update of Kubernetes worker nodes¶
Optional. Starting from MOSK 23.1 to 23.2 update, you can
enable and configure parallel node update to reduce update time and minimize
downtime:
To enable parallel update of Kubernetes worker nodes, set the
spec.providerSpec.value.maxWorkerUpgradeCount configuration parameter
in the Mirantis Container Cloud management cluster as described in
conf-upd-count.
Consider the specifics of handling of parallel node updates by OpenStack,
Ceph, and Tungsten Fabric Controllers to properly plan the maintenance
window. For handling details and possible configuration, refer to
Parallelizing node update operations.
Optional. Starting from MOSK 24.3, you can enable automatic
node reboot of an update group, which contains a set of controller or worker
machines. This option applies when a Cluster release update requires node
reboot, for example, when kernel version update is available in the target
Cluster release. The option reduces manual intervention and overall downtime
during cluster update.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, find the managed MOSK
cluster.
Click the More action icon to see whether a new release is
available. If that is the case, click Update cluster.
In the Release Update window, select the required Cluster
release to update your managed cluster to.
The Description section contains the list of components
versions to be installed with a new Cluster release.
Click Update.
Before the cluster update starts, Container Cloud performs
a backup of MKE and Docker Swarm. The backup directory is located
under:
/srv/backup/swarm on every Container Cloud node for Docker Swarm
/srv/backup/ucp on one of the controller nodes for MKE
To view the update status through the Container Cloud web UI, navigate to
the Clusters page. Once the orange blinking dot next to the
cluster name disappears, the cluster update is complete.
Also, you can see the general status of each node during the update on the
Container Cloud cluster view page.
The whole update process is controlled by lcm-controller, which runs in
the kaas namespace of the Container Cloud management cluster. Follow
its logs to watch the progress of the update, discover, and debug any issues.
Watch the state of the cluster and nodes update through the CLI¶
The lcmclusterstate and lcmmachines objects in the mos namespace
of the Container Cloud management cluster provide detailed information about
the current phase of the update process in the context of the managed cluster
overall as well as specific nodes.
The lcmmachine object being in the Ready state indicates that a node
has been successfully updated.
To display the detailed view of the cluster update state, run:
Step 4. Reboot the nodes with optional instance migration¶
Depending on the target release content, you may need to reboot the cluster
nodes for the changes to take effect. Running a MOSK cluster
in a semi-updated state for an extended period may result in unpredictable
behavior of the cloud and impact users and workloads. Therefore, when it is
required, you need to reboot the cluster nodes as soon as possible to avoid
potential risks.
Note
If you enabled rebootIfUpdateRequires as described in
Enable automatic node reboot in update groups, nodes will be automatically rebooted in update
groups during a Cluster release update that requires a reboot, for example,
when kernel version update is available in the target Cluster release.
For a distribution upgrade, continue reading the following subsections.
Verify the YAML definitions of the LCMMachine and Machine objects.
The node must be rebooted if the rebootRequired flag is set to true.
In addition, objects explicitly specify the reason for rebooting. For example:
The LCMMachine object of the node that requires rebooting:
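For example, you can quickly inspect the reboot-related fields of the
LCMMachine object from the management cluster; the namespace and machine
name are placeholders:

kubectl -n <PROJECT-NAMESPACE> get lcmmachine <MACHINE-NAME> -o yaml | grep -i -A 3 reboot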
Since MOSK 23.1, you can also use the Mirantis
Container Cloud web UI to identify the nodes requiring reboot:
In the Clusters tab, click the required cluster name. The
page with Machines opens.
Hover over the status of every machine. A machine to reboot contains
the Reboot > The machine requires a reboot notification in
the Status tooltip.
Configure instance migration policy for cluster nodes¶
Restarting the cluster causes downtime of the cloud services running on the
nodes. While the MOSK control plane is built for high
availability and can tolerate temporary loss of at least 1/3 of services
without a significant impact on user experience, rebooting nodes that host the
elements of cloud data plane, such as network gateway nodes and compute nodes,
has a detrimental effect on the cloud workloads, if not performed gracefully.
To configure the instance migration policy:
Edit the target compute node resource. For example:
To mitigate the potential impact on the cloud workloads, define
the migration mode and the number of attempts the OpenStack Controller
should make to migrate a single instance running on it:
instance_migration_mode
Defines the instance migration mode for the host.
The list of available options includes:
live: the OpenStack Controller live-migrates instances
automatically. The update mechanism tries to move the memory
and local storage of all instances on the node to another node
without interruption before applying any changes to the node.
By default, the update mechanism makes three attempts to migrate
each instance before falling back to the manual mode.
manual: the OpenStack Controller waits for the Operator to
migrate instances from the host. When it is time to update the host,
the update mechanism asks you to manually migrate the instances and
proceeds only once you confirm the node is safe to update.
skip: the OpenStack Controller skips the instance check on the
node and reboots it.
instance_migration_attempts
Defaults to 3.
Defines the number of times the OpenStack Controller attempts
to live-migrate a single instance before falling back to the
manual mode.
Success of live migration depends on many factors including
the selected vCPU type and model, the amount of data that needs
to be transferred, the intensity of the disk IO and memory writes,
the type of the local storage, and others. Instances using
the following product features are known to have issues with
live migration:
LVM-based ephemeral storage with and without encryption
For the clouds relying on the converged LVM with iSCSI block
storage that offer persistent volumes in a remote edge sub-region,
it is important to keep in mind that applying a major change to a
compute node may impact not only the instances running on this node
but also the instances attached to the LVM devices hosted there.
Mirantis recommends that in such environments you perform the update
procedure in the manual mode with mitigation measures taken by
the Operator for each compute node. Otherwise, all the instances that
have LVM with iSCSI volumes attached would need reboot to restore
connectivity.
Configuration example that sets the instance migration mode to live
and the number of attempts to live-migrate to 5:
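The sketch below assumes the OpenStack Controller reads these settings from
the openstack.lcm.mirantis.com annotations on the Kubernetes Node object;
the node name is a placeholder:

apiVersion: v1
kind: Node
metadata:
  name: <COMPUTE-NODE-NAME>
  annotations:
    openstack.lcm.mirantis.com/instance_migration_mode: live
    openstack.lcm.mirantis.com/instance_migration_attempts: "5"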
If needed, as a cloud user, mark the instances that require individual
handling during instance migration using the
openstack.lcm.mirantis.com:maintenance_action=<ACTION-TAG> server tag.
For details, refer to Configure per-instance migration mode.
Since MOSK 23.1, you can reboot several cluster nodes
in one go by using the Graceful reboot mechanism
provided by Mirantis Container Cloud. The mechanism restarts the selected nodes
one by one, honoring the instance migration policies.
For older versions of MOSK, you need to reboot each node
manually as follows:
When a node that has a manual instance migration policy is ready to be
restarted, the life-cycle management mechanism notifies you about that by
creating a NodeMaintenanceRequest object for the node and setting the
active status attribute for the corresponding NodeWorkloadLock object.
Note
Verify the status:errorMessage attribute before proceeding.
To view the NodeWorkloadLock objects details for a specific node, run:
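For example, list the objects first and then display the one related to the
node; the object name is a placeholder, and the commands assume the
NodeWorkloadLock CRD is available on the cluster you are connected to:

kubectl get nodeworkloadlocks
kubectl get nodeworkloadlocks <NODEWORKLOADLOCK-NAME> -o yaml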
If an update phase takes significantly longer than expected according
to the tables included in Plan the cluster update, you should consider
the update process hung.
If you observe errors that are not described explicitly in the documentation,
immediately contact Mirantis support.
To see any issues that might have occurred during the update, verify the logs
of the lcm-controller pods in the kaas namespace of the Container Cloud
management cluster.
Patch releases aim to significantly shorten the cycle of CVE fixes delivery
onto your MOSK deployments to help you avoid cyber
threats and data breaches.
Your management bare-metal cluster obtains patch releases automatically
the same way as major releases. A new patch MOSK release
version becomes available through the Container Cloud web UI after the
automatic upgrade of the management cluster.
It is not possible to update between the patch releases that belong to
different release series in one go. For example, you can update from
MOSK 23.1.1 to 23.1.2, but you cannot immediately update
from MOSK 23.1.x to 23.2.x because you need to update to
the major MOSK 23.2 release first.
Caution
If you delay the Container Cloud upgrade and schedule it at a
later time as described in Schedule Mirantis Container Cloud updates, make sure to
schedule a longer maintenance window as the upgrade queue can include
several patch releases along with the major release upgrade.
Read the Update notes part of the target MOSK
release notes to understand the changes it brings and the impact these
changes are going to have on your cloud users and workloads.
The application of the patch releases may not require the cluster nodes
reboot. However, your cluster can contain nodes that require reboot after
the last update to a major release, and this requirement will remain after
update to any of the following patch releases. Therefore, Mirantis strongly
recommends that you determine if there are such nodes in your cluster
before you update to the next patch release and reboot them if any, as
described in Step 4. Reboot the nodes with optional instance migration.
For some MOSK versions, applying a patch release may require
restart of the containers that host the elements of the cloud data plane. In
case of Open vSwitch-based clusters, this may result in up to 5 minute downtime
of workload network connectivity for each compute node.
For MOSK prior to 24.1 series, you can determine whether
applying a patch release is going to require the restart of the data plane by
consulting the Release artifacts part of the release notes of the current
and target MOSK releases.
The data plane restart will only happen if there are new versions of the
container images related to the data plane.
It is possible to avoid the downtime for the cloud data by
explicitly pinning the image versions of the following components:
Open vSwitch
Kubernetes entrypoint
However, pinning these images will result in the cloud data plane not receiving
any security or bugfixes during the update.
To pin the images:
Depending on the proxy configuration, the image base URL differs.
To obtain the list of currently used images on the cluster, run:
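For example, to extract the unique image references used by pods in the
openstack namespace and filter the relevant ones:

kubectl -n openstack get pods -o jsonpath='{.items[*].spec.containers[*].image}' | \
  tr ' ' '\n' | sort -u | grep -E 'openvswitch|kubernetes-entrypoint'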
Add the openvswitch and kubernetes-entrypoint images used on your
cluster:
Since MOSK 25.1
Create a ConfigMap in the openstack namespace with the following
content, replacing <OPENSTACKDEPLOYMENT-NAME> with the name of your
OpenStackDeployment custom resource:
Since Container Cloud 2.26.1 (patch Cluster releases 17.1.1 and
16.1.1), the update of Ubuntu packages with kernel minor version update
may apply in certain releases.
In this case, cordon-drain and reboot of machines do not apply
automatically, and all machines have the Reboot is required
notification after the cluster update. You can manually handle the reboot
of machines during a convenient maintenance window as described in
Perform a graceful reboot of a cluster.
Compare the output obtained in the previous step with the output from the
first step. The Cluster releases must match. If this is not the case,
contact Mirantis support for further details.
You can define the upgrade sequence for existing machines to allow prioritized
machines to be upgraded first during a cluster update.
Consider the following upgrade index specifics:
The first machine to upgrade is always one of the control plane machines
with the lowest upgradeIndex. Other control plane machines are upgraded
one by one according to their upgrade indexes.
If the Cluster spec dedicatedControlPlane field is false, worker
machines are upgraded only after the upgrade of all control plane machines
finishes. Otherwise, they are upgraded after the first control plane
machine, concurrently with other control plane machines.
If several machines have the same upgrade index, they have the same priority
during upgrade.
If the value is not set, the machine is automatically assigned a value
of the upgrade index.
To define the upgrade order of an existing machine:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name.
The cluster page with the Machines list opens.
In one of the Unassigned machines settings menu, select
Change upgrade index.
In the Configure Upgrade Priority window that opens, use the
Up and Down arrows in the Upgrade Index
field to configure the upgrade sequence of a machine.
Click Update to apply changes.
Using the Pool info or Machine info options in the
machine settings menu, verify that the Upgrade Priority Index
contains the updated value.
By default, worker machines are upgraded sequentially, which includes node
draining, software upgrade, services restart, and so on. However,
MOSK enables you to parallelize node upgrade operations,
significantly improving the efficiency of your deployment, especially on
large clusters.
Configure the parallel update of worker nodes using web UI¶
Available since MCC 2.25.0 (17.0.0 and 16.0.0)
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name.
The cluster page with the Machines list opens.
On the Clusters page, click the More
action icon in the last column of the required cluster and
select Configure cluster.
In General Settings of the Configure cluster window,
define the following parameters:
Parallel Upgrade Of Worker Machines
The maximum number of the worker nodes to update simultaneously. It serves as
an upper limit on the number of machines that are drained at a given moment
of time. Defaults to 1.
You can configure this option after deployment before the cluster update.
Parallel Preparation For Upgrade Of Worker Machines
The maximum number of worker nodes being prepared at a given moment of time,
which includes downloading of new artifacts. It serves as a limit for the
network load that can occur when downloading the files to the nodes.
Defaults to 50.
Configure the parallel update of worker nodes using CLI¶
spec.providerSpec.maxWorkerUpgradeCount
Default: 1
The maximum number of the worker nodes to update simultaneously.
It serves as an upper limit on the number of machines that are
drained at a given moment of time.
Caution
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0), maxWorkerUpgradeCount is deprecated and will be removed
in one of the following releases. Use the concurrentUpdates
parameter in the UpdateGroup object instead. For details, see
Create update groups for worker machines.
spec.providerSpec.maxWorkerPrepareCount
Default: 50
The maximum number of workers being prepared at a given moment of time,
which includes downloading of new artifacts.
It serves as a limit for the network load that can occur when downloading
the files to the nodes.
The use of update groups provides enhanced control over update of worker
machines by allowing granular concurrency settings for specific machine groups.
This feature uses the UpdateGroup object to decouple the concurrency
settings from the global cluster level, providing flexibility based on the
workload characteristics of different machine sets.
The UpdateGroup objects are processed sequentially based on their indexes.
Update groups with the same indexes are processed concurrently. The control
update group is always processed first.
Note
The update order of a machine within the same group is determined by
the upgrade index of a specific machine. For details, see
Change the upgrade order of a machine.
The maxWorkerUpgradeCount parameter of the Cluster object is inherited
by the default update group. Changing maxWorkerUpgradeCount leads to
changing the concurrentUpdates parameter of the default update group.
Note
The maxWorkerUpgradeCount parameter of the Cluster object is
deprecated and will be removed in one of the following Container Cloud
releases. You can still use this parameter to change the
concurrentUpdates value of the default update group. However, Mirantis
recommends changing this value directly in the UpdateGroup object.
Available since MCC 2.28.0 (17.3.0 and 16.3.0) TechPreview
The update group for controller nodes is automatically generated during initial
cluster creation with the following settings:
name: <cluster-name>-control
index: 1
concurrentUpdates: 1
rebootIfUpdateRequires: false
Caution
During a distribution upgrade, machines are always rebooted,
overriding rebootIfUpdateRequires: false.
All control plane machines are automatically assigned to the update group for
controller nodes with no possibility to change it.
Note
On existing clusters created before Container Cloud 2.28.0 (Cluster
releases 17.2.0, 16.2.0, or earlier), the update group for controller nodes
is created after Container Cloud upgrade to 2.28.0 (Cluster release 16.3.0)
on the management cluster.
Caution
The index and concurrentUpdates parameters of the update
group for controller nodes are hardcoded and cannot be changed.
The default update group is automatically created during initial cluster
creation with the following settings:
name: <cluster-name>-default
index: 1
rebootIfUpdateRequires: false
concurrentUpdates: inherited from the maxWorkerUpgradeCount parameter
set in the Cluster object
Caution
During a distribution upgrade, machines are always rebooted,
overriding rebootIfUpdateRequires: false.
Note
On existing clusters created before Container Cloud 2.27.0 (Cluster
releases 17.1.0, 16.1.0, or earlier), the default update group is created
after Container Cloud upgrade to 2.27.0 (Cluster release 16.2.0) on the
management cluster.
To change the update group of a machine, update the
kaas.mirantis.com/update-group label of the machine with the new update
group name. Removing this label from a machine automatically assigns such
machine to the default update group.
Note
After creation of a custom UpdateGroup object, if you plan to
add a new machine that requires a non-default update group, manually add
the corresponding label to the machine as described above. Otherwise, the
default update group is applied to such machine.
Note
Before removing the UpdateGroup object, reassign all machines to
another update group.
Granularly update a managed cluster using the ClusterUpdatePlan object¶
Available since MCC 2.27.0 (17.2.0) TechPreview
You can control the process of a managed cluster update by manually launching
update stages using the ClusterUpdatePlan custom resource. Between the
update stages, a cluster remains functional from the perspective of cloud
users and workloads.
A ClusterUpdatePlan object provides the following functionality:
The object is automatically created by the bare metal provider when a new
Cluster release becomes available for your cluster.
The object is created in the management cluster for the same namespace that
the corresponding managed cluster refers to.
The object contains a list of self-descriptive update steps that
are cluster-specific. These steps are defined in the spec section of the
object with information about their impact on the cluster.
The object starts cluster update when the operator manually changes the
commence field of the first update step to true. All steps have the
commence flag initially set to false so that the operator can decide
when to pause or resume the update process.
The object has the following naming convention:
<managedClusterName>-<targetClusterReleaseVersion>.
Since Container Cloud 2.28.0 (Cluster release 17.3.0), the object contains
several StackLight alerts to notify the operator about the update progress
and potential update issues. For details, see
StackLight alerts: Container Cloud.
Optional. Available since Container Cloud 2.29.0 (Cluster release 17.4.0)
as Technology Preview. Enable update auto-pause to be triggered by specific
StackLight alerts. For details, see Configure update auto-pause.
Open the ClusterUpdatePlan object for editing.
Start cluster update by changing the spec:steps:commence field of the
first update step to true.
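As a sketch, assuming the object name follows the <managedClusterName>-<targetClusterReleaseVersion> convention described above and that the custom resource is addressable by its lowercase kind (verify the resource name in your cluster), the first step can be commenced as follows:
kubectl -n <cluster-namespace> edit clusterupdateplan <managedClusterName>-<targetClusterReleaseVersion>
# In the editor, set spec.steps[0].commence to true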
Once done, the following actions are applied to the cluster:
The Cluster release in the corresponding Cluster spec is changed
to the target Cluster version defined in the ClusterUpdatePlan spec.
The cluster update starts and pauses before the next update step with
commence: false set in the ClusterUpdatePlan spec.
Caution
Cancelling an already started update step is not supported.
The following example illustrates the ClusterUpdatePlan object of a
MOSK cluster update that has completed:
Example of a completed ClusterUpdatePlan object
apiVersion: kaas.mirantis.com/v1alpha1
kind: ClusterUpdatePlan
metadata:
  creationTimestamp: "2025-02-06T16:53:51Z"
  generation: 11
  name: mosk-17.4.0
  namespace: child
  resourceVersion: "6072567"
  uid: 82c072be-1dc5-43dd-b8cf-bc643206d563
spec:
  cluster: mosk
  releaseNotes: https://docs.mirantis.com/mosk/latest/25.1-series.html
  source: mosk-17-3-0-24-3
  steps:
  - commence: true
    description:
    - install new version of OpenStack and Tungsten Fabric life cycle management modules
    - OpenStack and Tungsten Fabric container images pre-cached
    - OpenStack and Tungsten Fabric control plane components restarted in parallel
    duration:
      estimated: 1h30m0s
      info:
      - 15 minutes to cache the images and update the life cycle management modules
      - 1h to restart the components
    granularity: cluster
    id: openstack
    impact:
      info:
      - some of the running cloud operations may fail due to restart of API services and schedulers
      - DNS might be affected
      users: minor
      workloads: minor
    name: Update OpenStack and Tungsten Fabric
  - commence: true
    description:
    - Ceph version update
    - restart Ceph monitor, manager, object gateway (radosgw), and metadata services
    - restart OSD services node-by-node, or rack-by-rack depending on the cluster configuration
    duration:
      estimated: 8m30s
      info:
      - 15 minutes for the Ceph version update
      - around 40 minutes to update Ceph cluster of 30 nodes
    granularity: cluster
    id: ceph
    impact:
      info:
      - 'minor unavailability of object storage APIs: S3/Swift'
      - workloads may experience IO performance degradation for the virtual storage devices backed by Ceph
      users: minor
      workloads: minor
    name: Update Ceph
  - commence: true
    description:
    - new host OS kernel and packages get installed
    - host OS configuration re-applied
    - container runtime version gets bumped
    - new versions of Kubernetes components installed
    duration:
      estimated: 1h40m0s
      info:
      - about 20 minutes to update host OS per a Kubernetes controller, nodes updated one-by-one
      - Kubernetes components update takes about 40 minutes, all nodes in parallel
    granularity: cluster
    id: k8s-controllers
    impact:
      users: none
      workloads: none
    name: Update host OS and Kubernetes components on master nodes
  - commence: true
    description:
    - new host OS kernel and packages get installed
    - host OS configuration re-applied
    - container runtime version gets bumped
    - new versions of Kubernetes components installed
    - data plane components (Open vSwitch and Neutron L3 agents, TF agents and vrouter) restarted on gateway and compute nodes
    - storage nodes put to "no-out" mode to prevent rebalancing
    - by default, nodes are updated one-by-one, a node group can be configured to update several nodes in parallel
    duration:
      estimated: 8h0m0s
      info:
      - host OS update - up to 15 minutes per node (not including host OS configuration modules)
      - Kubernetes components update - up to 15 minutes per node
      - OpenStack controllers and gateways updated one-by-one
      - nodes hosting Ceph OSD, monitor, manager, metadata, object gateway (radosgw) services updated one-by-one
    granularity: machine
    id: k8s-workers-vdrok-child-default
    impact:
      info:
      - 'OpenStack controller nodes: some running OpenStack operations might not complete due to restart of components'
      - 'OpenStack compute nodes: minor loss of the East-West connectivity with the Open vSwitch networking backend that causes approximately 5 min of downtime'
      - 'OpenStack gateway nodes: minor loss of the North-South connectivity with the Open vSwitch networking backend: a non-distributed HA virtual router needs up to 1 minute to failover; a non-distributed and non-HA virtual router failover time depends on many factors and may take up to 10 minutes'
      users: major
      workloads: major
    name: Update host OS and Kubernetes components on worker nodes, group vdrok-child-default
  - commence: true
    description:
    - restart of StackLight, MetalLB services
    - restart of auxiliary controllers and charts
    duration:
      estimated: 1h30m0s
    granularity: cluster
    id: mcc-components
    impact:
      info:
      - minor cloud API downtime due restart of MetalLB components
      users: minor
      workloads: none
    name: Auxiliary components update
  target: mosk-17-4-0-25-1
status:
  completedAt: "2025-02-07T19:24:51Z"
  startedAt: "2025-02-07T17:07:02Z"
  status: Completed
  steps:
  - duration: 26m36.355605528s
    id: openstack
    message: Ready
    name: Update OpenStack and Tungsten Fabric
    startedAt: "2025-02-07T17:07:02Z"
    status: Completed
  - duration: 6m1.124356485s
    id: ceph
    message: Ready
    name: Update Ceph
    startedAt: "2025-02-07T17:33:38Z"
    status: Completed
  - duration: 24m3.151554465s
    id: k8s-controllers
    message: Ready
    name: Update host OS and Kubernetes components on master nodes
    startedAt: "2025-02-07T17:39:39Z"
    status: Completed
  - duration: 1h19m9.359184228s
    id: k8s-workers-vdrok-child-default
    message: Ready
    name: Update host OS and Kubernetes components on worker nodes, group vdrok-child-default
    startedAt: "2025-02-07T18:03:42Z"
    status: Completed
  - duration: 2m0.772243006s
    id: mcc-components
    message: Ready
    name: Auxiliary components update
    startedAt: "2025-02-07T19:22:51Z"
    status: Completed
Monitor the message and status fields of the first step.
The message field contains information about the progress of the current
step. The status field can have the following values:
NotStarted
Scheduled (since MCC 2.28.0 (17.3.0))
InProgress
AutoPaused (TechPreview since MCC 2.29.0 (17.4.0))
Stuck
Completed
The Scheduled status indicates that a step is already triggered but
its execution has not started yet.
The AutoPaused status indicates that the update process is paused by a
firing StackLight alert defined in the UpdateAutoPause object. For
details, see Configure update auto-pause.
The Stuck status indicates an issue and that the step cannot fit
into the ETA defined in the duration field for this step. The ETA for
each step is defined statically and does not change depending on the
cluster.
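For example, the step names and statuses can be listed directly from the management cluster; the namespace and plan name placeholders are illustrative:
kubectl -n <cluster-namespace> get clusterupdateplan <plan-name> -o jsonpath='{range .status.steps[*]}{.name}{": "}{.status}{"\n"}{end}'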
Caution
The status is not populated for the ClusterUpdatePlan
objects that have not been started by adding the commence:true flag
to the first object step. Therefore, always start updating the object
from the first step.
Optional. Available since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). Add or remove update groups of worker nodes on the fly, unless
the update of the group being removed has already been scheduled, or the
newly added group has an index lower than or equal to that of a group that
is already scheduled. These changes are reflected in ClusterUpdatePlan.
You can also reassign a machine to a different update group while the
cluster is being updated, but only if the new update group has an index
higher than the index of the last scheduled worker update group.
Disabled machines are considered as updated immediately.
Note
Depending on the number of update groups for worker nodes present
in the cluster, the number of steps in spec differs. Each update
group for worker nodes that has at least one machine will be represented
by a step with the ID k8s-workers-<UpdateGroupName>.
Proceed with changing the commence flag of the following update steps
granularly depending on the cluster update requirements.
Caution
Launch the update steps sequentially. A consecutive step is
not started until the previous step is completed.
Optional. Available since Container Cloud 2.29.0 (Cluster release 17.4.0)
as Technology Preview. Enable update auto-pause to be triggered by specific
StackLight alerts. For details, see Configure update auto-pause.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
On the Clusters page, in the Updates column of the
required cluster, click the Available link. The
Updates tab opens.
Note
If the Updates column is absent, it indicates that
the cluster is up-to-date.
Note
For your convenience, the Cluster updates menu is
also available in the right-side kebab menu of the cluster on the
Clusters page.
On the Updates page, click the required version in the
Target column to open update details, including the list of
update steps, current and target cluster versions, and estimated update
time.
In the Target version section of the Cluster update
window, click Release notes and carefully read updates about
target release, including the Update notes section that contains
important pre-update and post-update steps.
Expand each step to verify information about update impact and other useful
details.
Select one of the following options:
Enable Auto-commence all at the top-right of the first update
step section and click Start Update to launch update and start
each step automatically.
Click Start Update to only launch the first update step.
Note
This option allows you to auto-commence consecutive steps while
the current step is in progress. Enable the Auto-commence
toggle for required steps and click Save to launch the
selected steps automatically. You will only be prompted to confirm the
consecutive step; all remaining steps will be launched without manual
confirmation.
Before launching the update, you will be prompted to manually type in the
target Cluster release name and confirm that you have read release notes
about target release.
Caution
Cancelling an already started update step is not supported.
Monitor the status of each step by hovering over the In Progress
icon at the top-right of the step window. While the step is in progress,
its current status is updated every minute.
Once the required step is completed, the Waiting for input
status at the top of the update window is displayed requiring you to confirm
the next step.
The update history is retained in the Updates tab with the
completion timestamp. The update plans that were not started and can no longer
be used are cleaned up automatically.
Using the UpdateAutoPause object, the operator can define specific
StackLight alerts that trigger auto-pause of an update phase execution in
a MOSK cluster. The feature enhances update management of
MOSK clusters by preventing harmful changes from being propagated
across the entire cloud.
Note
The feature is not available for management clusters.
When an update auto-pause is configured on a cluster, the following workflow
applies:
During cluster updates, the system continuously monitors for the alerts
defined in the UpdateAutoPause object
If any configured alert fires:
The update process automatically pauses
The commence field is removed from all steps that have not started
The commence field is removed from the steps related to Update host OS and
Kubernetes components on worker nodes even if the step is in
progress, and the step is paused
The ClusterUpdatePlan status changes to AutoPaused
The firing alerts are recorded in the UpdateAutoPause status
A condition is added to the Cluster object indicating the pause state
Verify that StackLight is enabled on the MOSK cluster.
Create an UpdateAutoPause object with the name that matches your cluster
name within the cluster namespace. For example:
apiVersion: kaas.mirantis.com/v1alpha1
kind: UpdateAutoPause
metadata:
  name: managed-cluster-example  # Must match cluster name
  namespace: managed-cluster-ns  # Must match cluster namespace
spec:
  alerts:
  - AlertName1
  - AlertName2
The list of alerts can include standard and
custom StackLight alerts previously configured for
the cluster.
Calculate a maintenance window duration for update (Deprecated)¶
Deprecation notice
The maintenance window duration calculator is deprecated. Starting from
MOSK 25.1, cloud operators should use the
ClusterUpdatePlan API instead. For details, refer to
ClusterUpdatePlan resource.
This section provides an online calculator for quick calculation
of the approximate time required to update your MOSK
cluster that uses Open vSwitch as a networking backend.
Additionally, for a more accurate calculation, consider any cluster-specific
factors that can have a large impact on the update time in some edge cases,
such as number of routers, frequency of CPU, and so on.
Number of virtual machines per compute node:
Number of OpenStack compute nodes:
Number of OpenStack gateway nodes:
Number of Kubernetes control plane nodes:
Number of Kubernetes worker nodes except Kubernetes control plane nodes under the Kubernetes worker role:
This section contains instructions on how to get access to different systems
of a MOSK cluster.
To obtain endpoints of the MKE web UI and StackLight web UIs such as
Prometheus, Alertmanager, Alerta, OpenSearch Dashboards, and Grafana, in the
Clusters tab of the Container Cloud web UI, navigate to
More > Cluster info.
Note
The Alertmanager web UI displays alerts received by all
configured receivers, which can be mistaken for duplicates. To only display
the alerts received by a particular receiver, use the Receivers
filter.
Generate a kubeconfig for a MOSK cluster using API¶
This section describes how to generate a MOSK cluster
kubeconfig using the Container Cloud API. You can also download a
MOSK cluster kubeconfig using the
Download Kubeconfig option in the Container Cloud web UI. For
details, see Connect to a MOSK cluster.
The kubeconfig of your <username> that you can download through
the Container Cloud web UI using Download Kubeconfig located
under your <username> on the top-left of the page.
Obtain the <cluster> object of the <cluster_name>
MOSK cluster:
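For example, from the management cluster (the project namespace placeholder is illustrative):
kubectl -n <project-name> get cluster <cluster_name> -o yaml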
Generate the MOSK cluster kubeconfig using the data
from <cluster.status> and <token> obtained in the previous steps.
Use the following template as an example:
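The following is a generic kubeconfig sketch only; the certificate, API endpoint, and user name placeholders must be filled in from the <cluster.status> data and the <token> value, and the exact status field paths must be taken from your cluster object:
apiVersion: v1
kind: Config
clusters:
- name: <cluster_name>
  cluster:
    certificate-authority-data: <CA-certificate-from-cluster.status>
    server: https://<API-endpoint-from-cluster.status>
contexts:
- name: <cluster_name>-<username>
  context:
    cluster: <cluster_name>
    user: <username>
current-context: <cluster_name>-<username>
users:
- name: <username>
  user:
    token: <token>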
The Container Cloud web UI communicates with Keycloak to authenticate
users. Keycloak is exposed using HTTPS with self-signed TLS certificates
that are not trusted by web browsers.
After you deploy a MOSK management or managed cluster,
connect to the cluster to verify the availability and status of the nodes as
described below.
To connect to a MOSK cluster:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name.
The cluster page with the Machines list opens.
Verify the status of the manager nodes.
Once the first manager node is deployed and
has the Ready status, the Download Kubeconfig
option for the cluster being deployed becomes active.
Open the Clusters tab.
Click the More action icon in the last column of the required
cluster and select Download Kubeconfig:
Enter your user password.
Not recommended. Select Offline Token to generate an offline
IAM token. Otherwise, for security reasons, the kubeconfig token
expires every 30 minutes of the Container Cloud API idle time
and you have to download kubeconfig again with a newly generated
token.
Click Download.
Verify the availability of the managed cluster machines:
Export the kubeconfig parameters to your local machine with
access to kubectl. For example:
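A minimal sketch, assuming the downloaded file is saved as kubeconfig-<cluster_name>.yml:
export KUBECONFIG=~/Downloads/kubeconfig-<cluster_name>.yml
kubectl get nodes -o wide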
Using the Keycloak Admin Console, you can create or delete a user as well as
grant or revoke roles to or from a user. The Keycloak administrator is
responsible for assigning roles to users depending on the level of access they
need in a cluster.
The system response contains the URL to access the Keycloak Admin Console.
The user name is keycloak by default. The password is located in
passwords.yaml generated during bootstrap.
You can also obtain the password from the iam-api-secrets secret
in the kaas namespace of the management cluster and
decode the content of the keycloak_password key:
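A minimal sketch of the decoding command, run against the management cluster:
kubectl -n kaas get secret iam-api-secrets -o jsonpath='{.data.keycloak_password}' | base64 -d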
The Tungsten Fabric (TF) web UI allows for easy and fast TF resources
configuration, monitoring, and debugging. You can access
the TF web UI through either the Ingress service or the Kubernetes Service
directly. TLS termination for the https protocol is performed through the
Ingress service.
Note
Mirantis OpenStack for Kubernetes provides the TF web UI as is and
does not include this service in the support Service Level Agreement.
To access the TF web UI through Ingress:
Log in to a local machine where kubectl is installed.
For quick and easy inspection and monitoring, you can add a
MOSK cluster to Lens using the Container Cloud web UI.
The following options are available in the More action icon
menu of each cluster:
Add cluster to Lens
Open cluster in Lens
Before you can start monitoring your clusters in Lens, install the Container
Cloud Lens extension as described below.
Verify that your OpenStack cloud is running on the latest
MOSK release. See Release Compatibility Matrix for the
release matrix and supported upgrade paths.
Just before the upgrade, back up your OpenStack databases.
See the following documentation for details:
Verify that OpenStack is healthy and operational. All OpenStack components
in the health group in the OpenStackDeploymentStatus CR should be
in the Ready state. See OpenStackDeploymentStatus custom resource for details.
Verify the workability of your OpenStack deployment by running Tempest
against the OpenStack cluster as described in Run Tempest tests.
Verification of the testing pass rate before upgrading will help you
measure your cloud quality before and after upgrade.
Read carefully through the Release Notes of your
MOSK version paying attention to the Known issues
section and the
OpenStack upstream release notes
for the target OpenStack version.
When upgrading to OpenStack Yoga, remove the Panko service from the cloud
by removing the event entry from the spec:features:services
structure in the OpenStackDeployment resource as described in
Remove an OpenStack service.
Note
The OpenStack Panko service has been removed from the product and
is no longer maintained in the upstream OpenStack. See the project
repository page for details.
To start the OpenStack upgrade, change the value of the
spec:openstack_version parameter in the OpenStackDeployment object
to the target OpenStack release.
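For example, a minimal sketch of the relevant fragment of the OpenStackDeployment object, where the target release name is a placeholder:
spec:
  openstack_version: <target-openstack-release>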
After you change the value of the spec:openstack_version parameter,
the OpenStack Controller initializes the upgrade process.
To verify the upgrade status, use:
Logs from the osdpl container in the OpenStack Controller (Rockoon)
pod.
The OpenStackDeploymentStatus object.
When upgrade starts, the OPENSTACKVERSION field content changes
to the target OpenStack version, and STATE displays APPLYING:
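A sketch of the check, assuming the OpenStackDeploymentStatus resource can be queried by the osdplst short name (verify the resource name in your cluster):
kubectl -n openstack get osdplst
# The OPENSTACK VERSION column shows the target release and STATE shows APPLYING
# while the upgrade is in progress.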
Verify that OpenStack is healthy and operational. All OpenStack components
in the health group in the OpenStackDeploymentStatus CR should be
in the Ready state. See OpenStackDeploymentStatus custom resource for details.
Verify the workability of your OpenStack deployment by running Tempest
against the OpenStack cluster as described in Run Tempest tests.
Before upgrading, verify that you have completed
the Prerequisites and removed the domains from federation
mappings as described below.
Warning
If your MOSK cluster is running version
24.3 and includes the Instance High Availability service (OpenStack
Masakari), the OpenStack upgrade will fail due to an incorrect migration
of the Masakari database from legacy SQLAlchemy Migrate to Alembic caused
by a misconfigured alembic_table. To avoid this issue, follow the
workaround steps outlined in [47603] Masakari fails during the OpenStack upgrade to Caracal before proceeding
with the upgrade.
MOSK enables you to upgrade directly from Antelope to
Caracal without the need to upgrade to the intermediate Bobcat release.
To upgrade the cloud, complete the upgrade steps
instruction changing the value of the spec:openstack_version parameter
in the OpenStackDeployment object from antelope to caracal.
Perform the domains removal from the federation
mappings if your MOSK cluster configuration
includes federated identity management system, such as IAM or
any other supported identity provider.
Before Caracal, Keystone does not properly handle domain specifications for
users in mappings. Even though domains are specified for users, Keystone always
creates users in the domain associated with the identity provider the user
logs in from.
Starting with Caracal, Keystone honors the domains specified for users
in mappings. Many example mappings, including the previous default mapping
in MOSK, use domain specifications. After upgrading
to Caracal, the new users logging in through federation may be assigned to
a different Keystone domain, while existing users will retain their
current domain. This behavior may negatively impact monitoring,
compliance, and overall cluster operations.
To maintain the same functionality after the upgrade, remove the domain
element from both the local.user element and local element, which
sets default domain values for user and group elements, from the previous
default mappings.
You can use the openstack mapping commands to manage mappings:
To list available mappings: openstack mapping list
To display the mapping rules: openstack mapping show <name>
To modify the mapping rules:
openstack mapping set <name> --rules <rules>
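For illustration only, a hedged sketch of the kind of edit involved; this is not the MOSK default mapping, and the group ID and remote attribute are placeholders:
# Before: the user is pinned to a domain in the mapping
{"local": [{"user": {"name": "{0}", "domain": {"name": "Default"}}, "group": {"id": "<group-id>"}}], "remote": [{"type": "<remote-attribute>"}]}
# After: the domain element is removed so the identity provider domain keeps being used
{"local": [{"user": {"name": "{0}"}, "group": {"id": "<group-id>"}}], "remote": [{"type": "<remote-attribute>"}]}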
MOSK enables you to upgrade directly from Yoga to Antelope
without the need to upgrade to the intermediate Zed release.
Before upgrading, verify that you have completed the
Prerequisites.
Important
There are several known issues affecting MOSK clusters running
OpenStack Antelope that can disrupt the network connectivity of the cloud
workloads.
If your cluster is still running OpenStack Yoga, update to the MOSK 24.2.1
patch release first and only then upgrade to OpenStack Antelope. If you
have not been applying patch releases previously and would prefer to switch
back to major releases-only mode, you will be able to do this when MOSK 24.3
is released.
If you have updated your cluster to OpenStack Antelope, apply the
workarounds described in Release notes: OpenStack known issues for the following issues:
[45879] [Antelope] Incorrect packet handling between instance and
its gateway
[44813] Traffic disruption observed on trunk ports
To upgrade the cloud, complete the upgrade steps
instruction changing the value of the spec:openstack_version parameter
in the OpenStackDeployment object from yoga to antelope.
If your cluster is running on top of the MOSK 23.1.2
patch version, the OpenStack upgrade to Yoga may fail due to the delay
in the Cinder start. For the workaround, see 23.1.2 known issues:
OpenStack upgrade failure.
Before upgrading, verify that you have completed the
Prerequisites.
If your cloud runs on top of the OpenStack Victoria release, you must first
upgrade to the technical OpenStack releases Wallaby and Xena before upgrading
to Yoga.
Caution
The Wallaby and Xena releases are not recommended for a long-run
production usage. These versions are transitional, so-called technical
releases with limited testing scopes. For the OpenStack versions support
cycle, refer to OpenStack support cycle.
To upgrade the cloud, complete the upgrade steps
for each release version in line in the following strict order:
Mirantis OpenStack for Kubernetes (MOSK) relies on the MariaDB Galera
cluster to provide its OpenStack components with reliable storage of persistent
data. Mirantis recommends backing up your OpenStack databases daily to ensure
the safety of your cloud data. Also, you should always create an instant
backup before updating your cloud or performing any kind of potentially
disruptive experiment.
MOSK has a built-in automated backup routine that can be
triggered manually or by schedule. Periodic backups are suspended by default
but you can easily enable them through the OpenStackDeployment custom
resource. For the details about enablement and configuration of the periodic
backups, refer to Periodic OpenStack database backups in the Reference Architecture.
This section includes more intricate procedures that involve additional steps
beyond editing the OpenStackDeployment custom resource, such as restoring
the OpenStack database from a backup or configuring a remote storage for
backups.
By default, MOSK stores the OpenStack database backups
locally in the Mirantis Ceph cluster, which is a part of the same cloud.
Alternatively, MOSK enables you to save the backup data
to an external storage. This section contains the details on how you, as a
cloud operator, can configure a remote storage backend for OpenStack
database backups.
In general, the built-in automated backup mechanism saves the data to the
mariadb-phy-backup-data PersistentVolumeClaim (PVC), which is provisioned
from StorageClass specified in the spec.persistent_volume_storage_class
parameter of the OpenstackDeployment custom resource (CR).
Configure a remote NFS storage for OpenStack backups¶
If your MOSK cluster was originally deployed with the default
backup storage, proceed with this step. Otherwise, skip it.
Copy the already existing backup data to a storage different from the
mariadb-phy-backup-data PVC.
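As a rough sketch of the configuration intent, the remote NFS backend is defined in the OpenStackDeployment object; the field names below are given from memory and must be verified against the Periodic OpenStack database backups reference before use:
spec:
  features:
    database:
      backup:
        backend: pv_nfs
        pv_nfs:
          server: <NFS-server-IP-or-FQDN>
          path: <path-to-exported-share>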
During the OpenStack database restoration, the MariaDB cluster is
unavailable due to the MariaDB StatefulSet being scaled down to 0 replicas.
Therefore, to safely restore the state of the OpenStack database,
plan the maintenance window thoroughly and in accordance with the
database size.
The duration of the maintenance window may depend on the following:
Network throughput
Performance of the storage where backups are kept, which is Mirantis Ceph
by default
Local disks performance of the nodes where MariaDB data resides
If you want to restore the full backup, the name from the
example above is 2020-09-09_11-35-48. To restore a specific
incremental backup, the name from the example above is
2020-09-09_11-35-48/2020-09-12_01-01-54.
In the example above, the backups will be restored in the following strict
order:
2020-09-09_11-35-48 - full backup,
path /var/backup/base/2020-09-09_11-35-48
Pass the following parameters to the mariadb_resque.py script from
the OsDpl object:
--backup-name (String)
Name of a folder with backup in <BASE_BACKUP> or
<BASE_BACKUP>/<INCREMENTAL_BACKUP>.
--replica-restore-timeout (Integer, default: 3600)
Timeout in seconds for 1 replica data to be restored to the
mysql data directory. Also includes time for spawning a
rescue runner pod in Kubernetes and extracting data from a
backup archive.
kubectl -n openstack get jobs mariadb-phy-restore -o jsonpath='{.status}'
The mariadb-phy-restore job is an immutable object. Therefore,
remove the job after each successful execution. To correctly remove the job,
clean up all the settings from the OpenStackDeployment object
that you have configured during step 7 of this procedure.
This will remove all related pods as well.
Important
If mariadb-phy-restore fails, the MariaDB Pods do not start automatically.
For example, the failure may occur due to discrepancy between the current
and backup versions of MariaDB, broken backup archive, and so on.
Assess the mariadb-phy-restore job log to identify the issue:
The base directory contains full backups. Each directory in the incr
folder contains incremental backups related to a certain full backup in
the base folder. All incremental backups always have the base backup
name as parent folder.
When adding the bare metal host YAML file, specify the following OpenStack
control plane node labels for the OpenStack control plane services
such as database, messaging, API, schedulers, conductors, L3 and L2 agents:
openstack-control-plane=enabled
openstack-gateway=enabled
openvswitch=enabled
Create a Kubernetes machine in your cluster as described in
Add a machine.
When adding the machine, verify that OpenStack control plane node has
the following labels:
openstack-control-plane=enabled
openstack-gateway=enabled
openvswitch=enabled
Note
Depending on the applications that were colocated on the failed
controller node, you may need to specify some additional labels,
for example, ceph_role_mgr=true and ceph_role_mon=true. To
successfully replace a failed mon and mgr node, refer to
Ceph operations.
Verify that the node is in the Ready state through the Kubernetes API:
kubectl get node <NODE-NAME> -o wide | grep Ready
Verify that the node has all required labels described in the previous
steps:
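For example:
kubectl get node <NODE-NAME> --show-labels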
This section describes how to replace a failed control plane node in your
MOSK deployment. The procedure applies to the control plane
nodes that are, for example, permanently failed due to a hardware failure and
appear in the NotReady state:
For MOSK 23.3 series or earlier,
reschedule stateful applications pods to healthy controller nodes as
described in Reschedule stateful applications. For the newer
versions, MOSK performs the rescheduling of stateful
applications automatically.
If the failed controller node had the StackLight label,
fix the StackLight volume node affinity conflict as described in
Delete a cluster machine.
Remove the OpenStack port related to the Octavia health manager pod of the
failed node:
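A generic sketch of the cleanup, assuming the port can be identified by the failed node name in its name or description (run from the keystone-client pod; the exact port naming depends on your deployment):
openstack port list --long | grep <failed-node-name>
openstack port delete <port-id>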
OpenStack control plane
Hosts the OpenStack control plane services such as database,
messaging, API, schedulers, conductors, L3 and L2 agents.
Labels: openstack-control-plane=enabled, openstack-gateway=enabled,
openvswitch=enabled
Node count: 3
OpenStack compute
Hosts the OpenStack compute services such as libvirt and L2 agents.
Labels: openstack-compute-node=enabled, openvswitch=enabled (for a
deployment with Open vSwitch as a backend for networking)
Node count: varies
If required, configure the compute host to enable DPDK, huge pages, SR-IOV,
and other advanced features in your MOSK deployment.
See Advanced OpenStack configuration (optional) for details.
Once the node is available in Kubernetes and when the nova-compute and
neutron pods are running on the node, verify that the compute
service and Neutron Agents are healthy in OpenStack API.
Verify that the compute service is mapped to cell.
The OpenStack Controller triggers the nova-cell-setup job once it
detects a new compute pod in the Ready state. This job sets mapping
for new compute services to cells.
In the nova-api-osapi pod, run:
nova-manage cell_v2 list_hosts | grep <cmp_host_name>
Change oversubscription settings for existing compute nodes¶
Available since MOSK 23.1
MOSK enables you to control the oversubscription of compute
node resources through the placement service API.
To manage the oversubscription through the placement API:
Obtain the host name of the hypervisor in question:
Update the allocation ratio for the required resource class in the resource
provider and inspect the system response to verify that the change has been
applied:
To ensure accurate resource updates, it is crucial to specify
the --amend argument when making requests. Failure to do so will
require the inclusion of values for all fields associated with the
resource provider.
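A hedged sketch of the workflow using the osc-placement plugin; the resource class and allocation ratio values are illustrative:
# Find the resource provider that corresponds to the hypervisor
openstack hypervisor list
openstack resource provider list --name <hypervisor-hostname>
# Set a new CPU allocation ratio, preserving the other inventory fields
openstack resource provider inventory set <resource-provider-uuid> \
  --resource VCPU:allocation_ratio=8.0 --amend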
Since MOSK 23.2, the OpenStack-related metadata is
automatically removed during the graceful machine deletion through
the Mirantis Container Cloud web UI. For the procedure, refer
to Delete a cluster machine.
During the graceful machine deletion, the OpenStack Controller (Rockoon)
performs the following operations:
Disables the OpenStack Compute and Block Storage services on the node
to prevent further scheduling of workloads to it.
Verifies if any resources are present on the node, for example, instances
and volumes. By default, the OpenStack Controller blocks the removal process
until the resources are removed by the user. To adjust this behavior
to the needs of your cluster, refer to OpenStack Controller configuration.
Removes OpenStack services metadata including compute services, Neutron
agents, and volume services.
Caution
You cannot collocate the OpenStack compute node with other cluster
components, such as Ceph. If done so, refer to the removal steps
of the collocated components when planning the maintenance window.
If your cluster runs MOSK 23.1 or an older version, perform
the following steps before you remove the node from the cluster through
the web UI to correctly remove the OpenStack-related metadata from it:
Disable the compute service to prevent spawning of new instances.
In the keystone-client pod, run:
openstack compute service set --disable <cmp_host_name> nova-compute --disable-reason "Compute is going to be removed."
The procedure applies to the MOSK clusters running
MOSK 23.3 series or earlier versions. Starting from
24.1, MOSK performs the rescheduling of stateful
applications automatically.
The rescheduling of stateful applications may be required when replacing
a permanently failed node, decommissioning a node, migrating applications
to nodes with a more suitable set of hardware, and in several other use cases.
MOSK deployment profiles include the following stateful
applications:
OpenStack database (MariaDB)
OpenStack coordination (etcd)
OpenStack Time Series Database backend (Redis)
Each stateful application from the list above has a persistent volume claim
(PVC) based on a local persistent volume per pod. Each of control plane nodes
has a set of local volumes available. To migrate an application pod to another
node, recreate a PVC with the persistent volume from the target node.
Caution
A stateful application pod can only be migrated to a node
that does not contain other pods of this application.
Caution
When a PVC is removed, all data present in the related persistent
volume is removed from the node as well.
Perform the pods rescheduling if you have to move a PVC to
another node and the current node is still present in the cluster. If the
current node has been removed already, MOSK reschedules pods automatically
when a node with required labels is present in the cluster.
When the rescheduling is finalized, the <STATEFULSET-NAME>-<NUMBER>
pod rejoins the Galera cluster with a clean MySQL data directory
and requests the Galera state transfer from the available nodes.
Perform the pods rescheduling if you have to move a PVC to
another node and the current node is still present in the cluster. If the
current node has been removed already, MOSK reschedules pods automatically
when a node with required labels is present in the cluster.
During the reschedule procedure of the etcd LCM, a short
cluster downtime is expected.
Before MOSK 23.1:
Identify the etcd replica ID that is a numeric suffix in a pod name.
For example, the ID of the etcd-etcd-0 pod is 0.
This ID is required during the reschedule procedure.
<STORAGE-SIZE>, <STORAGE-CLASS>, and
<NAMESPACE> should correspond to the storage,
storageClassName, and namespace values from the
<OLD-PVC>.yaml file with the old PVC configuration.
The OpenStack Integration Test Suite (Tempest) is a set of integration tests
to be run against a live OpenStack environment. This section instructs you
on how to verify the workability of your OpenStack deployment using Tempest.
To verify an OpenStack deployment using Tempest:
Configure the Tempest run parameters using the features:services:tempest
structure in the OpenStackDeployment custom resource.
Note
To perform the smoke testing of your deployment, no additional
configuration is required.
Configuration examples:
To perform the full Tempest testing:
spec:
  services:
    tempest:
      tempest:
        values:
          conf:
            script: |
              tempest run --config-file /etc/tempest/tempest.conf --concurrency 4 --blacklist-file /etc/tempest/test-blacklist --regex test
Run Tempest. The OpenStack Tempest is deployed like other OpenStack services
in a dedicated openstack-tempest Helm release by adding tempest to
spec:features:services in the OpenStackDeployment custom resource:
spec:
  features:
    services:
    - tempest
Wait until Tempest is ready. The Tempest tests are launched by the
openstack-tempest-run-tests job. To keep track of the tests execution,
run:
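For example, a minimal way to follow the test run, using the job name mentioned above:
kubectl -n openstack get job openstack-tempest-run-tests
kubectl -n openstack logs -f job/openstack-tempest-run-tests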
This section instructs you on how to remove an OpenStack cluster, deployed on
top of Kubernetes, by deleting the openstackdeployments.lcm.mirantis.com
(OsDpl) CR.
To remove an OpenStack cluster:
Verify that the OsDpl object is present:
kubectl get osdpl -n openstack
Delete the OsDpl object:
kubectl delete osdpl osh-dev -n openstack
The deletion may take a certain amount of time.
Verify that all pods and jobs have been deleted and no objects are present
in the command output:
kubectl get pods,jobs -n openstack
Delete Persistent Volume Claims (PVCs) using the following snippet. Deletion
of PVCs causes data deletion on Persistent Volumes. The volumes themselves
will become available for further operations.
Caution
Before deleting PVCs, save valuable data in a safe place.
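A minimal sketch of such a snippet:
for pvc in $(kubectl -n openstack get pvc -o name); do
  kubectl -n openstack delete "${pvc}"
done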
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, all openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality; it is a reminder for users to apply the new
naming to pods and other related artifacts.
This section instructs you on how to remove an OpenStack service deployed on
top of Kubernetes. A service is typically removed by deleting a corresponding
entry in the spec.features.services section of the
openstackdeployments.lcm.mirantis.com (OsDpl) CR.
Caution
You cannot remove the default services built into the preset
section.
Clean up OpenStack database leftovers after the service removal¶
Caution
The procedure below will permanently destroy the data of
the removed service.
Log in to the mariadb-server pod shell:
kubectl -n openstack exec -it mariadb-server-0 -- bash
Remove the service database user and its permissions:
Note
Use the user name for the service database obtained during
the Remove a service procedure to substitute <SERVICE-DB-USERNAME>:
mysql -u root -p${MYSQL_DBADMIN_PASSWORD} -e "REVOKE ALL PRIVILEGES, GRANT OPTION FROM '<SERVICE-DB-USERNAME>'@'%';"
mysql -u root -p${MYSQL_DBADMIN_PASSWORD} -e "DROP USER '<SERVICE-DB-USERNAME>'@'%';"
Enable uploading of an image through Horizon with untrusted SSL certificates¶
By default, the OpenStack Dashboard (Horizon) is configured to load images
directly into Glance. However, if a MOSK cluster
is deployed using untrusted certificates for public API endpoints and
Horizon, uploading of images to Glance through the Horizon web UI may fail.
When accessing the Horizon web UI of such MOSK deployment
for the first time, a warning informs you that the site is insecure and you
must force trust the certificate of this site. However, when trying to upload
an image directly from a web browser, the certificate of the Glance API is
still not considered by the web browser as a trusted one since host:port
of the site is different. In this case, you must explicitly trust the
certificate of the Glance API.
To enable uploading of an image through Horizon with untrusted SSL
certificates:
Navigate to the Horizon web UI.
Configure your web browser to trust the Horizon certificate if you have not
done so yet:
In Google Chrome or Chromium, click
Advanced > Proceed to <URL> (unsafe).
In Mozilla Firefox, navigate to Advanced > Add Exception,
enter the URL in the Location field, and click
Confirm Security Exception.
Note
For other web browsers, the steps may vary slightly.
Navigate to Project > API Access.
Copy the Service Endpoint URL of the Image service.
Open this URL in a new window or tab of the same web browser.
Configure your web browser to trust the certificate of this site as
described in the step 2.
As a result, the version discovery document should appear with contents
that vary depending on the OpenStack version. For example, for OpenStack
Victoria:
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, all openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality; it is a reminder for users to apply the new
naming to pods and other related artifacts.
The credential rotation procedure is designed to minimize the impact on service
availability and workload downtime. It depends on the credential type and
is based on the following principles:
Credentials for OpenStack admin database and messaging are immediately
changed during one rotation cycle, without a transition period.
Credentials for OpenStack admin identity are rotated with a transition
period of one extra rotation cycle. This means that the credentials
become invalid after two rotations. MOSK exposes
the latest valid credentials to the openstack-external namespace.
For details, refer to Access OpenStack through CLI from your local machine.
Credentials for OpenStack service users, including those for messaging,
identity, and database, undergo a transition period of one extra rotation
cycle during rotation.
Note
If immediate inactivation of credentials is required, initiate
the rotation procedure twice.
Verify that there are no other LCM operations running on the OpenStack
cluster.
Thoroughly plan the maintenance window taking into account the following
considerations:
All OpenStack control plane services, components of the Networking service
(OpenStack Neutron) responsible for the data plane and messaging services
are restarted during service credentials rotation.
OpenStack database and OpenStack messaging services are restarted during
administrator credentials rotation, as well as some of the OpenStack
control plane services, including the Instance High Availability service
(OpenStack Masakari), Dashboard (OpenStack Horizon), and Identity service
(OpenStack Keystone).
Where the <credentials-type> value is either admin or service.
Note
Mirantis recommends rotating both admin and service credentials
simultaneously to decrease the duration of the maintenance window and
number of service restarts. You can do this by passing the --type
argument twice:
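A hedged sketch of such an invocation; the osctl credentials rotate command name is an assumption based on the --type and --wait flags referenced in this section and must be checked against your MOSK version:
osctl credentials rotate --osdpl <osdpl-name> --type admin --type service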
Wait until the OpenStackDeploymentStatus object has state APPLIED
and all OpenStack components in the health group in the
OpenStackDeploymentStatus custom resource are in the Ready state.
Alternatively, you can launch the rotation command with the --wait flag.
Now, the latest admin password for your OpenStack environment is available in
the openstack-identity-credentials secret in the openstack-external
namespace.
This section provides instructions on how to customize the functionality of
your MOSK OpenStack services by installing custom system
or Python packages into their container images.
The MOSK services are running in Ubuntu-based
containers, which can be extended to meet specific requirements or implement
specific use cases, for example:
Enabling third-party storage driver for OpenStack Cinder
Implementing a custom scheduler for OpenStack Nova
Adding a custom dashboard to OpenStack Horizon
Building your own image importing workflow for OpenStack Glance
Warning
Mirantis cannot be held responsible for any consequences arising
from using customized container images. Mirantis does not provide support
for such actions, and any modifications to production systems are made at
your own risk.
Note
Custom images are pinned in the OpenStackDeployment custom
resource. These images do not undergo automatic updates or upgrades.
The cloud administrator is responsible for updating such images during
OpenStack updates and upgrades.
Specify the location for the base image in the Dockerfile.
A custom image can be derived from any OpenStack image shipped with
MOSK. For locations of the images comprising a
specific MOSK release, refer to a corresponding
release artifacts page in the Release Notes.
Presuming that the custom image needs to be rebuilt for every new
MOSK release, Mirantis recommends parametrizing the
location of its base by introducing the $FROM argument to the
Dockerfile.
Instruct the Dockerfile to install additional system packages:
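A minimal Dockerfile sketch combining both recommendations; the package name is a placeholder:
ARG FROM
FROM ${FROM}

# Switch to root first if the base image defines a non-root user
# USER root

# Install additional system packages required by the customization
RUN apt-get update \
    && apt-get install -y --no-install-recommends <system-package> \
    && rm -rf /var/lib/apt/lists/*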
Honor upper constraints that MOSK defines for its
OpenStack packages prerequisites
OpenStack components in every MOSK release are shipped
together with their requirements packaged as Python wheels and constraints
file. Download and extract these artifacts from the corresponding
requirements container image, so that they can be used for building
your packages as well. Use the requirements image with the same tag
as the base image that you plan to customize:
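A hedged sketch of extracting the constraints and wheels from the requirements image; the image reference and the in-container path are assumptions and should be taken from the release artifacts page:
docker create --name os-requirements <requirements-image>:<tag>
docker cp os-requirements:<path-to-wheels-and-upper-constraints> ./requirements
docker rm os-requirements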
When selecting the name for your image, Mirantis recommends following the
common practice across major public Docker registries such as Docker Hub.
The image name should be <user-name>/<repo-name>, where <user-name>
is a unique identifier of the user who authored it and <repo-name>
is the name of the software shipped.
Specify the current directory as the build context. Also, use the --tag
option to assign the tag to your image. Assigning a tag :<tag>
enables you to add multiple versions of the same image to the repository.
Unless you assign a tag, it defaults to latest.
If you are adding Python packages, you can minimize the size of the custom
image by building it with the --squash flag. It merges all the image
layers into one and instructs the system not to store the cache layers of
the wheel packages.
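For example, noting that --squash requires the Docker daemon experimental mode to be enabled:
docker build . --tag <user-name>/<repo-name>:<tag> --squash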
Verify that the image has been built and is present on your system:
docker image ls
Publish the image to the designated registry by its name and tag:
Note
Before pushing the image, make sure that you have authenticated
with the registry using the docker login command.
docker push <user-name>/<repo-name>:<tag>
Attach a private Docker registry to MOSK Kubernetes underlay¶
To ensure that the Kubernetes worker nodes in your MOSK
cluster can locate and download the custom image, it should be published
to a container image registry that the cluster is configured to use.
To configure the MOSK Kubernetes underlay to use your
private registry, you need to create a ContainerRegistry resource in the
Mirantis Container Cloud API with the registry domain and CA certificate in it,
and specify the resource in the Cluster object that corresponds to
MOSK.
To inject a customized OpenStack container into your MOSK
cluster:
Since MOSK 25.1
Create a ConfigMap in the openstack namespace with the following
content, replacing <OPENSTACKDEPLOYMENT-NAME> with the name of your
OpenStackDeployment custom resource:
To help you better understand the process, this section provides a few examples
illustrating how to add various plugins to MOSK services.
Warning
Mirantis cannot be held responsible for any consequences arising
from using storage drivers, plugins, or features that are not explicitly
tested or documented with MOSK. Mirantis does not provide
support for such configurations as a part of standard product subscription.
Pure Storage driver for OpenStack Cinder
Although the PureStorage driver itself is already included in the cinder
system package, you need to install additional dependencies to make it work:
System packages: nfs-common
Python packages: purestorage
The base image is the MOSK Cinder image
cinder:yoga-focal-20230227093206.
Procedure:
Download and extract the requirements from the requirements container
image that corresponds to the base image that you plan to customize:
Build wheels. This step will be performed automatically because the
Trillio repository provides Python source packages that build wheel
binaries on installation.
Since MOSK 25.1, the OpenStack Controller has been open-sourced under the
name Rockoon and is maintained as an independent open-source project
going forward.
As part of this transition, all openstack-controller pods are named
rockoon pods across the MOSK documentation and deployments. This change
does not affect functionality; it is a reminder for users to apply the new
naming to pods and other related artifacts.
Orphaned resource allocations are entries in the Placement database that
track resource consumption, but the corresponding consumer (instance)
no longer exists on the compute nodes. As a result, the Nova scheduler
mistakenly believes that compute nodes have more resources allocated
than they actually have.
For example, orphaned resource allocations may occur when an instance
is evacuated from a hypervisor while the related nova-compute service
is down.
This section provides instructions on how to resolve orphaned resource
allocations in Nova if they are detected on compute nodes.
Orphaned allocations are detected by the nova-placement-audit CronJob that
runs every four hours.
The osdpl-exporter service processes the nova-placement-audit CronJob
output and exports current number of orphaned allocations to StackLight as an
osdpl_nova_audit_orphaned_allocations metric. If the value of this metric
is greater than 0, StackLight raises a major alert
NovaOrphanedAllocationsDetected.
Analyze the list of the nova-compute services obtained during
the previous step:
For the nova-compute services in the down state, most probably
there were evacuations of instances from the corresponding nodes when
the services were down. If this is the case, proceed directly to
Remove orphaned allocations. Otherwise, proceed with collecting
the logs.
For the nova-compute services in the UP state, proceed with
collecting the logs.
Collect the following logs from the environment:
Caution
The log data can be significant in size. Ensure that there is
sufficient space available in the /tmp/ directory of the OpenStack
Controller (Rockoon) pod. Create a separate report for each node.
Logs from compute nodes for a 3-day period around the time of the alert:
From the node with the orphaned allocation
From the node with the actual allocation (where the instance exists,
if any)
For example, if the alert was raised on 2024-08-12, set
<REPORT_PERIOD_TIMESTAMPS> to 2024-08-11,2024-08-13.
Logs from the nova-scheduler, nova-api, nova-conductor,
placement-api pods for a 3-day period around the time of the alert:
ctl_nodes=$(kubectl get nodes -l openstack-control-plane=enabled -o name)
kubectl -n osh-system exec -it deployment/rockoon -- bash
# for each node in ctl_nodes execute:
osctl sos --between <REPORT_PERIOD_TIMESTAMPS> \
  --host <CTL_HOSTNAME> \
  --component nova \
  --component placement \
  --collector elastic \
  --workspace /tmp/report
kubectl -n openstack exec -it deployment/keystone-client -- bash
openstack server migration list
openstack compute service list --long
openstack resource provider list
# Get the server event list for each orphaned consumer id
openstack server event list <SERVER_ID>
Note
SERVER_ID is the orphaned consumer ID from the
nova-placement-audit logs.
Create a support case and attach the obtained information.
The MOSK deployments with Tungsten Fabric do
not support IP address capacity monitoring.
Monitoring IP address capacity helps cloud operators allocate routable
IP addresses efficiently for dynamic workloads in the clouds. This capability
provides insights for predicting future needs for IP addresses, ensuring
seamless communication between workloads, users, and services while optimizing
IP address usage.
By monitoring IP address capacity, cloud operators can:
Predict when to add new IP address blocks to prevent service disruptions.
Identify networks or subnets nearing capacity to prevent issues.
Optimize the allocation of costly external IP address pools.
To start monitoring IP address capacity in your cloud:
Verify that all required networks and subnets are monitored.
By default, MOSK monitors IP address capacity for the
external networks that have the router:external=External attribute and
segmentation type of vlan or flat.
To include additional networks and subnets in the monitoring:
Tag the network with the openstack.lcm.mirantis.com:prometheus tag.
When a network is tagged, all its subnets are automatically included in
the monitoring:
Tag individual subnets with the openstack.lcm.mirantis.com:prometheus tag.
This includes the subnet in the monitoring regardless of the network tagging:
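For example, assuming the network and subnet are identified by name:
openstack network set --tag openstack.lcm.mirantis.com:prometheus <network-name>
openstack subnet set --tag openstack.lcm.mirantis.com:prometheus <subnet-name>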
This section covers post-deployment configuration of OpenStack services and
is intended for cloud operators responsible for maintaining a functional
cloud infrastructure for end users. It focuses on more complex procedures
that require additional steps beyond simply editing the OpenStackDeployment
custom resource.
For an overview of the capabilities provided by MOSK
OpenStack services and instructions on enabling and configuring them
at the OpenStackDeployment level, refer to Cloud services.
The Instance High Availability service, or Masakari, is an OpenStack project
designed to ensure high availability of instances and compute processes
running on hosts.
Before the end user can start enjoying the benefits of Masakari, the cloud
operator has to configure the service properly. This section includes
instructions on how to create segments and hosts through the Masakari API
as well as provides the list of additional settings that can be useful
in certain use cases.
The segment object is a logical grouping of compute nodes into zones
also known as availability zones. The segment object enables
the cloud operator to list, create, show details for, update,
and delete segments.
To create a segment named allcomputes with service_type=compute,
and recovery_method=auto, run:
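A sketch using the openstack CLI with the Masakari (instance-ha) plugin, assuming the positional argument order <name> <recovery_method> <service_type>; verify against your client version:
openstack segment create allcomputes auto compute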
The host object represents compute service hypervisors. A host belongs
to a segment. The host can be any kind of virtual machine that has
compute service running on it. The host object enables the operator
to list, create, show details for, update, and delete hosts.
The alerting API is used by Masakari monitors to notify about a failure
of either a host, process, or instance. The notification object enables
the operator to list, create, and show details of notifications.
The list of useful tunings for the Masakari service includes:
[host_failure]\evacuate_all_instances
Enables the operator to decide whether to evacuate all instances or only the
instances that have [host_failure]\ha_enabled_instance_metadata_key
set to True. By default, the parameter is set to False.
[host_failure]\ha_enabled_instance_metadata_key
Enables the operator to decide on the instance metadata key naming that
affects the per instance behavior of
[host_failure]\evacuate_all_instances. The default is the same for
both failure types, which include host and instance, but
the value can be overridden to make the metadata key different per
failure type.
[host_failure]\ignore_instances_in_error_state
Enables the operator to decide whether error instances should be allowed
for evacuation from a failed source compute node or not. If set to True,
it will ignore error instances from evacuation from a failed source compute
node. Otherwise, it will evacuate error instances along with other
instances from a failed source compute node.
Available since MOSK 24.2
[host_failure]\ha_enabled_project_tag
By default, instances belonging to any project are evacuated. However, if
the operator needs to restrict this functionality to specific projects,
they can tag these projects with a designated tag and pass this tag as
the value for this Masakari option. Consequently, instances from projects
that do not have the specified tag are not considered for evacuation, even
if they have the corresponding metadata key and value set.
[instance_failure]\process_all_instances
Enables the operator to decide whether all instances or only the
ones that have
[instance_failure]\ha_enabled_instance_metadata_key set to True
should be recovered from instance failure events. If set to True,
it will execute instance failure recovery actions for an instance
irrespective of whether that particular instance has
[instance_failure]\ha_enabled_instance_metadata_key set to True or
not. Otherwise, it will only execute instance failure recovery actions
for an instance which has
[instance_failure]\ha_enabled_instance_metadata_key set to True.
[instance_failure]\ha_enabled_instance_metadata_key
Enables the operator to decide on the instance metadata key naming that
affects the per-instance behavior of
[instance_failure]\process_all_instances. The default is the same for
both failure types, which include host and instance, but you can
override the value to make the metadata key different per failure type.
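The options above belong to the Masakari configuration. In a MOSK cloud,
service configuration is typically passed through the OpenStackDeployment
custom resource; the following minimal sketch illustrates that approach. The
instance-ha group name and the exact key path are assumptions to verify
against the OpenStackDeployment reference before applying:
spec:
  services:
    instance-ha:
      masakari:
        values:
          conf:
            masakari:
              host_failure:
                evacuate_all_instances: false
              instance_failure:
                process_all_instances: false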
Configure monitoring of cloud workload availability
MOSK enables cloud operators to oversee the availability of
workloads hosted in their OpenStack infrastructure through the monitoring of
floating IP address availability (Cloudprober) and network port availability
(Portprober).
For the feature description and usage requirements, refer to
Workload monitoring.
Configure floating IP address availability monitoring
Available since MOSK 23.2 as TechPreview
MOSK allows you to monitor the floating IP address
availability through the Cloudprober service. This section explains the details
of the service configuration.
By default, for outgoing traffic, the IP address for the Cloudprober Pod
is translated to the node IP address. In this procedure, we assume no
further translation of that node IP address on the path between the node
and floating network.
Identify the node IP address used for traffic destined to floating
network by selecting the IP address from the floating network and
running the following command on each OpenStack control plane node:
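One way to identify this address is to query the kernel routing decision for
an address from the floating network, for example (the address below is a
placeholder):
ip route get 10.11.12.10
The src field in the output shows the node IP address that will be used for
traffic destined to the floating network.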
Verify that the instances have been added successfully.
Cloudprober uses auto-discovery of instances on a periodic basis. Therefore,
wait for the discovery interval to pass (defaults to 600 seconds) and
execute the following command inside the keystone-client Pod:
You can adjust the instance auto-discovery interval in
the OpenStackDeployment object. However, Mirantis does not
recommend setting it to overly low values to avoid high load on
the OpenStack API:
spec:
  features:
    cloudprober:
      discovery:
        interval: 300
Now, you can start seeing the availability of instance floating IP
addresses per OpenStack compute node and project, as well as
viewing the probe statistics for individual instance floating IP addresses
through the OpenStack Instances Availability dashboard
in Grafana.
MOSK allows you to monitor the network port availability
through the Portprober service.
The Portprober service is enabled by default when the Cloudprober service
is enabled as described above, on clouds running OpenStack Antelope or a newer
version and using the Neutron OVS backend for networking.
This section outlines Ceph LCM operations such as adding Ceph Monitor,
Ceph nodes, and RADOS Gateway nodes to an existing Ceph cluster or removing
them, as well as removing or replacing Ceph OSDs. The section also includes
OpenStack-specific operations for Ceph.
The following sections describe the Ceph cluster configuration options:
Ceph Controller provides the capability to specify configuration options for
the Ceph cluster through the spec.cephClusterSpec.rookConfig key-value
parameter of the KaaSCephCluster resource as if they were set in a usual
ceph.conf file. For details, see Ceph advanced configuration.
However, if rookConfig is empty, Ceph Controller still specifies the
following default configuration options for each Ceph cluster:
Required network parameters that you can change through the
spec.cephClusterSpec.network section:
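For illustration, a minimal sketch of the network section with placeholder
CIDRs:
spec:
  cephClusterSpec:
    network:
      clusterNet: 10.10.10.0/24
      publicNet: 10.10.11.0/24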
General default configuration options that you can override using the
rookConfig parameter:
mon target pg per osd = 200
mon max pg per osd = 600
# Workaround configuration option to avoid the
# https://github.com/rook/rook/issues/7573 issue
# when updating to Rook 1.6.x versions:
rgw_data_log_backing = omap
If rookConfig is empty but the spec.cephClusterSpec.objectStore.rgw
section is defined, Ceph Controller specifies the following OpenStack-related
default configuration options for each Ceph cluster:
Ceph Object Gateway options, which you can override using the rookConfig
parameter:
This section describes how to configure a Ceph cluster through the
KaaSCephCluster (kaascephclusters.kaas.mirantis.com) CR during or
after the deployment of a MOSK cluster.
The KaaSCephCluster CR spec has two sections, cephClusterSpec and
k8sCluster and specifies the nodes to deploy as Ceph components. Based on
the roles definitions in the KaaSCephCluster CR, Ceph Controller
automatically labels nodes for Ceph Monitors and Managers. Ceph OSDs are
deployed based on the storageDevices parameter defined for each Ceph node.
cephClusterSpec
Describes a Ceph cluster in the MOSK cluster. For details
on cephClusterSpec parameters, see the tables below.
k8sCluster
Defines the cluster on which the KaaSCephCluster
depends. Use the k8sCluster parameter if the name or namespace
of the corresponding MOSK cluster differs from the default
one:
clusterNet - specifies a Classless Inter-Domain Routing (CIDR)
for the Ceph OSD replication network.
Warning
To avoid ambiguous behavior of Ceph daemons, do not specify
0.0.0.0/0 in clusterNet. Otherwise, Ceph daemons can select
an incorrect public interface that can cause the Ceph cluster to
become unavailable. The bare metal provider automatically translates
the 0.0.0.0/0 network range to the default LCM IPAM subnet
if it exists.
Note
The clusterNet and publicNet parameters support
multiple IP networks. For details, see Enable Ceph multinetwork.
publicNet - specifies a CIDR for communication between
the service and operator.
Warning
To avoid ambiguous behavior of Ceph daemons,
do not specify 0.0.0.0/0 in publicNet.
Otherwise, Ceph daemons can select an incorrect
public interface that can cause the Ceph cluster to
become unavailable. The bare metal provider
automatically translates the 0.0.0.0/0 network
range to the default LCM IPAM subnet if it exists.
Note
The clusterNet and publicNet parameters support
multiple IP networks. For details, see Enable Ceph multinetwork.
nodes
Specifies the list of Ceph nodes. For details, see
Node parameters. The nodes parameter is a map with
machine names as keys and Ceph node specifications as values, for
example:
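A minimal sketch with illustrative machine names:
nodes:
  master-0:
    roles:
    - mon
    - mgr
  worker-storage-0:
    storageDevices:
    - fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
      config:
        deviceClass: hdd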
nodeGroups
Specifies the list of Ceph nodes grouped by node lists or node
labels. For details, see NodeGroups parameters. The nodeGroups
parameter is a map with group names as keys and Ceph node
specifications for defined nodes or node labels as values.
For example:
The <nodeLabelExpression> must be a valid Kubernetes label
selector expression.
pools
Specifies the list of Ceph pools. For details, see
Pool parameters.
objectStorage
Specifies the parameters for Object Storage, such as RADOS Gateway,
the Ceph Object Storage. Also specifies the RADOS Gateway
Multisite configuration. For details, see RADOS Gateway parameters and
Multisite parameters.
rookConfig
Optional. String key-value parameter that allows overriding
Ceph configuration options.
Since MOSK 24.2, use the | delimiter to specify
the section where a parameter must be placed, for example, mon or
osd. If required, use the . delimiter to target a specific
Ceph OSD or Ceph Monitor by its number, for example, osd.14, and
override the configuration of the corresponding section.
Using a section restarts only the daemons related
to that section. If you do not specify the section,
a parameter is set in the global section, which implies a restart
of all Ceph daemons except Ceph OSDs.
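A hedged sketch of rookConfig keys using these delimiters; the option names
and values are illustrative only:
rookConfig:
  # global section, restarts all Ceph daemons except Ceph OSDs
  mon_max_pg_per_osd: "600"
  # osd section only
  "osd|osd_max_backfills": "2"
  # a specific Ceph Monitor only
  "mon.a|mon_osd_cache_size": "500"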
extraOpts
Available since MOSK 23.3. Enables specification of
extra options for a setup, includes the deviceLabels parameter. For details,
see ExtraOpts parameters.
ingress
In MOSK 25.1, this section is automatically replaced with ingressConfig.
Enables a custom ingress rule for public access on Ceph services, for
example, Ceph RADOS Gateway. For details, see Configure Ceph Object Gateway TLS.
ingressConfig
Available since MOSK 25.1 to automatically replace the
ingress section.
Enables a custom ingress rule for public access on Ceph services, for
example, Ceph RADOS Gateway. For details, see Configure Ceph Object Gateway TLS.
rbdMirror
Enables pools mirroring between two interconnected clusters. For
details, see Enable Ceph RBD mirroring.
Disables autogeneration of shared Ceph values for OpenStack
deployments. Set to false by default.
mgr
Contains the mgrModules parameter that should list the
following keys:
name - Ceph Manager module name
enabled - flag that defines whether the Ceph Manager module
is enabled
settings.balancerMode - available since MOSK 25.1.
Allows defining balancer mode for the Ceph Manager balancer module.
Possible values are crush-compat or upmap.
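A hedged sketch of the mgr section; the module list is illustrative:
mgr:
  mgrModules:
  - name: pg_autoscaler
    enabled: true
  - name: balancer
    enabled: true
    settings:
      balancerMode: upmap # available since MOSK 25.1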
roles
Specifies the mon, mgr, or rgw daemon to be installed on
a Ceph node. You can place the daemons on any nodes upon your
decision. Consider the following recommendations:
The recommended number of Ceph Monitors in a Ceph cluster is 3.
Therefore, at least 3 Ceph nodes must contain the mon item in
the roles parameter.
The number of Ceph Monitors must be odd.
Do not add more than 2 Ceph Monitors at a time and wait until the
Ceph cluster is Ready before adding more daemons.
For better HA and fault tolerance, the number of mgr roles
must equal the number of mon roles. Therefore, we recommend
labeling at least 3 Ceph nodes with the mgr role.
If rgw roles are not specified, all rgw daemons will spawn
on the same nodes as the mon daemons.
If a Ceph node contains a mon role, the Ceph Monitor Pod
deploys on this node.
If a Ceph node contains a mgr role, it informs the Ceph
Controller that a Ceph Manager can be deployed on the node.
Rook Operator selects the first available node to deploy the
Ceph Manager on it:
Before MOSK 23.1, only one Ceph Manager is
deployed on a cluster.
Since MOSK 23.1, two Ceph Managers, active and
stand-by, are deployed on a cluster.
If you assign the mgr role to three recommended Ceph nodes,
one back-up Ceph node is available to redeploy a failed Ceph Manager
in case of a server outage.
storageDevices
Specifies the list of devices to use for Ceph OSD deployment.
Includes the following parameters:
Note
Since MOSK 23.3, Mirantis recommends migrating
all storageDevices items to by-id symlinks as persistent
device identifiers.
fullPath - a storage device symlink. Accepts the
following values:
Since MOSK 23.3, the device by-id symlink
that contains the serial number of the physical device and does
not contain wwn. For example,
/dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543.
The by-id symlink should be equal to one of the items in the
status.providerStatus.hardware.storage.byIDs list of the Machine status.
Mirantis recommends using this field for defining by-id
symlinks.
The device by-path symlink. For example,
/dev/disk/by-path/pci-0000:00:11.4-ata-3. Since
MOSK 23.3, Mirantis does not recommend specifying storage
devices with device by-path symlinks because such identifiers
are not persistent and can change at node boot.
This parameter is mutually exclusive with name.
name - a storage device name. Accepts the following values:
The device name, for example, sdc. Since MOSK
23.3, Mirantis does not recommend specifying storage devices
with device names because such identifiers are not persistent
and can change at node boot.
The device by-id symlink that contains the serial
number of the physical device and does not contain wwn.
For example,
/dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543.
The by-id symlink should be equal to one of the items in the
status.providerStatus.hardware.storage.byIDs list of the Machine status.
Since MOSK 23.3, Mirantis recommends using
the fullPath field for defining by-id symlinks instead.
This parameter is mutually exclusive with fullPath.
config - a map of device configurations that must contain a
mandatory deviceClass parameter set to hdd, ssd, or
nvme. The device class must be defined in a pool. The map can
optionally contain a metadata device, for example:
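A hedged sketch of a storageDevices item with such a config; the device paths
reuse the examples from this section:
storageDevices:
- fullPath: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
  config:
    deviceClass: ssd
    metadataDevice: /dev/bluedb/meta_1
    osdsPerDevice: "2"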
The underlying storage format to use for Ceph OSDs is BlueStore.
The metadataDevice parameter accepts a device name or logical
volume path for the BlueStore device. Mirantis recommends using
logical volume paths created on nvme devices. For devices
partitioning on logical volumes, see Create a custom bare metal host profile.
The osdsPerDevice parameter accepts natural numbers as strings
and allows splitting one device into several Ceph OSD
daemons. Mirantis recommends using this parameter only for ssd
or nvme disks.
Optional. Available since MOSK 25.1. Specifies
the custom monitor endpoint for the node on which the monitor is placed.
The custom monitor endpoint can be equal, for example, to an IP address
from the Ceph public network range.
Specifies a string with a valid label selector expression to
select machines to which the node spec must be applied. Mutually
exclusive with nodes parameter. For example:
name
Mandatory. Specifies the pool name as a prefix for each Ceph block pool. The
resulting Ceph block pool name will be <name>-<deviceClass>.
useAsFullName
Optional. Enables Ceph block pool to use only the name value as a name.
The resulting Ceph block pool name will be <name> without the
deviceClass suffix.
role
Mandatory. Specifies the pool role and is used mostly for MOSK pools.
default
Mandatory. Defines if the pool and dependent StorageClass should be set as
default. Must be enabled only for one pool.
deviceClass
Mandatory. Specifies the device class for the defined pool. Possible values are
hdd, ssd, and nvme.
replicated
Mandatory, mutually exclusive with erasureCoded. Includes the following parameters:
size - the number of pool replicas.
targetSizeRatio - Optional. A float percentage
from 0.0 to 1.0, which specifies the expected consumption
of the total Ceph cluster capacity. The default values are as follows:
The default ratio of the Ceph Object Storage dataPool is
10.0%.
failureDomain
Mandatory. The failure domain across which the replicas or chunks of data will
be spread. Set to host by default. The list of possible
recommended values includes: host, rack, room, and
datacenter.
Caution
Mirantis does not recommend using the following
intermediate topology keys: pdu, row, chassis. Consider
the rack topology instead. The osd failure domain is
prohibited.
mirroring
Optional. Enables the mirroring feature for the defined pool.
Includes the mode parameter that can be set to pool or
image. For details, see Enable Ceph RBD mirroring.
A Kubernetes cluster only supports increase of storage
size.
rbdDeviceMapOptions
Optional. Not updatable as it applies only once. Specifies custom
rbd device map options to use with the StorageClass of a
corresponding pool. Allows customizing the Kubernetes CSI driver
interaction with Ceph RBD for the defined StorageClass. For the
available options, see Ceph documentation: Kernel RBD (KRBD) options.
parameters
Optional. Available since MOSK 23.1. Specifies the
key-value map for the parameters of the Ceph pool. For details,
see Ceph documentation: Set Pool values.
reclaimPolicy
Optional. Available since MOSK 23.3. Specifies reclaim
policy for the underlying StorageClass of the pool.
Accepts Retain and Delete values. Default is Delete
if not set.
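Putting the parameters above together, a hedged sketch of a replicated pool
entry with illustrative names and values:
pools:
- name: volumes
  role: volumes
  deviceClass: hdd
  default: false
  replicated:
    size: 3
    targetSizeRatio: 0.4
  failureDomain: host
  reclaimPolicy: Delete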
To configure additional required pools for MOSK,
see Add a Ceph cluster.
Caution
Since Ceph Pacific, Ceph CSI driver does not propagate the
777 permission on the mount point of persistent volumes based on any
StorageClass of the Ceph pool.
dataPool
Mutually exclusive with the zone parameter. Object storage data pool
spec that should only contain replicated or erasureCoded and
failureDomain parameters. The failureDomain parameter may be
set to osd or host, defining the failure domain across which
the data will be spread. For dataPool, Mirantis recommends using an
erasureCoded pool. For details, see
Rook documentation: Erasure coding.
For example:
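A minimal sketch of such a dataPool definition, assuming the erasure-coding
fields follow the upstream Rook layout (dataChunks and codingChunks):
dataPool:
  failureDomain: host
  erasureCoded:
    dataChunks: 2
    codingChunks: 1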
metadataPool
Mutually exclusive with the zone parameter. Object storage metadata
pool spec that should only contain replicated and failureDomain
parameters. The failureDomain parameter may be set to osd or
host, defining the failure domain across which the data will be
spread. Can use only replicated settings. For example:
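A minimal sketch of such a metadataPool definition:
metadataPool:
  failureDomain: host
  replicated:
    size: 3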
where replicated.size is the number of full copies of data on
multiple nodes.
Warning
When using the non-recommended Ceph pools replicated.size of
less than 3, Ceph OSD removal cannot be performed. The minimal replica
size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is
1, and if replicated.size is 3, then the minimal replica size
is 2. The replica size of 1 allows Ceph having PGs with only one
Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED
health warning that blocks Ceph OSD removal. Mirantis recommends setting
replicated.size to 3 for each Ceph pool.
gateway
The gateway settings corresponding to the rgw daemon settings.
Includes the following parameters:
port - the port on which the Ceph RGW service listens for HTTP
connections.
securePort - the port on which the Ceph RGW service listens for HTTPS
connections.
instances - the number of pods in the Ceph RGW ReplicaSet. If
allNodes is set to true, a DaemonSet is created instead.
Note
Mirantis recommends using 2 instances for Ceph Object Storage.
allNodes - defines whether to start the Ceph RGW pods as a
DaemonSet on all nodes. The instances parameter is ignored if
allNodes is set to true.
Defines whether to delete the data and metadata pools in the rgw
section if the object storage is deleted. Set this parameter to true
if you need to store data even if the object storage is deleted.
However, Mirantis recommends setting this parameter to false.
objectUsers and buckets
Optional. To create new Ceph RGW resources, such as buckets or users,
specify the following keys. Ceph Controller will automatically create
the specified object storage users and buckets in the Ceph cluster.
objectUsers - a list of user specifications to create for object
storage. Contains the following fields:
name - a user name to create.
displayName - the Ceph user name to display.
capabilities - user capabilities:
user - admin capabilities to read/write Ceph Object Store
users.
bucket - admin capabilities to read/write Ceph Object Store
buckets.
metadata - admin capabilities to read/write Ceph Object Store
metadata.
usage - admin capabilities to read/write Ceph Object Store
usage.
zone - admin capabilities to read/write Ceph Object Store
zones.
users - a list of strings that contain user names to create for
object storage.
Note
This field is deprecated. Use objectUsers
instead. If users is specified, it will be automatically
transformed to the objectUsers section.
buckets - a list of strings that contain bucket names to create
for object storage.
zone
Optional. Mutually exclusive with metadataPool and dataPool.
Defines the Ceph Multisite zone where the object storage must be placed.
Includes the name parameter that must be set to one of the zones
items. For details, see Enable multisite for Ceph RGW Object Storage.
Optional. Available since MOSK 25.1. Flag to determine
that a TLS certificate for accessing the Ceph RGW endpoint is used but
not exposed in spec. For example:
The operator must manually provide TLS configuration using the
rgw-ssl-certificate secret in the rook-ceph namespace of the
managed cluster. The secret object must have the following structure:
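A hedged sketch of such a secret, based on the cert field described below;
the base64 payload is a placeholder:
apiVersion: v1
kind: Secret
metadata:
  name: rgw-ssl-certificate
  namespace: rook-ceph
data:
  cert: <base64-encoded bundle with the server TLS key, TLS cert, and cacert>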
When removing an already existing SSLCert block, no additional actions
are required, because this block uses the same rgw-ssl-certificate secret
in the rook-ceph namespace.
When adding a new secret directly without exposing it in spec, the following
rules apply:
cert - base64 representation of a file with the server TLS key,
server TLS cert, and cacert.
deviceLabels
Available since MOSK 23.3.
A key-value setting used to assign a specification label to any
available device on a specific node. These labels can then be
utilized within nodeGroups or node definitions to eliminate
the need to specify different devices for each node individually.
Additionally, it helps in avoiding the use of device names,
facilitating the grouping of nodes with similar labels.
customDeviceClasses
Available since MOSK 23.3 as TechPreview.
A list of custom device class names to use in the specification.
Enables you to specify the custom names different from the default
ones, which include ssd, hdd, and nvme, and use
them in nodes and pools definitions.
realms
List of realms to use, represents the realm namespaces. Includes the
following parameters:
name - the realm name.
pullEndpoint - optional, required only when the master zone is in
a different storage cluster. The endpoint, access key, and system key
of the system user from the realm to pull from. Includes the
following parameters:
endpoint - the endpoint of the master zone in the master zone
group.
accessKey - the access key of the system user from the realm to
pull from.
secretKey - the system key of the system user from the realm to
pull from.
zoneGroups
Technical Preview
The list of zone groups for realms. Includes the following parameters:
name - the zone group name.
realmName - the realm namespace name to which the zone group
belongs.
zones
Technical Preview
The list of zones used within one zone group. Includes the following
parameters:
name - the zone name.
metadataPool - the settings used to create the Object Storage
metadata pools. Must use replication. For details, see
Pool parameters.
dataPool - the settings to create the Object Storage data pool.
Can use replication or erasure coding. For details, see
Pool parameters.
zoneGroupName - the zone group name.
endpointsForZone - available since MOSK 24.2.
The list of all endpoints in the zone group.
If you use an ingress proxy for RGW, the list of endpoints must contain
the FQDN or IP address used to access RGW. By default, if no ingress proxy
is used, the list of endpoints is set to the IP address of the RGW
external service. Endpoints must follow the HTTP URL format.
Specifies health check settings for Ceph daemons. Contains the
following parameters:
status - configures health check settings for Ceph health
mon - configures health check settings for Ceph Monitors
osd - configures health check settings for Ceph OSDs
Each parameter allows defining the following settings:
disabled - a flag that disables the health check.
interval - an interval in seconds or minutes for the health
check to run. For example, 60s for 60 seconds.
timeout - a timeout for the health check in seconds or minutes.
For example, 60s for 60 seconds.
livenessProbe
Key-value parameter with liveness probe settings for the defined
daemon types. Can be one of the following: mgr, mon, osd,
or mds. Includes the disabled flag and the probe
parameter. The probe parameter accepts the following options:
initialDelaySeconds - the number of seconds after the container
has started before the liveness probes are initiated. Integer.
timeoutSeconds - the number of seconds after which the probe
times out. Integer.
periodSeconds - the frequency (in seconds) to perform the
probe. Integer.
successThreshold - the minimum consecutive successful probes
for the probe to be considered successful after a failure. Integer.
failureThreshold - the minimum consecutive failures for the
probe to be considered failed after having succeeded. Integer.
Note
Ceph Controller specifies the following livenessProbe
defaults for mon, mgr, osd, and mds (if CephFS
is enabled):
5 for timeoutSeconds
5 for failureThreshold
startupProbe
Key-value parameter with startup probe settings for the defined
daemon types. Can be one of the following: mgr, mon, osd,
or mds. Includes the disabled flag and the probe
parameter. The probe parameter accepts the following options:
timeoutSeconds - the number of seconds after which the probe
times out. Integer.
periodSeconds - the frequency (in seconds) to perform the
probe. Integer.
successThreshold - the minimum consecutive successful probes
for the probe to be considered successful after a failure. Integer.
failureThreshold - the minimum consecutive failures for the
probe to be considered failed after having succeeded. Integer.
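For illustration, a hedged sketch of these settings, assuming the layout
mirrors the upstream Rook healthCheck schema; the section names and values
are assumptions to verify against the product reference:
healthCheck:
  daemonHealth:
    mon:
      disabled: false
      interval: 45s
    osd:
      interval: 60s
    status:
      interval: 60s
  livenessProbe:
    mon:
      disabled: false
      probe:
        timeoutSeconds: 5
        periodSeconds: 3
        failureThreshold: 5
  startupProbe:
    osd:
      probe:
        timeoutSeconds: 5
        failureThreshold: 30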
Once you enable Ceph Object Gateway (radosgw) as described in
Enable Ceph RGW Object Storage, you can configure the Transport Layer Security (TLS)
protocol for a Ceph Object Gateway public endpoint using the following options:
Using MOSK TLS, if it is enabled and exposes its
certificates and domain for Ceph.
In this case, Ceph Object Gateway will automatically create an ingress rule
with MOSK certificates and domain to access the Ceph
Object Gateway public endpoint.
Therefore, you only need to reach the Ceph Object Gateway public and internal
endpoints and set the CA certificates for a trusted TLS connection.
Using custom ingress specified in the KaaSCephCluster CR. In this
case, Ceph Object Gateway public endpoint will use the public domain
specified using the ingress parameters.
Caution
External Ceph Object Gateway service is not supported and will
be deleted during update. If your system already uses endpoints of an
external Ceph Object Gateway service, reconfigure them to the ingress
endpoints.
Caution
When using a custom or OpenStack ingress, make sure to configure
the DNS name for RGW to target an external IP address of that ingress.
If there is no OpenStack or custom ingress available, point the DNS to
an external load balancer of RGW.
Note
Since MOSK 23.3, if the cluster has
tls-proxy enabled, TLS certificates specified in ingress objects,
including those configured in the KaaSCephCluster specification,
are disregarded. Instead, common certificates are applied to all ingresses
from the OpenStackDeployment object. This implies that tlsCert and
other ingress certificates specified in KaaSCephCluster are ignored,
and the common certificate from the OpenStackDeployment object is used.
This section also describes how to specify a custom public endpoint for the
Object Storage service.
To configure Ceph Object Gateway TLS:
Verify whether MOSK TLS is enabled. The
spec.features.ssl.public_endpoints section should be specified in the
OpenStackDeployment CR.
To generate an SSL certificate for internal usage, verify that the
gateway securePort parameter is specified in the KaasCephCluster CR.
For details, see Enable Ceph RGW Object Storage.
Select from the following options:
Since MOSK 25.1
Configure TLS for Ceph Object Gateway using a custom ingressConfig:
TLS configuration for ingress including certificates. Contains the following
parameters:
cacert
The Certificate Authority (CA) certificate, used for the ingress rule
TLS support.
tlsCert
The TLS certificate, used for the ingress rule TLS support.
tlsKey
The TLS private key, used for the ingress rule TLS support.
publicDomain
Mandatory. The domain name to use for public endpoints.
Caution
The default ingress controller does not support publicDomain
values different from the OpenStack ingress public domain. Therefore,
if you intend to use the default OpenStack Ingress Controller for your
Ceph Object Storage public endpoint, plan to use the same public domain
as your OpenStack endpoints.
hostname
Custom name to override the Ceph Object Storage RGW name for public RGW access.
Public RGW endpoint has the https://<hostname>.<publicDomain> format.
tlsSecretRefName
Optional. Secret name with TLS certs on the managed cluster in the
rook-ceph namespace prepared by the operator. Allows avoiding exposure
of certs directly in spec. Must contain the following format:
When using tlsSecretRefName, remove the following
fields: cacert, tlsCert, and tlsKey.
Description of optional parameters in the ingressConfig section
controllerClassName
Name of the custom Ingress Controller. By default, the
openstack-ingress-nginx class name is specified and Ceph uses the
OpenStack Ingress Controller based on NGINX.
annotations
Extra annotations for the ingress proxy that are a key-value mapping of
strings to add or override ingress rule annotations. For details, see
NGINX Ingress Controller: Annotations.
By default, the following annotations are set:
nginx.ingress.kubernetes.io/rewrite-target is set to /
nginx.ingress.kubernetes.io/upstream-vhost is set to
<rgwName>.rook-ceph.svc
The value for <rgwName> is located in
spec.cephClusterSpec.objectStorage.rgw.name.
Optional annotations:
nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
that disables buffering for ingress to prevent the
413 (Request Entity Too Large) error when uploading large
files using radosgw.
nginx.ingress.kubernetes.io/proxy-body-size: <size> that
increases the default uploading size limit to prevent the
413 (Request Entity Too Large) error when uploading large
files using radosgw. Set the value in MB (m) or KB
(k). For example, 100m.
Note
By default, an ingress rule is created with an internal
Ceph Object Gateway service endpoint as a backend. Also,
rgw_dns_name is specified in the Ceph configuration and is set
to <rgwName>.rook-ceph.svc by default.
You can override rgw_dns_name using the
spec.cephClusterSpec.rookConfig key-value parameter.
In this case, also change the corresponding ingress annotation.
Configuration example with the rgw_dns_name override
For clouds with the publicDomain parameter specified, align the
upstream-vhost ingress annotation with the name of the Ceph
Object Storage and the specified public domain.
Ceph Object Storage requires the upstream-vhost and
rgw_dns_name parameters to be equal. Therefore, override the
default rgw_dns_name with the corresponding ingress annotation
value.
Before MOSK 25.1
Configure Ceph Object Gateway TLS using a custom ingress:
Warning
The rgw section is deprecated and the ingress
parameters are moved under cephClusterSpec.ingress. If you continue
using rgw.ingress, it will be automatically translated into
cephClusterSpec.ingress during the MOSK cluster
release update.
Open the KaasCephCluster CR for editing.
Specify the ingress parameters:
publicDomain - domain name to use for the external service.
Caution
Since MOSK 23.3, the default
ingress controller does not support publicDomain values
different from the OpenStack ingress public domain. Therefore,
if you intend to use the default OpenStack ingress controller
for your Ceph Object Storage public endpoint, plan to use the
same public domain as your OpenStack endpoints.
cacert - Certificate Authority (CA) certificate, used for the
ingress rule TLS support.
tlsCert - TLS certificate, used for the ingress rule TLS support.
tlsKey - TLS private key, used for the ingress rule TLS support.
customIngress
Optional. Includes the following custom Ingress Controller parameters:
className - the custom Ingress Controller class name. If not
specified, the openstack-ingress-nginx class name is used by
default.
nginx.ingress.kubernetes.io/rewrite-target is set to /
nginx.ingress.kubernetes.io/upstream-vhost is set to
<rgwName>.rook-ceph.svc.
The value for <rgwName> is
spec.cephClusterSpec.objectStorage.rgw.name.
Optional annotations:
nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
that disables buffering for ingress to prevent the
413 (Request Entity Too Large) error when uploading large
files using radosgw.
nginx.ingress.kubernetes.io/proxy-body-size: <size> that
increases the default uploading size limit to prevent the
413 (Request Entity Too Large) error when uploading large
files using radosgw. Set the value in MB (m) or KB
(k). For example, 100m.
An ingress rule is by default created with an internal
Ceph Object Gateway service endpoint as a backend. Also,
rgw_dns_name is specified in the Ceph configuration and is set
to <rgwName>.rook-ceph.svc by default. You can override this
option using the spec.cephClusterSpec.rookConfig key-value
parameter. In this case, also change the corresponding ingress
annotation.
For clouds with the publicDomain parameter specified, align
the upstream-vhost ingress annotation with the name of the
Ceph Object Storage and the specified public domain.
Ceph Object Storage requires the upstream-vhost and
rgw_dns_name parameters to be equal. Therefore, override the
default rgw_dns_name with the corresponding ingress annotation
value.
If MOSK TLS is enabled
Obtain the MOSK CA certificate for a trusted connection:
Obtain the internal endpoint name for Ceph Object Gateway:
kubectl -n rook-ceph get svc -l app=rook-ceph-rgw
The internal endpoint for Ceph Object Gateway has the
https://<internal-svc-name>.rook-ceph.svc:<rgw-secure-port>/
format, where <rgw-secure-port> is
spec.rgw.gateway.securePort specified in the
KaaSCephCluster CR.
Substitute <objectStorageName> with the Ceph Object Storage name and
<customPublicEndpoint> with the public endpoint with a custom public
domain.
If one or both endpoints are omitted in the list, add the missing
endpoints to the hostnames list in the zonegroup.json file and
update Ceph Object Gateway zonegroup configuration:
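A hedged sequence of commands for this step, run from the Ceph tools pod; the
zone group name is a placeholder, verify the flags with radosgw-admin help:
radosgw-admin zonegroup get --rgw-zonegroup=<zonegroupName> > zonegroup.json
# edit zonegroup.json and add the missing endpoints to the "hostnames" list
radosgw-admin zonegroup set --rgw-zonegroup=<zonegroupName> --infile zonegroup.json
radosgw-admin period update --commit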
Once done, Ceph Object Gateway becomes available by the custom public endpoint
with an S3 API client, OpenStack Swift CLI, and OpenStack Horizon Containers
plugin.
When you use Ceph Object Gateway server-side encryption (SSE),
unencrypted data sent over HTTPS is stored encrypted by the Ceph Object Gateway
in the Ceph cluster. The current implementation integrates Barbican as a key
management service.
The object storage SSE feature is enabled by default in MOSK
deployments with Barbican. To use object storage SSE, use the AWS CLI S3
client.
In the output, capture the first value as the <user-id>,
which is c63b70134e0845a2b13c3f947880f66a in the above
example.
Specify the ceph-rgw user in the Barbican ACL:
openstack acl user add --user <user-id> <secret-href>
Substitute <user-id> with the corresponding value from the output of
the previous command and <secret-href> with the corresponding value
obtained in step 3.
This section explains how to create an Amazon Simple Storage Service
(Amazon S3 or S3) bucket and set an S3 bucket policy between two Ceph Object
Storage users.
Set a bucket policy for a Ceph Object Storage user
Available since 2.23.1 (Cluster release 12.7.0)
Amazon S3 is an object storage service with different access policies. A bucket
policy is a resource-based policy that grants permissions to a bucket and
objects in it. For more details, see Amazon S3 documentation: Using bucket
policies.
The following procedure illustrates the process of setting a bucket policy for
a bucket (test01) stored in a Ceph Object Storage. The bucket policy
requires at least two users: a bucket owner (user-a) and a bucket user
(user-t). The bucket owner creates the bucket and sets the policy that
regulates access for the bucket user.
Caution
For the user name, use the UUID format without capital letters.
To configure an Amazon S3 bucket policy:
Note
s3cmd is a free command-line tool and client for
uploading, retrieving, and managing data in Amazon S3 and other cloud
storage service providers that use the S3 protocol. You can download the
s3cmd CLI tool from
Amazon S3 tools: Download s3cmd.
Configure the s3cmd client with the user-a credentials:
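A hedged example of a non-interactive configuration; the endpoint and keys
are placeholders:
s3cmd --configure \
  --access_key=<user-a-access-key> \
  --secret_key=<user-a-secret-key> \
  --host=<rgw-public-endpoint> \
  --host-bucket=<rgw-public-endpoint>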
The following procedure illustrates the process of setting a bucket policy for
a bucket between two OpenStack users.
Due to specifics of the Ceph integration with OpenStack projects, you should
configure the bucket policy for OpenStack users indirectly through setting
permissions for corresponding OpenStack projects.
For illustration purposes, we use the following names in the procedure:
test01 for the bucket
user-a, user-t for the OpenStack users
project-a, project-t for the OpenStack projects
To configure an Amazon S3 bucket policy for OpenStack users:
Specify the rookConfig parameter in the cephClusterSpec section of
the KaaSCephCluster custom resource:
Obtain the values from the access and secret fields to connect with
Ceph Object Storage through the s3cmd tool.
Note
s3cmd is a free command-line tool for uploading, retrieving,
and managing data in Amazon S3 and other cloud storage service providers
that use the S3 protocol. You can download the s3cmd tool from
Amazon S3 tools: Download s3cmd.
Create bucket users and configure a bucket policy for the project-t
OpenStack project similarly to the procedure described in
Set a bucket policy for a Ceph Object Storage user.
Ceph integration does not allow providing permissions for OpenStack users
directly. Therefore, you need to set permissions for the project that
corresponds to the user:
Ceph pool target ratio defines for the Placement Group (PG) autoscaler the
amount of data the pools are expected to acquire over time in relation to each
other. You can set initial PG values for each Ceph pool. Otherwise, the
autoscaler starts with the minimum value and scales up, causing a lot of data
to move in the background.
You can allocate several pools to use the same device class, which is a solid
block of available capacity in Ceph. For example, if three pools
(kubernetes-hdd, images-hdd, and volumes-hdd) are set to use the
same device class hdd, you can set the target ratio for Ceph pools to
provide 80% of capacity to the volumes-hdd pool and distribute the
remaining capacity evenly between the two other pools. This way, Ceph pool
target ratio instructs Ceph on when to warn that a pool is running out of free
space and, at the same time, instructs Ceph on how many placement groups Ceph
should allocate/autoscale for a pool for better data distribution.
Ceph pool target ratio is not a constant value and you can change it according
to new capacity plans. Once you specify target ratio, if the PG number of a
pool scales, other pools with specified target ratio will automatically scale
accordingly.
For illustration purposes, the procedure below uses raw capacity of 185 TB
or 189440 GB.
Design Ceph pools with the considered device class upper bounds of the
possible capacity. For example, consider the hdd device class that
contains the following pools:
The kubernetes-hdd pool will contain not more than 2048 GB.
The images-hdd pool will contain not more than 2048 GB.
The volumes-hdd pool will contain 50 GB per VM. The upper bound of
used VMs on the cloud is 204, the pool replicated size is 3.
Therefore, calculate the upper bounds for volumes-hdd:
50 GB per VM * 204 VMs * 3 replicas = 30600 GB
The backup-hdd pool can be calculated as a relative of
volumes-hdd. For example, 1 volumes-hdd storage unit per 5
backup-hdd units.
The vms-hdd is a pool for ephemeral storage Copy on Write (CoW). We
recommend designing the amount of ephemeral data it should store. For
example purposes, we use 500 GB. However, in reality, despite the CoW data
reduction, this value is very optimistic.
Note
If dataPool is replicated and Ceph Object Store is planned for
intensive use, also calculate upper bounds for dataPool.
Calculate target ratio for each considered pool. For example:
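A worked sketch of such a calculation, assuming the backup-hdd pool is
planned at five times the volumes-hdd capacity and using the 189440 GB raw
capacity from above (ratios are rounded):
kubernetes-hdd:  2048 GB / 189440 GB ≈ 0.0108
images-hdd:      2048 GB / 189440 GB ≈ 0.0108
volumes-hdd:    30600 GB / 189440 GB ≈ 0.1615
backup-hdd:    153000 GB / 189440 GB ≈ 0.8076
vms-hdd:          500 GB / 189440 GB ≈ 0.0026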
If required, calculate the target ratio for erasure-coded pools.
Due to erasure-coded pools splitting each object into K data parts
and M coding parts, the total used storage for each object is less
than that in replicated pools. Indeed, M is equal to the number of
OSDs that can be missing from the cluster without the cluster experiencing
data loss. This means that planned data is stored with an overhead
factor of (K+M)/K on the Ceph cluster. For example, if the planned capacity
of an erasure-coded data pool with K=2, M=2 is 200 GB, then the total used
capacity is 200*(2+2)/2, which is 400 GB.
Open the KaasCephCluster CR of a managed cluster for editing:
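A typical way to open the resource for editing, with the placeholder used
throughout this guide:
kubectl -n <managedClusterProjectName> edit kaascephcluster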
In the spec.cephClusterSpec.pools section, specify the calculated
ratios as parameters.target_size_ratio for each considered
erasure-coded pool. For example:
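A hedged sketch of a pool item carrying the calculated ratio; the pool name,
role, and value are illustrative, and the erasure-coding fields assume the
upstream Rook layout:
pools:
- name: ec-data
  role: data
  deviceClass: hdd
  failureDomain: host
  erasureCoded:
    dataChunks: 2
    codingChunks: 1
  parameters:
    target_size_ratio: "0.2"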
Note
The parameters section is a key-value mapping where
the value is of the string type and should be quoted.
After the Ceph pool becomes available, it is automatically specified as an
additional Cinder backend and registered as a new volume type, which you can
use to create Cinder volumes.
The following sections describe how to configure, manage, and verify
specific aspects of a Ceph cluster.
Caution
Before you proceed with any reading or writing operation, first
verify the cluster status using the ceph tool as described in
Verify the Ceph core services.
The Ceph LCM automated operations such as Ceph OSD or Ceph node removal are
performed by creating a corresponding KaaSCephOperationRequest CR that
creates separate CephOsdRemoveRequest requests. It allows for automated
removal of healthy or non-healthy Ceph OSDs from a Ceph cluster and covers the
following scenarios:
Reducing hardware - all Ceph OSDs are up/in but you want to decrease the
number of Ceph OSDs by reducing the number of disks or hosts.
Hardware issues. For example, if a host unexpectedly goes down and will not
be restored, or if a disk on a host goes down and requires replacement.
This section describes the KaaSCephOperationRequest CR creation workflow,
specification, and request status.
If KaaSCephOperationRequest contains information about Ceph OSDs to
remove in a proper format, the information will be validated to eliminate
human error and avoid removing a wrong Ceph OSD.
If the osdRemove.nodes section of KaaSCephOperationRequest is
empty, the Ceph Request Controller will automatically detect Ceph OSDs for
removal, if any. Auto-detection is based not only on the information
provided in the KaaSCephCluster but also on the information from the
Ceph cluster itself.
Once the validation or auto-detection completes, the entire information
about the Ceph OSDs to remove appears in the KaaSCephOperationRequest
object: hosts they belong to, OSD IDs, disks, partitions, and so on. The
request then moves to the ApproveWaiting phase until the Operator
manually specifies the approve flag in the spec.
Manually adding an affirmative approve flag in the
KaaSCephOperationRequest spec. Once done, the Ceph Status Controller
reconciliation pauses until the request is handled and executes the
following:
Stops regular Ceph Controller reconciliation
Removes Ceph OSDs
Runs batch jobs to clean up the device, if possible
Removes host information from the Ceph cluster if the entire Ceph node is
removed
Marks the request with an appropriate result and a description of
any issues that occurred
Note
If the request completes successfully, Ceph Controller
reconciliation resumes. Otherwise, it remains paused until the issue is
resolved.
Device cleanup jobs are not removed automatically and are kept in
the ceph-lcm-mirantis namespace along with pods containing
information about the executed actions. The jobs have the following
labels:
Additionally, jobs are labeled with disk names that will be cleaned up,
such as vdb=true. You can remove a single job or a group of jobs
using any label described above, such as host, disk, and so on.
This section describes the KaaSCephOperationRequest CR specification used
to automatically create a CephOsdRemoveRequest request. For the procedure
workflow, see Creating a Ceph OSD removal request.
osdRemove
Describes the definition for the CephOsdRemoveRequest spec. For
details on the osdRemove parameters, see the tables below.
kaasCephCluster
Defines the KaaSCephCluster resource on which the KaaSCephOperationRequest
depends. Use the kaasCephCluster parameter if the name or
project of the corresponding Container Cloud cluster differs from the
default one:
approve
Flag that indicates whether a request is ready to execute removal. Can
only be manually enabled by the Operator. For example:
spec:
  osdRemove:
    approve: true
keepOnFail
Flag used to keep requests in handling and not to proceed to the next
request if the Validating or Processing phases failed. The
request will remain in the InputWaiting state until the flag or the
request itself is removed or the request spec is updated.
If the Validation phase fails, you can update the
spec.osdRemove.nodes section in KaaSCephOperationRequest to avoid issues
and re-run the validation. If the Processing phase fails, you can
resolve issues without resuming the Ceph Controller reconciliation and
proceeding to the next request and apply the required actions to keep
cluster data.
For example:
spec:
  osdRemove:
    keepOnFail: true
resolved
Optional. Flag that marks a finished request, even if it failed, to keep
it in history. It allows resuming the Ceph Controller reconciliation
without removing the failed request. The flag is used only by Ceph
Controller and has no effect on request processing. Can only be manually
specified. For example:
spec:
  osdRemove:
    resolved: true
resumeFailed
Optional. Flag used to resume a failed request and proceed with Ceph OSD
removal if the KeepOnFail is set and the request status is
InputWaiting. For example:
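A sketch following the same pattern as the other request flags:
spec:
  osdRemove:
    resumeFailed: true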
completeCleanUp
Flag used to clean up an entire node and drop it from the CRUSH map.
Mutually exclusive with cleanupByDevice and cleanupByOsdId.
cleanupByDevice
List that describes devices to clean up by name or device path as they
were specified in KaaSCephCluster. Mutually exclusive with
completeCleanUp and cleanupByOsdId. Includes the following
parameters:
name - name of the device to remove from the Ceph cluster.
Mutually exclusive with path.
path - the by-path symlink of the device to remove from the Ceph cluster.
Mutually exclusive with name. Also supports device removal by the
by-id symlink.
Warning
Since MOSK 23.3, Mirantis does not recommend setting device
name or device by-path symlink in the cleanupByDevice field
as these identifiers are not persistent and can change at node boot. Remove
Ceph OSDs with by-id symlinks specified in the path field or use
cleanupByOsdId instead.
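For illustration, consider the following hedged sketch of the osdRemove.nodes
section that combines the cleanup methods described above; the field layout is
inferred from those descriptions and the device identifiers match the example
explained below:
spec:
  osdRemove:
    nodes:
      node-a:
        completeCleanUp: true
      node-b:
        cleanupByOsdId:
        - 1
        - 15
        - 25
      node-c:
        cleanupByDevice:
        - name: sdb
        - path: /dev/disk/by-path/pci-0000:00:1c.5
        - path: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS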
For node-a, full cleanup, including all OSDs on the node, node drop from
the CRUSH map, and cleanup of all disks used for Ceph OSDs on this node.
For node-b, cleanup of Ceph OSDs with IDs 1, 15, and 25
along with the related disk information.
For node-c, cleanup of the device with name sdb, the device with
the by-path symlink /dev/disk/by-path/pci-0000:00:1c.5, and the device with
the by-id symlink /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS,
along with dropping of the OSDs running on these devices.
This section describes the status.osdRemoveStatus.removeInfo fields of the
KaaSCephOperationRequest CR that you can use to review a Ceph OSD or node
removal phases. The following diagram represents the phases flow:
phase
Describes the current request phase that can be one of:
Pending - the request is created and placed in the request queue.
Validation - the request is taken from the queue and the provided
information is being validated.
ApproveWaiting - the request passed the validation phase, is ready
to execute, and is waiting for user confirmation through the approve
flag.
Processing - the request is executing following the next phases:
Pending - marking the current Ceph OSD for removal.
Rebalancing - the Ceph OSD is moved out, waiting until it is
rebalanced. If the current Ceph OSD is down or already out, the next
phase takes place.
Removing - purging the Ceph OSD and its authorization key.
Removed - the Ceph OSD has been successfully removed.
Failed - the Ceph OSD failed to remove.
Completed - the request executed with no issues.
CompletedWithWarnings - the request executed with non-critical
issues. Review the output, action may be required.
InputWaiting - during the Validation or Processing phases,
critical issues occurred that require attention. If issues occurred
during validation, update osdRemove information, if present, and
re-run validation. If issues occurred during processing, review the
reported issues and manually resolve them.
Failed - the request failed during the Validation or
Processing phases.
removeInfo
The overall information about the Ceph OSDs to remove: final removal
map, issues, and warnings. Once the Processing phase succeeds,
removeInfo will be extended with the removal status for each node
and Ceph OSD. In case of an entire node removal, the status will contain
the status itself and an error message, if any.
The removeInfo.osdMapping field contains information about:
Ceph OSDs removal status.
Batch job reference for the device cleanup: its name, status, and
error, if any. The batch job status for the device cleanup will be
either Failed, Completed, or Skipped. The Skipped
status is used when a host is down, a disk has failed, or an error
occurred while obtaining the ceph-volume information.
Ceph OSD deployment removal status and the related Ceph OSD name. The
status will be either Failed or Removed.
messages
Informational messages describing the reason for the request transition
to the next phase.
conditions
History of spec updates for the request.
Example of status.osdRemoveStatus.removeInfo after
successful Validation
The example above is based on the example spec provided in
KaaSCephOperationRequest OSD removal specification.
During the Validation phase, the provided information was validated and
reflects the final map of the Ceph OSDs to remove:
For node-a, Ceph OSDs with IDs 2, 6, and 11 will be removed
with the related disk and its information: all block devices, names, paths,
and disk class.
For node-b, the Ceph OSDs with IDs 1, 15, and 25 will be
removed with the related disk information.
For node-c, the Ceph OSD with ID 8 will be removed, which is placed
on the specified sdb device. The related partition on the sdf disk,
which is used as the BlueStore metadata device, will be cleaned up keeping
the disk itself untouched. Other partitions on that device will not be
touched.
Example of removeInfo with removeStatus failed by timeout
removeInfo:
  cleanUpMap:
    "node-a":
      completeCleanUp: true
      osdMapping:
        "2":
          removeStatus:
            osdRemoveStatus:
              errorReason: Timeout (30m0s) reached for waiting pg rebalance for osd 2
              status: Failed
          deviceMapping:
            "sdb":
              path: "/dev/disk/by-path/pci-0000:00:0a.0"
              partition: "/dev/ceph-a-vg_sdb/osd-block-b-lv_sdb"
              type: "block"
              class: "hdd"
              zapDisk: true
Note
In case of failures similar to the examples above, review the
ceph-request-controller logs and the Ceph cluster status. Such failures
may simply indicate timeout and retry issues. If no other issues were found,
re-create the request with a new name and skip adding successfully removed
Ceph OSDs or Ceph nodes.
Mirantis Ceph Controller simplifies Ceph cluster management by automating
LCM operations. This section describes how to add, remove, or reconfigure Ceph
nodes.
Note
When adding a Ceph node with the Ceph Monitor role, if any issues occur with
the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead,
named using the next alphabetic character in order. Therefore, the Ceph Monitor
names may not follow the alphabetical order. For example, a, b, d,
instead of a, b, c.
Prepare a new machine for the required managed cluster as described in
Add a machine. During machine preparation, update the settings of the
related bare metal host profile for the Ceph node being replaced with the
required machine devices as described in Create a custom bare metal host profile.
Open the KaasCephCluster CR of a managed cluster for editing:
To use a new Ceph node for a Ceph Monitor or Ceph Manager deployment,
also specify the roles parameter.
Reducing the number of Ceph Monitors is not supported and causes
removal of Ceph Monitor daemons from random nodes.
Removal of the mgr role in the nodes section of the
KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph
Manager from a node, remove it from the nodes spec and manually
delete the mgr pod in the Rook namespace.
Verify that all new Ceph daemons for the specified node have been
successfully deployed in the Ceph cluster. The fullClusterInfo section
should not contain any issues.
status:
  fullClusterInfo:
    daemonsStatus:
      mgr:
        running: a is active mgr
        status: Ok
      mon:
        running: '3/3 mons running: [a b c] in quorum'
        status: Ok
      osd:
        running: '3/3 running: 3 up, 3 in'
        status: Ok
To remove a Ceph node with a mon role, first move the Ceph
Monitor to another node and remove the mon role from the Ceph node as
described in Move a Ceph Monitor daemon to another node.
Open the KaasCephCluster CR of a managed cluster for editing:
Mirantis Ceph Controller simplifies Ceph cluster management by automating LCM
operations. This section describes how to add, remove, or reconfigure Ceph
OSDs.
Manually prepare the required machine devices with LVM2 on the existing
node because BareMetalHostProfile does not support in-place changes.
To add a Ceph OSD to an existing or hot-plugged raw device
If you want to add a Ceph OSD on top of a raw device that already exists
on a node or is hot-plugged, add the required device using the following
guidelines:
You can add a raw device to a node during node deployment.
If a node supports adding devices without node reboot, you can hot plug
a raw device to a node.
If a node does not support adding devices without node reboot, you can
hot plug a raw device during node shutdown. In this case, complete the
following steps:
Enable maintenance mode on the managed cluster.
Turn off the required node.
Attach the required raw device to the node.
Turn on the required node.
Disable maintenance mode on the managed cluster.
Open the KaasCephCluster CR of a managed cluster for editing:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes.<machineName>.storageDevices section, specify the
parameters for a Ceph OSD as required. For the parameters description, see
Node parameters.
The example configuration of the nodes section with the new node:
Since MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config: # existing item
        deviceClass: hdd
      fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
    - config: # new item
        deviceClass: hdd
      fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC
Before MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config: # existing item
        deviceClass: hdd
      name: sdb
    - config: # new item
        deviceClass: hdd
      name: sdc
Warning
Since MOSK 23.3, Mirantis highly recommends
using the non-wwn by-id symlinks to specify storage devices in the
storageDevices list.
When using the non-recommended Ceph pools replicated.size of
less than 3, Ceph OSD removal cannot be performed. The minimal replica
size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is
1, and if replicated.size is 3, then the minimal replica size
is 2. The replica size of 1 allows Ceph having PGs with only one
Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED
health warning that blocks Ceph OSD removal. Mirantis recommends setting
replicated.size to 3 for each Ceph pool.
Open the KaasCephCluster CR of a managed cluster for editing:
Substitute <managedClusterProjectName> with the corresponding value.
Remove the required Ceph OSD specification from the
spec.cephClusterSpec.nodes.<machineName>.storageDevices list:
The example configuration of the nodes section with the new node:
Since MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config:
        deviceClass: hdd
      fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
    - config: # remove the entire item entry from storageDevices list
        deviceClass: hdd
      fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC
Before MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config:
        deviceClass: hdd
      name: sdb
    - config: # remove the entire item entry from storageDevices list
        deviceClass: hdd
      name: sdc
Create a YAML template for the KaaSCephOperationRequest CR. Select from
the following options:
Remove Ceph OSD by device name, by-path symlink, or by-id symlink:
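A hedged template; the apiVersion and field layout are assumptions to check
against the product reference, and the placeholders match the substitution
note below:
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephOperationRequest
metadata:
  name: remove-osd-request
  namespace: <managedClusterProjectName>
spec:
  kaasCephCluster:
    name: <kaasCephClusterName>
    namespace: <managedClusterProjectName>
  osdRemove:
    nodes:
      <machineName>:
        cleanupByDevice:
        - path: /dev/disk/by-id/<deviceByID>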
Substitute <managedClusterProjectName> with the corresponding cluster
namespace and <kaasCephClusterName> with the corresponding
KaaSCephCluster name.
Warning
Since MOSK 23.3, Mirantis does not recommend setting device
name or device by-path symlink in the cleanupByDevice field
as these identifiers are not persistent and can change at node boot. Remove
Ceph OSDs with by-id symlinks specified in the path field or use
cleanupByOsdId instead.
Since MOSK 23.1, cleanupByDevice is not
supported if a device was physically removed from a node. Therefore,
use cleanupByOsdId instead. For details, see
Remove a failed Ceph OSD by Ceph OSD ID.
Before MOSK
23.1, if the storageDevice item was specified with by-id,
specify the path parameter in the cleanupByDevice section
instead of name.
If the storageDevice item was specified with a by-path device
path, specify the path parameter in the cleanupByDevice
section instead of name.
Add, remove, or reconfigure Ceph OSDs with metadata devices
Mirantis Ceph Controller simplifies Ceph cluster management by automating LCM
operations. This section describes how to add, remove, or reconfigure Ceph
OSDs with a separate metadata device.
From the Ceph disks defined in the BareMetalHostProfile object that was
configured using the Configure Ceph disks in a host profile procedure, select one disk for
data and one logical volume for metadata of a Ceph OSD to be added to the
Ceph cluster.
Note
If you add a new disk after machine provisioning, manually
prepare the required machine devices using Logical Volume Manager (LVM) 2
on the existing node because BareMetalHostProfile does not support
in-place changes.
To add a Ceph OSD to an existing or hot-plugged raw device
If you want to add a Ceph OSD on top of a raw device that already exists
on a node or is hot-plugged, add the required device using the following
guidelines:
You can add a raw device to a node during node deployment.
If a node supports adding devices without node reboot, you can hot plug
a raw device to a node.
If a node does not support adding devices without node reboot, you can
hot plug a raw device during node shutdown. In this case, complete the
following steps:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes.<machineName>.storageDevices section, specify the
parameters for a Ceph OSD as required. For the parameters description, see
Node parameters.
The example configuration of the nodes section with the new node:
Since MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config: # existing item
        deviceClass: hdd
      fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
    - config: # new item
        deviceClass: hdd
        metadataDevice: /dev/bluedb/meta_1
      fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC
Before MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config: # existing item
        deviceClass: hdd
      name: sdb
    - config: # new item
        deviceClass: hdd
        metadataDevice: /dev/bluedb/meta_1
      name: sdc
Warning
Since MOSK 23.3, Mirantis highly recommends
using the non-wwn by-id symlinks to specify storage devices in the
storageDevices list.
Ceph OSD removal implies the usage of the
KaaSCephOperationRequest custom resource (CR). For workflow overview,
spec and phases description, see High-level workflow of Ceph OSD or node removal.
Warning
When using the non-recommended Ceph pools replicated.size of
less than 3, Ceph OSD removal cannot be performed. The minimal replica
size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is
1, and if replicated.size is 3, then the minimal replica size
is 2. A replica size of 1 allows Ceph to have PGs with only one
Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED
health warning that blocks Ceph OSD removal. Mirantis recommends setting
replicated.size to 3 for each Ceph pool.
Open the KaasCephCluster object of the managed cluster for editing:
Substitute <managedClusterProjectName> with the corresponding value.
Remove the required Ceph OSD specification from the
spec.cephClusterSpec.nodes.<machineName>.storageDevices list:
The example configuration of the nodes section with the new node:
Since MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config:
        deviceClass: hdd
      fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
    - config: # remove the entire item entry from storageDevices list
        deviceClass: hdd
        metadataDevice: /dev/bluedb/meta_1
      fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC
Before MOSK 23.3
nodes:
  kaas-node-5bgk6:
    roles:
    - mon
    - mgr
    storageDevices:
    - config:
        deviceClass: hdd
      name: sdb
    - config: # remove the entire item entry from storageDevices list
        deviceClass: hdd
        metadataDevice: /dev/bluedb/meta_1
      name: sdc
Create a YAML template for the KaaSCephOperationRequest CR. For example:
Substitute <managedClusterProjectName> with the corresponding cluster
namespace and <kaasCephClusterName> with the corresponding
KaaSCephCluster name.
Warning
Since MOSK 23.3, Mirantis does not recommend setting device
name or device by-path symlink in the cleanupByDevice field
as these identifiers are not persistent and can change at node boot. Remove
Ceph OSDs with by-id symlinks specified in the path field or use
cleanupByOsdId instead.
Since MOSK 23.1,
cleanupByDevice is not supported if a device was physically
removed from a node. Therefore, use cleanupByOsdId instead. For
details, see Remove a failed Ceph OSD by Ceph OSD ID.
Before MOSK 23.1,
if the storageDevice item was specified with by-id, specify
the path parameter in the cleanupByDevice section instead of
name.
If the storageDevice item was specified with a by-path device
path, specify the path parameter in the cleanupByDevice section
instead of name.
Apply the template on the management cluster in the corresponding namespace:
kubectl apply -f remove-osd-<machineName>-sdb.yaml
Verify that the corresponding request has been created:
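For example, assuming the request was created in the managed cluster namespace on the management cluster:
kubectl -n <managedClusterProjectName> get kaascephoperationrequest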
Reconfigure a partition of a Ceph OSD metadata device¶
There is no hot reconfiguration procedure for existing Ceph OSDs. To
reconfigure an existing Ceph node, remove and re-add a Ceph OSD with a
metadata device using the following options:
Since Container Cloud 2.24.0, if metadata device partitions are specified
in the BareMetalHostProfile object as described in Configure Ceph disks in a host profile,
the metadata device definition is an LVM path in metadataDevice of the
KaaSCephCluster object.
Therefore, automated LCM will clean up the logical volume without removal
and it can be reused. For this reason, to reconfigure a partition of a Ceph
OSD metadata device:
Before MOSK 23.2 or if metadata device partitions are not
specified in the BareMetalHostProfile object as described in
Configure Ceph disks in a host profile, the most common definition of a metadata device is a
full device name (by-path or by-id) in metadataDevice of the
KaaSCephCluster object for Ceph OSD. For example,
metadataDevice:/dev/nvme0n1. In this case, to reconfigure a partition
of a Ceph OSD metadata device:
Remove a Ceph OSD from the Ceph cluster as described in
Remove a Ceph OSD with a metadata device. Automated LCM will clean
up the data device and will remove the metadata device partition for the
required Ceph OSD.
Reconfigure the metadata device partition manually to use it during
addition of a new Ceph OSD.
Manual reconfiguration of a metadata device partition
Log in to the Ceph node running a Ceph OSD to reconfigure.
Find the required metadata device used for Ceph OSDs that should
have LVM partitions with the osd--db substring:
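For example, you can list such logical volumes with lsblk or lvs (a sketch; the exact filtering depends on your device naming):
lsblk -o NAME,SIZE,TYPE | grep "osd--db"
lvs -o lv_name,vg_name,lv_size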
Capture the volume group UUID and logical volume sizes. In the
example above, the volume group UUID is
ceph--7831901d--398e--415d--8941--e78486f3b019 and the size
is 16G.
Capture the volume group with the name that matches the prefix of
LVM partitions of the metadata device. In the example above, the
required volume group is
ceph-7831901d-398e-415d-8941-e78486f3b019.
Make a manual LVM partitioning for the new Ceph OSD. Create a new
logical volume in the obtained volume group:
lvcreate -L <lvSize> -n <lvName> <vgName>
Substitute the following parameters:
<lvSize> with the previously obtained logical volume size.
In the example above, it is 16G.
<lvName> with a new logical volume name. For example,
meta_1.
<vgName> with the previously obtained volume group name.
In the example above, it is
ceph-7831901d-398e-415d-8941-e78486f3b019.
Note
Manually created partitions can be removed only
manually, or during a complete metadata disk removal, or during
the Machine object removal or re-provisioning.
Add the same Ceph OSD but with a modified configuration and manually
created logical volume of the metadata device as described in
Add a Ceph OSD with a metadata device.
For example, instead of metadataDevice:/dev/bluedb/meta_1 define
metadataDevice:/dev/ceph-7831901d-398e-415d-8941-e78486f3b019/meta_1
that was manually created in the previous step.
After a physical disk replacement, you can use Ceph LCM API to redeploy
a failed Ceph OSD. The common flow of replacing a failed Ceph OSD is as
follows:
Remove the obsolete Ceph OSD from the Ceph cluster by device name, by Ceph
OSD ID, or by path.
Add a new Ceph OSD on the new disk to the Ceph cluster.
Remove a failed Ceph OSD by device name, path, or ID¶
Warning
The procedure below presupposes that the operator knows the exact
device name, by-path, or by-id of the replaced device, as well as on
which node the replacement occurred.
Warning
Since Container Cloud 2.23.1 (Cluster release 12.7.0),
a Ceph OSD removal using by-path, by-id, or device name is not
supported if a device was physically removed from a node. Therefore, use
cleanupByOsdId instead. For details, see
Remove a failed Ceph OSD by Ceph OSD ID.
Warning
Since MOSK 23.3, Mirantis does not recommend setting device
name or device by-path symlink in the cleanupByDevice field
as these identifiers are not persistent and can change at node boot. Remove
Ceph OSDs with by-id symlinks specified in the path field or use
cleanupByOsdId instead.
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, remove the required device:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceName> # remove the entire item from storageDevices list
          # fullPath: <deviceByPath> if device is specified with symlink instead of name
          config:
            deviceClass: hdd
Substitute <machineName> with the machine name of the node where the
device <deviceName> or <deviceByPath> is going to be replaced.
Save KaaSCephCluster and close the editor.
Create a KaaSCephOperationRequest CR template and save it as
replace-failed-osd-<machineName>-<deviceName>-request.yaml:
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephOperationRequest
metadata:
  name: replace-failed-osd-<machineName>-<deviceName>
  namespace: <managedClusterProjectName>
spec:
  osdRemove:
    nodes:
      <machineName>:
        cleanupByDevice:
        - name: <deviceName>
          # If a device is specified with by-path or by-id instead of
          # name, path: <deviceByPath> or <deviceById>.
  kaasCephCluster:
    name: <kaasCephClusterName>
    namespace: <managedClusterProjectName>
Substitute <kaasCephClusterName> with the corresponding
KaaSCephCluster resource from the <managedClusterProjectName>
namespace.
Deploy a new device after removal of a failed one¶
Note
You can spawn Ceph OSD on a raw device, but it must be clean and
without any data or partitions. If you want to add a device that was in use,
also ensure it is raw and clean. To clean up all data and partitions from a
device, refer to official Rook documentation.
If you want to add a Ceph OSD on top of a raw device that already exists
on a node or is hot-plugged, add the required device using the following
guidelines:
You can add a raw device to a node during node deployment.
If a node supports adding devices without node reboot, you can hot plug
a raw device to a node.
If a node does not support adding devices without node reboot, you can
hot plug a raw device during node shutdown. In this case, complete the
following steps:
Enable maintenance mode on the managed cluster.
Turn off the required node.
Attach the required raw device to the node.
Turn on the required node.
Disable maintenance mode on the managed cluster.
Open the KaasCephCluster CR of a managed cluster for editing:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, add a new device:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - fullPath: <deviceByID> # Since 2.25.0 (17.0.0) if device is supposed to be added with by-id
          # name: <deviceByID> # Prior MCC 2.25.0 if device is supposed to be added with by-id
          # fullPath: <deviceByPath> # if device is supposed to be added with by-path
          config:
            deviceClass: hdd
Substitute <machineName> with the machine name of the node where device
<deviceName> or <deviceByPath> is going to be added as a Ceph OSD.
Verify that the new Ceph OSD has appeared in the Ceph cluster and is in
and up. The fullClusterInfo section should not contain any issues.
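For example, you can verify the Ceph OSD state from the rook-ceph-tools Pod of the managed cluster (a sketch; the exact Pod or deployment name may differ in your environment):
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree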
The document describes various scenarios of a Ceph OSD outage and recovery or
replacement. More specifically, this section describes how to replace a failed
Ceph OSD with a metadata device:
If the metadata device is specified as a logical volume in the
BareMetalHostProfile object and defined in the KaaSCephCluster
object as a logical volume path
If the metadata device is specified in the KaaSCephCluster object as a
device name
Note
Ceph OSD replacement implies the usage of the
KaaSCephOperationRequest custom resource (CR). For workflow overview,
spec and phases description, see High-level workflow of Ceph OSD or node removal.
Replace a failed Ceph OSD with a metadata device as a logical volume path¶
You can apply the below procedure in the following cases:
A Ceph OSD failed without data or metadata device outage. In this case,
first remove a failed Ceph OSD and clean up all corresponding disks and
partitions. Then add a new Ceph OSD to the same data and metadata paths.
A Ceph OSD failed with data or metadata device outage. In this case, you
also first remove a failed Ceph OSD and clean up all corresponding disks and
partitions. Then add a new Ceph OSD to a newly replaced data device with the
same metadata path.
Note
The below procedure also applies to manually created metadata
partitions.
Remove a failed Ceph OSD by ID with a defined metadata device¶
Identify the ID of Ceph OSD related to a failed device. For example, use
the Ceph CLI in the rook-ceph-tools Pod:
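For example (a sketch; the exact commands depend on how the failed device is identified in your environment):
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph device ls | grep <nodeName>
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd metadata <osdId> | grep -E 'devices|hostname'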
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section:
Find and capture the metadataDevice path to reuse it during
re-creation of the Ceph OSD.
Remove the required device:
Example configuration snippet:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceName> # remove the entire item from the storageDevices list
          # fullPath: <deviceByPath> if device is specified using by-path instead of name
          config:
            deviceClass: hdd
            metadataDevice: /dev/bluedb/meta_1
In the example above, <machineName> is the name of machine that relates
to the node on which the device <deviceName> or <deviceByPath> must
be replaced.
Create a KaaSCephOperationRequest CR template and save it as
replace-failed-osd-<machineName>-<osdID>-request.yaml:
<machineName> - name of the machine on which the device is being
replaced, for example, worker-1
<nodeName> - underlying node name of the machine, for example,
kaas-node-5a74b669-7e53-4535-aabd-5b509ec844af
<osdId> - Ceph OSD ID for the device being replaced, for example,
1
<dataDeviceByPath> - by-path of the device placed on the node,
for example, /dev/disk/by-path/pci-0000:00:1t.9
<dataDevice> - name of the device placed on the node, for example,
/dev/vde
<metadataDevice> - metadata name of the device placed on the node,
for example, /dev/vdf
<metadataDeviceByPath> - metadata by-path of the device placed
on the node, for example, /dev/disk/by-path/pci-0000:00:12.0
Note
The partitions that are manually created or configured using the
BareMetalHostProfile object can be removed only manually, or during a
complete metadata disk removal, or during the Machine object removal
or re-provisioning.
Verify that the cleanUpMap section matches the required removal and
wait for the ApproveWaiting phase to appear in status:
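For example, assuming <requestName> is the name defined in the template created earlier in this procedure:
kubectl -n <managedClusterProjectName> get kaascephoperationrequest <requestName> -o yaml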
Review the following status fields of the KaaSCephOperationRequest
CR request processing:
status.phase - current state of request processing
status.messages - description of the current phase
status.conditions - full history of request processing before the
current phase
status.removeInfo.issues and status.removeInfo.warnings - error
and warning messages occurred during request processing, if any
Verify that the KaaSCephOperationRequest has been completed.
For example:
status:
  phase: Completed # or CompletedWithWarnings if there are non-critical issues
Re-create a Ceph OSD with the same metadata partition¶
Note
You can spawn Ceph OSD on a raw device, but it must be clean and
without any data or partitions. If you want to add a device that was in use,
also ensure it is raw and clean. To clean up all data and partitions from a
device, refer to official Rook documentation.
If you want to add a Ceph OSD on top of a raw device that already exists
on a node or is hot-plugged, add the required device using the following
guidelines:
You can add a raw device to a node during node deployment.
If a node supports adding devices without node reboot, you can hot plug
a raw device to a node.
If a node does not support adding devices without node reboot, you can
hot plug a raw device during node shutdown. In this case, complete the
following steps:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, add the replaced device with the same
metadataDevice path as on the removed Ceph OSD. For example:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceByID> # Recommended. Add a new device by ID, for example, /dev/disk/by-id/...
          #fullPath: <deviceByPath> # Add a new device by path, for example, /dev/disk/by-path/...
          config:
            deviceClass: hdd
            metadataDevice: /dev/bluedb/meta_1 # Must match the value of the previously removed OSD
Substitute <machineName> with the machine name of the node where the
new device <deviceByID> or <deviceByPath> must be added.
Wait for the replaced disk to apply to the Ceph cluster as a new Ceph OSD.
You can monitor the application state using either the status section
of the KaaSCephCluster CR or in the rook-ceph-tools Pod:
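For example (a sketch; the first command inspects the status section on the management cluster, the second runs in the rook-ceph-tools Pod of the managed cluster):
kubectl -n <managedClusterProjectName> get kcc -o yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd df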
Replace a failed Ceph OSD disk with a metadata device as a device name¶
You can apply the below procedure if a Ceph OSD failed with data disk outage
and the metadata partition is not specified in the BareMetalHostProfile
custom resource (CR). This scenario implies that the Ceph cluster
automatically creates a required metadata logical volume on a desired device.
Remove a Ceph OSD with a metadata device as a device name¶
While editing KaasCephCluster in the nodes section, capture the
metadataDevice path to reuse it during re-creation of the Ceph OSD.
Example of the spec.nodes section:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceName> # remove the entire item from the storageDevices list
          # fullPath: <deviceByPath> if device is specified using by-path instead of name
          config:
            deviceClass: hdd
            metadataDevice: /dev/nvme0n1
In the example above, save the metadataDevice device name
/dev/nvme0n1.
During verification of removeInfo, capture the usedPartition value
of the metadata device located in the deviceMapping.<metadataDevice>
section.
In the example above, capture the following values from the
<metadataDevice> section:
ceph-b0c70c72-8570-4c9d-93e9-51c3ab4dd9f9 - name of the volume group
that contains all metadata partitions on the <metadataDevice> disk
osd-db-ecf64b20-1e07-42ac-a8ee-32ba3c0b7e2f - name of the logical
volume that relates to a failed Ceph OSD
Re-create the metadata partition on the existing metadata disk¶
After you remove the Ceph OSD disk, manually create a separate logical volume
for the metadata partition in an existing volume group on the metadata device:
lvcreate -l 100%FREE -n meta_1 <vgName>
Substitute <vgName> with the name of the volume group captured in the
usedPartition parameter.
Note
If you removed more than one OSD, replace 100%FREE with the
corresponding partition size. For example:
lvcreate -l <partitionSize> -n meta_1 <vgName>
Substitute <partitionSize> with the corresponding value that matches the
size of other partitions placed on the affected metadata drive. To obtain
<partitionSize>, use the output of the lvs command. For example:
16G.
During execution of the lvcreate command, the system asks you to
wipe the found bluestore label on a metadata device. For example:
Re-create the Ceph OSD with the re-created metadata partition¶
Note
You can spawn Ceph OSD on a raw device, but it must be clean and
without any data or partitions. If you want to add a device that was in use,
also ensure it is raw and clean. To clean up all data and partitions from a
device, refer to official Rook documentation.
If you want to add a Ceph OSD on top of a raw device that already exists
on a node or is hot-plugged, add the required device using the following
guidelines:
You can add a raw device to a node during node deployment.
If a node supports adding devices without node reboot, you can hot plug
a raw device to a node.
If a node does not support adding devices without node reboot, you can
hot plug a raw device during node shutdown. In this case, complete the
following steps:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, add the replaced device with the same
metadataDevice path as in the previous Ceph OSD:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - fullPath: <deviceByID> # Recommended since MCC 2.25.0 (17.0.0). Add a new device by-id symlink, for example, /dev/disk/by-id/...
          #name: <deviceByID> # Add a new device by ID, for example, /dev/disk/by-id/...
          #fullPath: <deviceByPath> # Add a new device by path, for example, /dev/disk/by-path/...
          config:
            deviceClass: hdd
            metadataDevice: /dev/<vgName>/meta_1
Substitute <machineName> with the machine name of the node where the
new device <deviceByID> or <deviceByPath> must be added.
Also specify metadataDevice with the path to the logical volume created
during the Re-create the metadata partition on the existing metadata disk procedure.
Wait for the replaced disk to apply to the Ceph cluster as a new Ceph OSD.
You can monitor the application state using either the status section
of the KaaSCephCluster CR or in the rook-ceph-tools Pod:
This section describes the scenario when an underlying metadata device fails
with all related Ceph OSDs. In this case, the only solution is to remove all
Ceph OSDs related to the failed metadata device, then attach a device that
will be used as a new metadata device, and re-create all affected Ceph OSDs.
Caution
If you used BareMetalHostProfile to automatically partition
the failed device, you must create a manual partition of the new device
because BareMetalHostProfile does not support hot-load changes and
creates an automatic device partition only during node provisioning.
Remove failed Ceph OSDs with the affected metadata device¶
Save the KaaSCephCluster specification of all Ceph OSDs affected by the
failed metadata device to re-use this specification during re-creation of
Ceph OSDs after disk replacement.
Identify Ceph OSD IDs related to the failed metadata device, for example,
using Ceph CLI in the rook-ceph-tools Pod:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, remove all storageDevices items that relate
to the failed metadata device. For example:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceName1> # remove the entire item from the storageDevices list
          # fullPath: <deviceByPath> if device is specified using symlink instead of name
          config:
            deviceClass: hdd
            metadataDevice: <metadataDevice>
        - name: <deviceName2> # remove the entire item from the storageDevices list
          config:
            deviceClass: hdd
            metadataDevice: <metadataDevice>
        - name: <deviceName3> # remove the entire item from the storageDevices list
          config:
            deviceClass: hdd
            metadataDevice: <metadataDevice>
        ...
In the example above, <machineName> is the machine name of the node
where the metadata device <metadataDevice> must be replaced.
Create a KaaSCephOperationRequest CR template and save it as
replace-failed-meta-<machineName>-<metadataDevice>-request.yaml:
Partition the replaced metadata device by N logical volumes (LVs), where N
is the number of Ceph OSDs previously located on a failed metadata device.
Calculate the new metadata LV percentage of used volume group capacity
using the 100/N formula.
Log in to the node with the replaced metadata disk.
Create an LVM physical volume atop the replaced metadata device:
pvcreate <metadataDisk>
Substitute <metadataDisk> with the replaced metadata device.
Create an LVM volume group atop of the physical volume:
vgcreate bluedb <metadataDisk>
Substitute <metadataDisk> with the replaced metadata device.
Create N LVM logical volumes with the calculated capacity per each volume:
lvcreate -l <X>%VG -n meta_<i> bluedb
Substitute <X> with the result of the 100/N formula
and <i> with the current number of metadata partitions.
As a result, the replaced metadata device will have N LVM paths, for example,
/dev/bluedb/meta_1.
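For example, for three Ceph OSDs previously located on the failed metadata device (N=3, so each logical volume gets roughly 33% of the volume group), the sequence could look as follows; the device name is illustrative only:
pvcreate /dev/nvme0n1
vgcreate bluedb /dev/nvme0n1
lvcreate -l 33%VG -n meta_1 bluedb
lvcreate -l 33%VG -n meta_2 bluedb
lvcreate -l 33%VG -n meta_3 bluedb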
Re-create a Ceph OSD on the replaced metadata device¶
Note
You can spawn Ceph OSD on a raw device, but it must be clean and
without any data or partitions. If you want to add a device that was in use,
also ensure it is raw and clean. To clean up all data and partitions from a
device, refer to official Rook documentation.
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, add the cleaned Ceph OSD device with the replaced
LVM paths of the metadata device from previous steps. For example:
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceByID-1> # Recommended. Add the new device by ID /dev/disk/by-id/...
          #fullPath: <deviceByPath-1> # Add a new device by path /dev/disk/by-path/...
          config:
            deviceClass: hdd
            metadataDevice: /dev/<vgName>/<lvName-1>
        - name: <deviceByID-2> # Recommended. Add the new device by ID /dev/disk/by-id/...
          #fullPath: <deviceByPath-2> # Add a new device by path /dev/disk/by-path/...
          config:
            deviceClass: hdd
            metadataDevice: /dev/<vgName>/<lvName-2>
        - name: <deviceByID-3> # Recommended. Add the new device by ID /dev/disk/by-id/...
          #fullPath: <deviceByPath-3> # Add a new device by path /dev/disk/by-path/...
          config:
            deviceClass: hdd
            metadataDevice: /dev/<vgName>/<lvName-3>
Substitute <machineName> with the machine name of the node where the
metadata device has been replaced.
Add all data devices for re-created Ceph OSDs and specify
metadataDevice that is the path to the previously created logical
volume. Substitute <vgName> with a volume group name that contains N
logical volumes <lvName-i>.
Wait for the re-created Ceph OSDs to apply to the Ceph cluster.
You can monitor the application state using either the status section
of the KaaSCephCluster CR or in the rook-ceph-tools Pod:
After a physical node replacement, you can use the Ceph LCM API to redeploy
failed Ceph nodes. The common flow of replacing a failed Ceph node is as
follows:
Remove the obsolete Ceph node from the Ceph cluster.
Add a new Ceph node with the same configuration to the Ceph cluster.
Deploy a new Ceph node after removal of a failed one¶
Note
You can spawn Ceph OSD on a raw device, but it must be clean and
without any data or partitions. If you want to add a device that was in use,
also ensure it is raw and clean. To clean up all data and partitions from a
device, refer to official Rook documentation.
Open the KaasCephCluster CR of a managed cluster for editing:
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, add a new device:
spec:
  cephClusterSpec:
    nodes:
      <machineName>: # add new configuration for replaced Ceph node
        storageDevices:
        - fullPath: <deviceByID> # Recommended since MCC 2.25.0 (17.0.0), non-wwn by-id symlink
          # name: <deviceByID> # Prior MCC 2.25.0, non-wwn by-id symlink
          # fullPath: <deviceByPath> # if device is supposed to be added with by-path
          config:
            deviceClass: hdd
        ...
Substitute <machineName> with the machine name of the replaced node and
configure it as required.
Warning
Since MCC 2.25.0 (17.0.0), Mirantis highly recommends using
non-wwn by-id symlinks only to specify storage devices in
the storageDevices list.
Verify that all Ceph daemons from the replaced node have appeared on the
Ceph cluster and are in and up. The fullClusterInfo section
should not contain any issues.
status:
  fullClusterInfo:
    clusterStatus:
      ceph:
        health: HEALTH_OK
      ...
    daemonStatus:
      mgr:
        running: a is active mgr
        status: Ok
      mon:
        running: '3/3 mons running: [a b c] in quorum'
        status: Ok
      osd:
        running: '3/3 running: 3 up, 3 in'
        status: Ok
You may need to manually remove a Ceph OSD, for example, in the following
cases:
If you have removed a device or node from the KaaSCephCluster spec.cephClusterSpec.nodes or spec.cephClusterSpec.nodeGroups section
with manageOsds set to false.
If you do not want to rely on Ceph LCM operations and want to manage the Ceph
OSDs life cycle manually.
To safely remove one or multiple Ceph OSDs from a Ceph cluster, perform the
following procedure for each Ceph OSD one by one.
Warning
The procedure presupposes the Ceph OSD disk or logical volumes
partition cleanup.
To remove a Ceph OSD manually:
Edit the KaaSCephCluster resource on a management cluster:
Substitute <mgmtKubeconfig> with the management cluster kubeconfig
and <managedClusterProjectName> with the project name of the managed
cluster.
In the spec.cephClusterSpec.nodes section, remove the required
storageDevices item of the corresponding node spec. If after removal
storageDevices becomes empty and the node spec has no roles specified,
also remove the node spec.
Obtain kubeconfig of the managed cluster and provide it as an
environment variable:
export KUBECONFIG=<pathToManagedKubeconfig>
Verify that all Ceph OSDs are up and in, the Ceph cluster is
healthy, and no rebalance or recovery is in progress:
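For example, from the rook-ceph-tools Pod of the managed cluster, verify that the health is HEALTH_OK and no recovery or rebalance operations are listed:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s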
If you are using multiple Ceph OSDs per device or metadata
device, make sure that you can clean up the entire disk. Otherwise,
clean up only the logical volume partitions of the corresponding volume group
by running lvremove <lvpartition_uuid> from within any Ceph OSD pod that
belongs to the same host as the removed Ceph OSD.
Delete the rook-ceph/rook-ceph-osd-<ID> deployment previously scaled to
0 replicas:
kubectl -n rook-ceph delete deploy rook-ceph-osd-<ID>
Substitute <ID> with the number of the removed Ceph OSD.
Scale the rook-ceph/rook-ceph-operator deployment to 1 replica and
wait for the orchestration to complete:
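For example:
kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas=1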
Migrate Ceph cluster to address storage devices using by-id¶
The by-id identifier is the only persistent device identifier for a Ceph
cluster that remains stable after the cluster upgrade or any other maintenance.
Therefore, Mirantis recommends using device by-id symlinks rather than
device names or by-path symlinks.
Container Cloud uses the device by-id identifier as the default method
of addressing the underlying devices of Ceph OSDs. Thus, you should migrate
all existing Ceph clusters, which are still utilizing the device names or
device by-path symlinks, to the by-id format.
This section explains how to configure the KaaSCephCluster specification
to use the by-id symlinks instead of disk names and by-path
identifiers as the default method of addressing storage devices.
Note
Mirantis recommends avoiding the use of wwn symlinks as
by-id identifiers because they are not persistent and may be discovered
inconsistently during node boot.
Besides migrating to by-id, consider using the fullPath field for the
by-id symlinks configuration, instead of the name field in the
spec.cephClusterSpec.nodes.storageDevices section. This approach allows for
clear understanding of field namings and their use cases.
Note
MOSK enables you to use fullPath for the
by-id symlinks since MCC 2.25.0 (Cluster release 17.0.0). For earlier
product versions, use the name field instead.
Migrate the Ceph nodes section to by-id identifiers¶
Available since MCC 2.25.0 (Cluster release 17.0.0)
Make sure that your managed cluster is not currently running an upgrade or
any other maintenance process.
Obtain the list of all KaasCephCluster storage devices that use disk
names or disk by-path as identifiers of Ceph node storage devices:
kubectl -n <managedClusterProject> get kcc -o yaml
Substitute <managedClusterProject> with the corresponding managed
cluster namespace.
Verify the items from the storageDevices sections to be moved to
the by-id symlinks. The list of the items to migrate includes:
A disk name in the name field. For example, sdc, nvme3n1,
and so on.
A disk /dev/disk/by-path symlink in the fullPath field.
For example, /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2.
A disk /dev/disk/by-id symlink in the name field.
Note
This condition applies since MCC 2.25.0 (Cluster release
17.0.0).
A disk /dev/disk/by-id/wwn symlink, which is programmatically
calculated at boot.
For example, /dev/disk/by-id/wwn-0x26d546263bd312b8.
For the example above, we have to migrate both items of
managed-worker-1, both items of managed-worker-2, and the first
item of managed-worker-3. The second item of managed-worker-3
has already been configured in the required format, therefore, we are
leaving it as is.
To migrate all affected storageDevices items to by-id symlinks,
open the KaaSCephCluster custom resource for editing:
kubectl -n <managedClusterProject> edit kcc
For each affected node from the spec.cephClusterSpec.nodes section,
obtain a corresponding status.providerStatus.hardware.storage section
from the Machine custom resource:
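For example (a sketch; the Machine resource resides in the managed cluster namespace on the management cluster):
kubectl -n <managedClusterProject> get machine <machineName> -o yaml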
For each affected storageDevices item from the considered Machine,
obtain a correct by-id symlink from
status.providerStatus.hardware.storage.byIDs. Such by-id symlink
must contain status.providerStatus.hardware.storage.serialNumber and
must not contain wwn.
For managed-worker-1, according to the example output above, we can use
the following by-id symlinks:
Replace the first item of storageDevices that contains name: sdc with
fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc.
Replace the second item of storageDevices that contains
fullPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2 with
fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_26d546263bd312b8.
Replace all affected storageDevices items in KaaSCephCluster
with the obtained ones.
Note
Prior to MCC 2.25.0 (Cluster release 17.0.0), place the by-id
symlinks in the name field instead of the fullPath field.
The resulting example of the storage device identifier migration:
Migrate the Ceph nodeGroups section to by-id identifiers¶
Available since MCC 2.25.0 (Cluster release 17.0.0)
Besides the nodes section, your cluster may contain the nodeGroups
section specified with disk names instead of by-id symlinks. Unlike the
in-place replacement of storage device identifiers in the nodes section,
nodeGroups requires a different approach because the same spec section
is reused for different nodes.
To migrate nodeGroups storage devices, use the deviceLabels section to
assign the same labels to the corresponding disks on different nodes and then
reference these labels in the node groups. For the deviceLabels section
specification, refer to Ceph advanced configuration: extraOpts.
The following procedure describes how to keep the nodeGroups section
but use unique by-id identifiers instead of disk names.
To migrate the Ceph nodeGroups section to by-id identifiers:
Make sure that your managed cluster is not currently running an upgrade or
any other maintenance process.
Obtain the list of all KaasCephCluster storage devices that use disk
names or disk by-path as identifiers of Ceph node group storage
devices:
kubectl -n <managedClusterProject> get kcc -o yaml
Substitute <managedClusterProject> with the corresponding managed
cluster namespace.
Output example of the KaaSCephCluster nodeGroups section with disk
names used as identifiers:
Verify the items from the storageDevices sections to be moved to
by-id symlinks. The list of the items to migrate includes:
A disk name in the name field. For example, sdc, nvme3n1,
and so on.
A disk /dev/disk/by-path symlink in the fullPath field.
For example, /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2.
A disk /dev/disk/by-id symlink in the name field.
Note
This condition applies since MCC 2.25.0 (Cluster release
17.0.0).
A disk /dev/disk/by-id/wwn symlink, which is programmatically
calculated at boot.
For example, /dev/disk/by-id/wwn-0x26d546263bd312b8.
All storageDevice sections in the example above contain disk names
in the name field. Therefore, you need to replace them with by-id
symlinks.
Open the KaaSCephCluster custom resource for editing to start
migration of all affected storageDevices items to by-id symlinks:
kubectl -n <managedClusterProject> edit kcc
For each impacted Ceph node group in the nodeGroups section, add disk
labels to the deviceLabels section for every affected storage device
of the nodes listed in the nodes field of that node group.
Verify that these disk labels equal the by-id symlinks of the
corresponding disks.
For example, if the node group rack-1 contains two nodes node-1 and
node-2 and spec contains three items with name, you need
to obtain proper by-id symlinks for disk names from both nodes and
write them down with the same disk labels. The following example contains
the labels for by-id symlinks of nvme0n1, nvme1n1, and
nvme2n1 disks from node-1 and node-2 correspondingly:
Keep device labels repeatable for all nodes from the node group.
This allows for specifying unified spec for different by-id
symlinks of different nodes.
Example of the full deviceLabels section for the nodeGroups section:
For each affected node group in the nodeGroups section, replace
the field that contains the insufficient disk identifier with the devLabel
field that references the disk label from the deviceLabels section.
For the example above, the updated nodeGroups section looks as follows:
You can start using a storage device only after a corresponding Machine
becomes ready and accessible. Thus, KaaSCephCluster can be created only
after all machines receive the status.providerStatus.hardware.storage
configuration containing all required device by-id symlinks.
Obtain the item from the byIDs list from the
status.providerStatus.hardware.storage section that
contains serialNumber and does not contain wwn as a bus ID.
In the example above, for the disk with the /dev/sdc name, you can use
the /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2e52abb48862dbdc symlink as
a persistent identifier of the storage device because it contains
the 2e52abb48862dbdc serial number and does not contain wwn.
Note
Do not rely on the byID field only. This field may contain
a /dev/disk/by-id/wwn symlink that cannot be considered
a persistent identifier of a storage device.
This section describes how to increase the overall storage size for all Ceph
pools of the same device class: hdd, ssd, or nvme.
The procedure presupposes adding a new Ceph OSD. The overall storage size for
the required device class automatically increases once the Ceph OSD becomes
available in the Ceph cluster.
To increase the overall storage size for a device class:
Identify the current storage size for the required device class:
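For example, from the rook-ceph-tools Pod, ceph df reports the total, used, and available capacity per device class:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df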
This document describes how to migrate a Ceph Monitor daemon from one node to
another without changing the general number of Ceph Monitors in the cluster.
In the Ceph Controller concept, migration of a Ceph Monitor means manually
removing it from one node and adding it to another.
Consider the following exemplary placement scheme of Ceph Monitors in the
nodes spec of the KaaSCephCluster CR:
nodes:
  node-1:
    roles:
    - mon
    - mgr
  node-2:
    roles:
    - mgr
Using the example above, if you want to move the Ceph Monitor from node-1
to node-2 without changing the number of Ceph Monitors, the roles table
of the nodes spec must result as follows:
nodes:
  node-1:
    roles:
    - mgr
  node-2:
    roles:
    - mgr
    - mon
However, due to the Rook limitation related to Kubernetes architecture, once
you move the Ceph Monitor through the KaaSCephCluster CR, changes will not
apply automatically. This is caused by the following Rook behavior:
Rook creates Ceph Monitor resources as deployments with nodeSelector,
which binds Ceph Monitor pods to a requested node.
Rook does not recreate new Ceph Monitors with the new node placement if the
current mon quorum works.
Therefore, to move a Ceph Monitor to another node, you must also manually apply
the new Ceph Monitors placement to the Ceph cluster as described below.
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes spec of the KaaSCephCluster CR, change the mon
roles placement without changing the total number of mon roles. For
details, see the example above. Note the nodes on which the mon roles
have been removed.
Wait until the corresponding MiraCeph resource is updated with the new
nodes spec:
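For example (a sketch; the MiraCeph resource resides on the managed cluster):
kubectl --kubeconfig <kubeconfig> get miraceph --all-namespaces -o yaml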
Substitute <kubeconfig> with the Container Cloud cluster kubeconfig
that hosts the required Ceph cluster.
In the MiraCeph resource, determine which node has been changed in the
nodes spec. Save the name value of the node where the mon role
has been removed for further usage.
Once done, Rook removes the obsolete Ceph Monitor from the node and creates
a new one on the specified node with a new letter. For example, if the a,
b, and c Ceph Monitors were in quorum and mon-c was obsolete, Rook
removes mon-c and creates mon-d. In this case, the new quorum includes
the a, b, and d Ceph Monitors.
Migrate a Ceph Monitor before machine replacement¶
Available since MCC 2.25.0 (Cluster release 17.0.0)
This document describes how to migrate a Ceph Monitor to another machine
on baremetal-based clusters before node replacement as described in
Delete a cluster machine using web UI.
Warning
Remove the Ceph Monitor role before the machine removal.
Make sure that the Ceph cluster always has an odd number of
Ceph Monitors.
The procedure of a Ceph Monitor migration assumes that you temporarily move
the Ceph Manager/Monitor to a worker machine. After a node replacement, we
recommend migrating the Ceph Manager/Monitor to the new manager machine.
Ceph Controller enables you to deploy RADOS Gateway (RGW) Object Storage
instances and automatically manage its resources such as users and buckets.
Ceph Object Storage has an integration with OpenStack Object Storage (Swift)
in MOSK.
To enable the RGW Object Storage:
Open the KaasCephCluster CR of a managed cluster for editing:
Substitute <managedClusterProjectName> with a corresponding value.
Using the following table, update the cephClusterSpec.objectStorage.rgw
section specification as required:
Caution
Since MCC 2.24.0 (Cluster releases 15.0.1 and 14.0.1),
explicitly specify the deviceClass parameter for dataPool and
metadataPool.
Warning
Since Container Cloud 2.6.0, the spec.rgw section
is deprecated and its parameters are moved under objectStorage.rgw.
If you continue using spec.rgw, it is automatically translated
into objectStorage.rgw during the Container Cloud update to 2.6.0.
We strongly recommend changing spec.rgw to objectStorage.rgw
in all KaaSCephCluster CRs before spec.rgw becomes unsupported
and is deleted.
Mutually exclusive with the zone parameter. Object storage data pool
spec that should only contain replicated or erasureCoded and
failureDomain parameters. The failureDomain parameter may be
set to osd or host, defining the failure domain across which
the data will be spread. For dataPool, Mirantis recommends using an
erasureCoded pool. For details, see
Rook documentation: Erasure coding.
For example:
Mutually exclusive with the zone parameter. Object storage metadata
pool spec that should only contain replicated and failureDomain
parameters. The failureDomain parameter may be set to osd or
host, defining the failure domain across which the data will be
spread. Can use only replicated settings. For example:
where replicated.size is the number of full copies of data on
multiple nodes.
Warning
When using the non-recommended Ceph pools replicated.size of
less than 3, Ceph OSD removal cannot be performed. The minimal replica
size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is
1, and if replicated.size is 3, then the minimal replica size
is 2. A replica size of 1 allows Ceph to have PGs with only one
Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED
health warning that blocks Ceph OSD removal. Mirantis recommends setting
replicated.size to 3 for each Ceph pool.
gateway
The gateway settings corresponding to the rgw daemon settings.
Includes the following parameters:
port - the port on which the Ceph RGW service will be listening on
HTTP.
securePort - the port on which the Ceph RGW service will be
listening on HTTPS.
instances - the number of pods in the Ceph RGW ReplicaSet. If
allNodes is set to true, a DaemonSet is created instead.
Note
Mirantis recommends using 2 instances for Ceph Object Storage.
allNodes - defines whether to start the Ceph RGW pods as a
DaemonSet on all nodes. The instances parameter is ignored if
allNodes is set to true.
Defines whether to delete the data and metadata pools in the rgw
section if the object storage is deleted. Set this parameter to true
if you need to store data even if the object storage is deleted.
However, Mirantis recommends setting this parameter to false.
objectUsers and buckets
Optional. To create new Ceph RGW resources, such as buckets or users,
specify the following keys. Ceph Controller will automatically create
the specified object storage users and buckets in the Ceph cluster.
objectUsers - a list of user specifications to create for object
storage. Contains the following fields:
name - a user name to create.
displayName - the Ceph user name to display.
capabilities - user capabilities:
user - admin capabilities to read/write Ceph Object Store
users.
bucket - admin capabilities to read/write Ceph Object Store
buckets.
metadata - admin capabilities to read/write Ceph Object Store
metadata.
usage - admin capabilities to read/write Ceph Object Store
usage.
zone - admin capabilities to read/write Ceph Object Store
zones.
users - a list of strings that contain user names to create for
object storage.
Note
This field is deprecated. Use objectUsers
instead. If users is specified, it will be automatically
transformed to the objectUsers section.
buckets - a list of strings that contain bucket names to create
for object storage.
zone
Optional. Mutually exclusive with metadataPool and dataPool.
Defines the Ceph Multisite zone where the object storage must be placed.
Includes the name parameter that must be set to one of the zones
items. For details, see Enable multisite for Ceph RGW Object Storage.
Optional. Available since MOSK 25.1. Flag to determine
that a TLS certificate for accessing the Ceph RGW endpoint is used but
not exposed in spec. For example:
The operator must manually provide TLS configuration using the
rgw-ssl-certificate secret in the rook-ceph namespace of the
managed cluster. The secret object must have the following structure:
When removing an already existing SSLCert block, no additional actions
are required, because this block uses the same rgw-ssl-certificate secret
in the rook-ceph namespace.
When adding a new secret directly without exposing it in spec, the following
rules apply:
cert - base64 representation of a file with the server TLS key,
server TLS cert, and cacert.
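For example, assuming the server TLS key, certificate, and CA certificate are concatenated into a single PEM file, the secret can be created on the managed cluster as follows (a sketch, not the authoritative format):
kubectl -n rook-ceph create secret generic rgw-ssl-certificate --from-file=cert=<combined-pem-file>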
The Ceph multisite feature allows object storage to replicate its data over
multiple Ceph clusters. Using multisite, such object storage is independent and
isolated from another object storage in the cluster. Only the multi-zone
multisite setup is currently supported. For more details, see
Ceph documentation: Multisite.
List of realms to use, represents the realm namespaces. Includes the
following parameters:
name - the realm name.
pullEndpoint - optional, required only when the master zone is in
a different storage cluster. The endpoint, access key, and system key
of the system user from the realm to pull from. Includes the
following parameters:
endpoint - the endpoint of the master zone in the master zone
group.
accessKey - the access key of the system user from the realm to
pull from.
secretKey - the system key of the system user from the realm to
pull from.
zoneGroups (Technical Preview)
The list of zone groups for realms. Includes the following parameters:
name - the zone group name.
realmName - the realm namespace name to which the zone group
belongs to.
zones (Technical Preview)
The list of zones used within one zone group. Includes the following
parameters:
name - the zone name.
metadataPool - the settings used to create the Object Storage
metadata pools. Must use replication. For details, see
Pool parameters.
dataPool - the settings to create the Object Storage data pool.
Can use replication or erasure coding. For details, see
Pool parameters.
zoneGroupName - the zone group name.
endpointsForZone - available since MOSK 24.2.
The list of all endpoints in the zone group.
If you use ingress proxy for RGW, the list of endpoints must contain
that FQDN/IP address to access RGW. By default, if no ingress proxy
is used, the list of endpoints is set to the IP address of the RGW
external service. Endpoints must follow the HTTP URL format.
Caution
The multisite configuration requires master and secondary zones
to be reachable from each other.
Select from the following options:
If you do not need to replicate data from a different storage cluster,
and the current cluster represents the master zone, modify the current
objectStorage section to use the multisite mode:
Configure the zone RADOS Gateway (RGW) parameter by setting it to
the RGW Object Storage name.
Note
Leave dataPool and metadataPool empty. These
parameters are ignored because the zone block in the multisite
configuration specifies the pools parameters. Other RGW parameters
do not require changes.
Create the multiSite section where the names of realm, zone group,
and zone must match the current RGW name.
Since MCC 2.27.0 (Cluster release 17.2.0), specify the
endpointsForZone parameter according to your configuration:
If you use ingress proxy, which is defined in the
spec.cephClusterSpec.ingress section, add the FQDN endpoint.
If you do not use any ingress proxy and access the RGW API using the
default RGW external service, add the IP address of the external
service or leave this parameter empty.
The following example illustrates a complete objectStorage section:
If you use a different storage cluster, and its object storage data must
be replicated, specify the realm and zone group names along with the
pullEndpoint parameter. Additionally, specify the endpoint, access
key, and system keys of the system user of the realm from which you need
to replicate data. For details, see the step 2 of this procedure.
To obtain the endpoint of the cluster zone that must be replicated, run
the following command by specifying the zone group name of the required
master zone on the master zone side:
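A sketch of such a command, assuming radosgw-admin is run from the ceph-tools Pod on the master zone side:
radosgw-admin zonegroup get --rgw-zonegroup=<zoneGroupName>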
Mirantis recommends using the same metadataPool and
dataPool settings as you use in the master zone.
Configure the zone RGW parameter and leave dataPool
and metadataPool empty. These parameters are ignored because
the zone section in the multisite configuration specifies the pools
parameters.
Also, you can split the RGW daemons into daemons serving clients and daemons
running synchronization. To enable this option, specify
splitDaemonForMultisiteTrafficSync in the gateway section.
On the ceph-tools pod, verify the multisite status:
radosgw-admin sync status
Once done, ceph-operator will create the required resources and Rook will
handle the multisite configuration. For details, see: Rook documentation:
Object Multisite.
Rook does not handle multisite configuration changes and cleanup.
Therefore, once you enable multisite for Ceph RGW Object Storage, perform
these operations manually in the ceph-tools pod. For details, see
Rook documentation: Multisite cleanup.
If automatic update of zone group hostnames is disabled, manually specify all
required hostnames and update the zone group. In the ceph-tools pod, run
the following script:
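A possible sketch of such a script, based on standard radosgw-admin zone group operations; this is an assumption, adjust it to your environment:
radosgw-admin zonegroup get --rgw-zonegroup=<zoneGroupName> > zonegroup.json
# edit the hostnames list in zonegroup.json as required
radosgw-admin zonegroup set --rgw-zonegroup=<zoneGroupName> --infile=zonegroup.json
radosgw-admin period update --commit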
The section describes how to create, access, and remove Ceph RADOS Block
Device (RBD) or Ceph File System (CephFS) clients and RADOS Gateway (RGW)
users.
The KaaSCephCluster resource allows managing custom Ceph RADOS Block
Device (RBD) or Ceph File System (CephFS) clients. This section describes
how to create, access, and remove Ceph RBD or CephFS clients.
For all supported parameters of Ceph clients, refer to Clients parameters.
where mon_host is the comma-separated list of IP addresses of the current
Ceph Monitors with port 6789. For example,
10.10.0.145:6789,10.10.0.153:6789,10.10.0.235:6789.
/etc/ceph/ceph.client.<clientName>.keyring:
[client.<clientName>]
key = <cephClientCredentials>
<clientName> is a client name set in
spec.cephClusterSpec.clients the KaaSCephCluster resource,
for example, rbd-client
<cephClientCredentials> are the client credentials obtained in the
previous steps. For example,
AQAGHDNjxWYXJhAAjafCn3EtC6KgzgI1x4XDlg==
If the client caps parameters contain mon: allow r, verify the
client access using the following command:
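For example, assuming the ceph.conf and keyring files created above are in place on the client host:
ceph --id <clientName> -s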
The KaaSCephCluster resource allows managing custom Ceph Object Storage
users. This section describes how to create, access, and remove Ceph Object
Storage users.
Substitute <objstoreName> with a Ceph Object Storage name and
<username> with a Ceph Object Storage user name.
Use secretName and secretNamespace to access the Ceph Object
Storage user credentials from a managed cluster. The secret contains Amazon
S3 access and secret keys.
This section describes how to verify the components of a Ceph cluster after
deployment. For troubleshooting, verify Ceph Controller and Rook logs as
described in Verify Ceph Controller and Rook.
To confirm that all Ceph components including mon, mgr, osd, and
rgw have joined your cluster properly, analyze the logs for each pod and
verify the Ceph status:
To ensure that rook-discover is running properly, verify if the
local-device configmap has been created for each Ceph node specified
in the cluster configuration:
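For example (a sketch; the exact ConfigMap naming may vary between releases):
kubectl -n rook-ceph get configmap | grep local-device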
Verifying Ceph cluster state is an entry point for issues investigation.
This section describes how to verify Ceph state using the KaaSCephCluster,
MiraCeph, and MiraCephLog resources.
Note
Before MOSK 25.1, use MiraCephLog
instead of MiraCephHealth.
To verify the state of a Ceph cluster, Ceph Controller provides special
sections in KaaSCephCluster.status. The resource contains information about
the state of the Ceph cluster components, their health, and potentially
problematic components.
To verify the Ceph cluster state from a managed cluster:
Obtain kubeconfig of a managed cluster and provide it as an environment
variable:
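For example:
export KUBECONFIG=<pathToManagedKubeconfig>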
KaaSCephCluster.status allows you to learn the current health of a Ceph
cluster and identify potentially problematic components. This section describes
KaaSCephCluster.status and its fields. To view KaaSCephCluster.status,
perform the steps described in Verify Ceph cluster state through CLI.
Available since MCC 2.25.0 (Cluster release 17.0.0).
Describes the current state of KaasCephCluster and reflects any
errors during object reconciliation, including spec generation,
object creation on a managed cluster, and status retrieval.
miraCephInfo
Describes the current phase of Ceph spec reconciliation and spec
validation result. The miraCephInfo section contains information
about the current validation and reconcile of the KaaSCephCluster
and MiraCeph resources. It helps to understand whether the specified
configuration is valid to create a Ceph cluster and informs about the
current phase of applying this configuration. For miraCephInfo
fields description, see KaaSCephCluster.status miraCephInfo specification.
shortClusterInfo
Represents a short version of fullClusterInfo and contains a summary
of the Ceph cluster state collecting process and potential issues. It
helps to quickly verify whether fullClusterInfo is up to date and whether
any errors occurred while collecting the information. For
shortClusterInfo fields description, see KaaSCephCluster.status shortClusterInfo specification.
fullClusterInfo
Contains a complete Ceph cluster information including cluster, Ceph
resources, and daemons health. It helps to reveal the potentially
problematic components. For fullClusterInfo fields description, see
KaaSCephCluster.status fullClusterInfo specification.
miraCephSecretsInfo
Contains information about secrets of the managed cluster that are used
in the Ceph cluster, such as keyrings, Ceph clients, RADOS Gateway user
credentials, and so on. For miraCephSecretsInfo fields description, see
KaaSCephCluster.status miraCephSecretsInfo specification.
The following tables describe all sections of KaaSCephCluster.status.
Contains the current phase of handling the applied Ceph cluster spec.
Possible values: Creating, Deploying, Validation, Ready,
Deleting, or Failed.
message
Contains a detailed description of the current phase or an error message
if the phase is Failed.
validation
Contains the KaaSCephCluster/MiraCeph spec validation result
(Succeed or Failed) with a list of messages, if any. The
validation section includes the following fields:
validation:
  result: Succeed or Failed
  messages: ["error", "messages", "list"]
General information from Rook about the Ceph cluster health and current
state. The clusterStatus field contains the following fields:
clusterStatus:
  state: <rook ceph cluster common status>
  phase: <rook ceph cluster spec reconcile phase>
  message: <rook ceph cluster phase details>
  conditions: <history of rook ceph cluster reconcile steps>
  ceph: <ceph cluster health>
  storage:
    deviceClasses: <list of used device classes in ceph cluster>
  version:
    image: <ceph image used in ceph cluster>
    version: <ceph version of ceph cluster>
operatorStatus
Status of the Rook Ceph Operator pod: Ok or Not running.
daemonsStatus
Map of statuses for each Ceph cluster daemon type. Indicates the
expected and actual number of Ceph daemons on the cluster. Available
daemon types are: mgr, mon, osd, and rgw. The
daemonsStatus field contains the following fields:
daemonsStatus:
  <daemonType>:
    status: <daemons status>
    running: <number of running daemons with details>
For example:
daemonsStatus:
  mgr:
    running: a is active mgr ([] standBy)
    status: Ok
  mon:
    running: '3/3 mons running: [a c d] in quorum'
    status: Ok
  osd:
    running: '4/4 running: 4 up, 4 in'
    status: Ok
  rgw:
    running: 2/2 running ([openstack.store.a openstack.store.b])
    status: Ok
blockStorageStatus
State of the Ceph cluster block storage resources. Includes the
following fields:
pools - status map for each CephBlockPool resource. The map
includes the following fields:
pools:
  <cephBlockPoolName>:
    present: <flag whether desired pool is present in ceph cluster>
    status: <rook ceph block pool resource status>
clients - status map for each Ceph client resource. The map
includes the following fields:
Verbose details of the Ceph cluster state. cephDetails includes the
following fields:
diskUsage - the used, available, and total storage size for each
deviceClass and pool.
cephDetails:
  diskUsage:
    deviceClass:
      <deviceClass>:
        # The amount of raw storage consumed by user data (excluding bluestore database).
        bytesUsed: "<number>"
        # The amount of free space available in the cluster.
        bytesAvailable: "<number>"
        # The amount of storage capacity managed by the cluster.
        bytesTotal: "<number>"
    pools:
      <poolName>:
        # The space allocated for a pool over all OSDs. This includes replication,
        # allocation granularity, and erasure-coding overhead. Compression savings
        # and object content gaps are also taken into account. BlueStore database
        # is not included in this amount.
        bytesUsed: "<number>"
        # The notional percentage of storage used per pool.
        usedPercentage: "<number>"
        # Number calculated with the formula: bytesTotal - bytesUsed.
        bytesAvailable: "<number>"
        # An estimate of the notional amount of data that can be written to this pool.
        bytesTotal: "<number>"
cephDeviceMapping - a key-value mapping of which node contains
which Ceph OSD and which Ceph OSD uses which disk.
In MCC 2.24.2 (Cluster release 15.0.1), cephDeviceMapping
is removed because its large size can potentially exceed the Kubernetes
1.5 MB quota.
cephCSIPluginDaemonsStatus
Contains information, similar to the daemonsStatus format, for each
Ceph CSI plugin deployed in the Ceph cluster: rbd and, if enabled,
cephfs.
The cephCSIPluginDaemonsStatus field contains the following fields:
cephCSIPluginDaemonsStatus:
  <csiPlugin>:
    running: <number of running daemons with details>
    status: <csi plugin status>
For example:
cephCSIPluginDaemonsStatus:
  csi-rbdplugin:
    running: 1/3 running
    status: Some csi-rbdplugin daemons are not ready
  csi-cephfsplugin:
    running: 3/3 running
    status: Ok
KaaSCephCluster.status miraCephSecretsInfo specification
Available since MCC 2.23.1 (Cluster release 12.7.0)
Field
Description
state
Current state of the secret collector on the Ceph cluster:
Ready - secrets information is collected successfully
Failed - secrets information fails to be collected
lastSecretCheck
DateTime when the Ceph cluster secrets were verified last time.
lastSecretUpdate
DateTime when the Ceph cluster secrets were updated last time.
secretsInfo
List of secrets for Ceph clients and RADOS Gateway users:
clientSecrets - details on secrets for Ceph clients
rgwUserSecrets - details on secrets for Ceph RADOS Gateway users
The web UI capabilities for adding and managing a Ceph cluster are limited
and lack flexibility in defining Ceph cluster specifications.
For example, if an error occurs while adding a Ceph cluster using the
web UI, usually you can address it only through the CLI.
The web UI functionality for managing Ceph clusters will be deprecated
in one of the following releases.
Verifying the Ceph cluster state is the entry point for issue investigation.
Through the Ceph Clusters page of the Container Cloud web UI, you
can view a detailed summary on all Ceph clusters deployed, including the
cluster name and ID, health status, number of Ceph OSDs, and so on.
To view Ceph cluster summary:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator
or m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project action
icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The page
with cluster details opens.
In the Ceph Clusters tab, verify the overall cluster health
and rebalancing statuses.
Available since MCC 2.25.0 (Cluster release 17.0.0).
Click Cluster Details:
The Machines tab contains the list of deployed Ceph machines
with the following details:
Status - deployment status
Role - role assigned to a machine, manager or monitor
Storage devices - number of storage devices assigned to a
machine
UP OSDs and IN OSDs - number of up and
in Ceph OSDs belonging to a machine
Note
To obtain details about a specific machine used for Ceph
deployment, in the Clusters > <clusterName> > Machines tab,
click the required machine name containing the storage label.
The OSDs tab contains the list of Ceph OSDs comprising the
Ceph cluster with the following details:
OSD - Ceph OSD ID
Storage Device ID - storage device ID assigned to a Ceph OSD
Type - type of storage device assigned to a Ceph OSD
Partition - partition name where Ceph OSD is located
The starting point for Ceph troubleshooting is the ceph-controller and
rook-operator logs. Once you locate the component that causes issues,
verify the logs of the related pod. This section describes how to verify the
Ceph Controller and Rook objects of a Ceph cluster.
To verify Ceph Controller and Rook:
Verify the Ceph cluster status:
Verify that the status of each pod in the ceph-lcm-mirantis and
rook-ceph namespaces is Running:
For ceph-lcm-mirantis:
kubectl get pod -n ceph-lcm-mirantis
For rook-ceph:
kubectl get pod -n rook-ceph
Verify Ceph Controller. Ceph Controller prepares the configuration that Rook
uses to deploy the Ceph cluster, managed using the KaasCephCluster
resource. If Rook cannot finish the deployment, verify the Rook Operator
logs as described in step 4.
On the managed cluster, verify the MiraCeph subresource:
kubectl get miraceph -n ceph-lcm-mirantis -o yaml
Verify the Rook Operator logs. Rook deploys a Ceph cluster based on custom
resources created by the Ceph Controller, such as pools, clients,
cephcluster, and so on. Rook logs contain details about components
orchestration. For details about the Ceph cluster status and to get access
to CLI tools, connect to the ceph-tools pod as described in step 5.
Verify the Rook Operator logs:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify the CephCluster configuration:
Note
The Ceph Controller manages the CephCluster CR.
Open the CephCluster CR only for verification and do not modify it
manually.
Verify that CLI commands can run on the ceph-tools pod:
ceph -s
Verify hardware:
Through the ceph-tools pod, obtain the required device in your
cluster:
ceph osd tree
Enter all Ceph OSD pods in the rook-ceph namespace one by one:
kubectl exec -it -n rook-ceph <osd-pod-name> bash
Verify that the ceph-volume tool is available on all pods running on
the target node:
ceph-volume lvm list
Verify data access. Ceph volumes can be consumed directly by Kubernetes
workloads and internally, for example, by OpenStack services. To verify the
Kubernetes storage:
Verify the available storage classes. The storage classes that are
automatically managed by Ceph Controller use the
rook-ceph.rbd.csi.ceph.com provisioner.
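For example, one way to list the storage classes and their provisioners:
kubectl get storageclass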
This document does not provide any specific recommendations on
requests and limits for Ceph resources and assumes the native
Ceph resource configuration for any cluster with MOSK.
You can configure Ceph Controller to manage Ceph resources by specifying their
requirements and constraints. To configure the resources consumption for the
Ceph nodes, consider the following options that are based on different Helm
release configuration values:
Configuring tolerations for taint nodes for the Ceph Monitor, Ceph Manager,
and Ceph OSD daemons. For details, see
Taints and Tolerations.
Configuring nodes resources requests or limits for the Ceph daemons and for
each Ceph OSD device class such as HDD, SSD, or NVMe. For details, see
Managing Resources for Containers.
To enable Ceph tolerations and resources management:
To avoid Ceph cluster health issues while changing the daemons configuration,
set the Ceph noout, nobackfill, norebalance, and norecover
flags through the ceph-tools pod before editing Ceph tolerations
and resources:
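For example, using the standard Ceph CLI inside the ceph-tools pod:
ceph osd set noout
ceph osd set nobackfill
ceph osd set norebalance
ceph osd set norecover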
Specifies resources requests or limits. The parameter is a map with
the daemon type as a key and the following structure as a value:
hyperconverge:
  resources:
    <daemonType>:
      requests: <kubernetes valid spec of daemon resource requests>
      limits: <kubernetes valid spec of daemon resource limits>
Possible values for <daemonType> are mon, mgr, osd,
osd-hdd, osd-ssd, osd-nvme, prepareosd, rgw, and
mds.
The osd-hdd, osd-ssd, and osd-nvme resource requirements
handle only the Ceph OSDs with a corresponding device class.
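For example, a minimal sketch of the hyperconverge.resources section in the
KaaSCephCluster spec; the request and limit values below are illustrative only:
spec:
  cephClusterSpec:
    hyperconverge:
      resources:
        mon:
          requests:
            memory: 1Gi
            cpu: "1"
          limits:
            memory: 2Gi
            cpu: "2"
        osd-ssd:
          requests:
            memory: 2Gi
          limits:
            memory: 4Gi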
Save the reconfigured KaaSCephCluster resource and wait for
ceph-controller to apply the updated Ceph configuration. It will
recreate Ceph Monitors, Ceph Managers, or Ceph OSDs according to the
specified hyperconverge configuration.
If you have specified any osd tolerations, additionally specify
tolerations for the rook instances:
Open the Cluster resource of the required Ceph cluster on a
management cluster:
kubectl -n <ClusterProjectName> edit cluster
Substitute <ClusterProjectName> with the project name of the required
cluster.
Specify the parameters in the ceph-controller section of
spec.providerSpec.value.helmReleases:
Specify the hyperconverge.tolerations.rook parameter as required:
In <yamlFormattedKubernetesTolerations>, specify YAML-formatted
tolerations from
cephClusterSpec.hyperconverge.tolerations.osd.rules of the
KaaSCephCluster spec. For example:
In controllers.cephController.replicas,
controllers.cephRequest.replicas, and
controllers.cephStatus.replicas, specify the replicas count. The
default is 3 replicas. For example:
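For example, a possible sketch of these values in the ceph-controller Helm
release. The toleration rule is illustrative and must mirror
cephClusterSpec.hyperconverge.tolerations.osd.rules of your KaaSCephCluster
spec; whether the rook key accepts an inline list or a YAML string block is an
assumption here:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: ceph-controller
        values:
          hyperconverge:
            tolerations:
              rook: |
                - key: ceph-storage-node
                  operator: Exists
                  effect: NoSchedule
          controllers:
            cephController:
              replicas: 3
            cephRequest:
              replicas: 3
            cephStatus:
              replicas: 3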
Save the reconfigured Cluster resource and wait for the
ceph-controller Helm release update. It will recreate Ceph CSI and
discover pods according to the specified
hyperconverge.tolerations.rook configuration.
Specify tolerations for different Rook resources using the following
chart-based options:
hyperconverge.tolerations.rook - general toleration rules for each
Rook service if no exact rules specified
hyperconverge.tolerations.csiplugin - for tolerations of the
ceph-csi plugins DaemonSets
hyperconverge.tolerations.csiprovisioner - for the ceph-csi
provisioner deployment tolerations
hyperconverge.nodeAffinity.csiprovisioner - provides the ceph-csi
provisioner node affinity with a value section
After a successful Ceph reconfiguration, unset the flags set in step 1
through the ceph-tools pod:
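For example:
ceph osd unset noout
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset norecover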
After you enable Ceph resources management as described in
Enable Ceph tolerations and resources management, perform the steps below to verify that the
configured tolerations, requests, or limits have been successfully specified in
the Ceph cluster.
To verify Ceph tolerations and resources management:
To verify that the required tolerations are specified in the Ceph cluster,
inspect the output of the following commands:
To verify that the required resources requests or limits are specified for
the Ceph mon, mgr, or osd daemons, inspect the output of the
following command:
To verify that the required resources requests or limits are specified for
the Ceph OSDs hdd, ssd, or nvme device classes, perform the
following steps:
Identify which Ceph OSDs belong to the <deviceClass> device class in
question:
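For example, a possible way to list the Ceph OSDs of a device class through
the ceph-tools pod:
ceph osd crush class ls-osd <deviceClass>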
Ceph allows establishing multiple IP networks and subnet masks for clusters
with configured L3 network rules. In MOSK, you can configure
multinetwork through the network section of the KaaSCephCluster CR.
Ceph Controller uses this section to specify the Ceph networks for external
access and internal daemon communication. The parameters in the network
section use the CIDR notation, for example, 10.0.0.0/24.
Before enabling multiple networks for a Ceph cluster, consider the following
requirements:
Do not confuse the IP addresses you define with the public-facing IP
addresses the network clients may use to access the services.
If you define more than one IP address and subnet mask for the public or
cluster network, ensure that the subnets within the network can route to
each other.
Add each IP address or subnet defined in the network section to IP tables
and open ports for them as necessary.
The pods of the Ceph OSD and RadosGW daemons use cross-pod health checkers
to verify that the entire Ceph cluster is healthy. Therefore, each CIDR must
be accessible inside Ceph pods.
Avoid using the 0.0.0.0/0 CIDR in the network section. With a zero
range in publicNet and/or clusterNet, the Ceph daemons behavior
is unpredictable.
To enable multinetwork for Ceph:
Select from the following options:
If the Ceph cluster is not deployed on a managed cluster yet, edit the
deployment KaaSCephCluster YAML template.
If the Ceph cluster is already deployed on a managed cluster, open
KaaSCephCluster for editing:
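For example, a typical command, assuming that kubectl targets the management
cluster and that the kaascephcluster resource name is available:
kubectl -n <managedClusterProjectName> edit kaascephcluster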
If you are creating a managed cluster, save the updated
KaaSCephCluster template to the corresponding file and proceed with
the managed cluster creation.
If you are configuring KaaSCephCluster of an existing managed cluster,
exiting the text editor will apply the changes.
Once done, the specified network CIDRs will be passed to the Ceph daemons pods
through the rook-config-override ConfigMap.
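For reference, a minimal sketch of the network section in the KaaSCephCluster
spec; the CIDR values and the comma-separated form for multiple subnets are
assumptions for illustration:
spec:
  cephClusterSpec:
    network:
      publicNet: 10.0.0.0/24,10.0.1.0/24
      clusterNet: 10.10.0.0/24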
This section describes how to configure and use RADOS Block Device (RBD)
mirroring for Ceph pools using the rbdMirror section in the
KaaSCephCluster CR. The feature may be useful if, for example, you have
two interconnected managed clusters. Once you enable RBD mirroring, the
images in the specified pools will be replicated and if a cluster becomes
unreachable, the second one will provide users with instant access to all
images. For details, see Ceph Documentation: RBD Mirroring.
Note
Ceph Controller only supports bidirectional mirroring.
To enable Ceph RBD mirroring, follow the procedure below and use the following
rbdMirror parameters description:
daemonsCount
Count of rbd-mirror daemons to spawn. Mirantis recommends using one
instance of the rbd-mirror daemon.
peers
Optional. List of mirroring peers of an external cluster to connect to.
Only a single peer is supported. The peer section includes the
following parameters:
site - the label of a remote Ceph cluster associated with the
token.
token - the token that will be used by one site (Ceph cluster) to
pull images from the other site. To obtain the token, use the
rbd mirror pool peer bootstrap create command.
pools - optional, a list of pool names to mirror.
To enable Ceph RBD mirroring:
In KaaSCephCluster CRs of both Ceph clusters where you want to enable
mirroring, specify positive daemonsCount in the
spec.cephClusterSpec.rbdMirror section:
spec:
  cephClusterSpec:
    rbdMirror:
      daemonsCount: 1
On both Ceph clusters where you want to enable mirroring, wait for the Ceph
RBD Mirror daemons to start running:
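For example, a possible check, assuming the standard Rook label for the RBD
Mirror daemon pods:
kubectl -n rook-ceph get pod -l app=rook-ceph-rbd-mirror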
In KaaSCephCluster of both Ceph clusters where you want to enable
mirroring, specify the spec.cephClusterSpec.pools.mirroring.mode
parameter for all pools that must be mirrored.
Mirroring mode recommendations
Mirantis recommends using the pool mode for mirroring. For the
pool mode, explicitly enable journaling for each image.
To use the image mirroring mode, explicitly enable mirroring as
described in step 8.
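For example, a minimal sketch of enabling mirroring for a pool:
spec:
  cephClusterSpec:
    pools:
    - name: <mirroringPoolName>
      ...
      mirroring:
        mode: pool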
Substitute <mirroringPoolName> with the name of a pool to be mirrored.
In <siteName>, assign a label for the external Ceph cluster that will be
used along with mirroring.
Substitute <siteName> with the label assigned to the external Ceph
cluster, <bootstrapPeer> with the token obtained in the previous step,
and <mirroringPoolName> with names of pools that have the
mirroring.mode parameter defined.
Substitute <poolName> with the name of a pool with the image
mirroring mode, <imageName> with the name of an image stored in the
specified pool. Substitute <imageMirroringMode> with one of:
journal - for mirroring to use the RBD journaling image feature to
replicate the image contents. If the RBD journaling image feature is not
yet enabled on the image, it will be enabled automatically.
snapshot - for mirroring to use RBD image mirror-snapshots to
replicate the image contents. Once enabled, an initial mirror-snapshot
will automatically be created. To create additional RBD image
mirror-snapshots, use the rbd command.
Since Ceph Pacific, Ceph CSI driver does not propagate the
777 permission on the mount point of persistent volumes based on any
StorageClass of the CephFS data pool.
The Ceph Shared File System, or CephFS, provides the capability to create
read/write shared file system Persistent Volumes (PVs). These PVs support the
ReadWriteMany access mode for the FileSystem volume mode.
CephFS deploys its own daemons called MetaData Servers or Ceph MDS. For
details, see Ceph Documentation: Ceph File System.
Note
By design, CephFS data pool and metadata pool must be replicated
only.
Limitations
CephFS is supported as a Kubernetes CSI plugin that only supports creating
Kubernetes Persistent Volumes based on the FileSystem volume mode.
For a complete modes support matrix, see Ceph CSI: Support Matrix.
Before MOSK 25.1, Ceph Controller supports only one
CephFS installation per Ceph cluster.
Re-creating the CephFS instance in a cluster requires a
different value for the name parameter.
A list of CephFS data pool specifications. Each spec contains the
name, replicated or erasureCoded, deviceClass, and
failureDomain parameters. The first pool in the list is treated
as the default data pool for CephFS and must always be
replicated. The failureDomain parameter may be set to osd
or host, defining the failure domain across which the data will
be spread. The number of data pools is unlimited, but the default
pool must always be present. For example:
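A possible sketch of a dataPools list with a default replicated pool followed
by an erasure-coded pool; the pool names and device classes are illustrative:
dataPools:
- name: default-pool
  deviceClass: ssd
  replicated:
    size: 3
  failureDomain: host
- name: ec-pool
  deviceClass: hdd
  erasureCoded:
    dataChunks: 2
    codingChunks: 1
  failureDomain: host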
Where replicated.size is the number of full copies of data on
multiple nodes.
Warning
When using the non-recommended Ceph pools replicated.size of
less than 3, Ceph OSD removal cannot be performed. The minimal replica
size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is
1, and if replicated.size is 3, then the minimal replica size
is 2. The replica size of 1 allows Ceph to have PGs with only one
Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED
health warning that blocks Ceph OSD removal. Mirantis recommends setting
replicated.size to 3 for each Ceph pool.
Warning
Modifying dataPools on a deployed CephFS has no
effect. You can manually adjust pool settings through the Ceph
CLI. However, for any changes in dataPools, Mirantis
recommends re-creating CephFS.
metadataPool
CephFS metadata pool spec that should only contain replicated,
deviceClass, and failureDomain parameters. The
failureDomain parameter may be set to osd or host,
defining the failure domain across which the data will be spread.
Can use only replicated settings. For example:
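A possible sketch of a metadataPool definition; the device class is
illustrative:
metadataPool:
  deviceClass: nvme
  replicated:
    size: 3
  failureDomain: host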
where replicated.size is the number of full copies of data on
multiple nodes.
Warning
Modifying metadataPool on a deployed CephFS has
no effect. You can manually adjust pool settings through the
Ceph CLI. However, for any changes in metadataPool,
Mirantis recommends re-creating CephFS.
preserveFilesystemOnDelete
Defines whether to preserve the data and metadata pools if CephFS is
deleted. Set to true to avoid accidental data loss in case of
human error. However, for security reasons, Mirantis recommends
setting preserveFilesystemOnDelete to false.
metadataServer
Metadata Server settings correspond to the Ceph MDS daemon settings.
Contains the following fields:
activeCount - the number of active Ceph MDS instances. As load
increases, CephFS will automatically partition the file system
across the Ceph MDS instances. Rook will create twice the number
of Ceph MDS instances requested by activeCount. The extra
instances will be in standby mode for failover. Mirantis
recommends setting this parameter to 1 and increasing the
MDS daemons count only in case of high load.
activeStandby - defines whether the extra Ceph MDS instances
will be in active standby mode and will keep a warm cache of the
file system metadata for faster failover. The instances will be
assigned by CephFS in failover pairs. If false, the extra
Ceph MDS instances will all be in passive standby mode and will
not maintain a warm cache of the metadata. The default value is
false.
resources - represents Kubernetes resource requirements for
Ceph MDS pods.
For example:
cephClusterSpec:
  sharedFilesystem:
    cephFS:
    - name: cephfs-store
      metadataServer:
        activeCount: 1
        activeStandby: false
        resources: # example, non-prod values
          requests:
            memory: 1Gi
            cpu: 1
          limits:
            memory: 2Gi
            cpu: 2
Optional. Override the CSI CephFS gRPC and liveness metrics port. For
example, if an application is already using the default CephFS ports
9092 and 9082, which may cause conflicts on the node.
Open the Cluster CR of a managed cluster for editing:
kubectl -n <managedClusterProjectName> edit cluster
Substitute <managedClusterProjectName> with the corresponding value.
In the spec.providerSpec.helmReleases section, configure
csiCephFsGPCMetricsPort and csiCephFsLivenessMetricsPort as
required. For example:
spec:
  providerSpec:
    helmReleases:
    ...
    - name: ceph-controller
      ...
      values:
        ...
        rookExtraConfig:
          csiCephFsEnabled: true
          csiCephFsGPCMetricsPort: "9092" # should be a string
          csiCephFsLivenessMetricsPort: "9082" # should be a string
Rook will enable the CephFS CSI plugin and provisioner.
Open the KaasCephCluster CR of a managed cluster for editing:
Define the mds role for the corresponding nodes where Ceph MDS daemons
should be deployed. Mirantis recommends labeling only one node with the
mds role. For example:
Once CephFS is specified in the KaaSCephCluster CR, Ceph Controller will
validate it and request Rook to create CephFS. Then Ceph Controller will create
a Kubernetes StorageClass, required to start provisioning the storage,
which will operate the CephFS CSI driver to create Kubernetes PVs.
Available since MCC 2.23.1 (Cluster release 12.7.0). TechPreview
This section describes how to share a Ceph cluster with another managed
cluster of the same management cluster and how to manage such Ceph cluster.
A shared Ceph cluster allows connecting a consumer cluster to a producer
cluster. The consumer cluster uses the Ceph cluster deployed on the producer
to store the necessary data. In other words, the producer cluster contains the
Ceph cluster with mon, mgr, osd, and mds daemons. And the
consumer cluster contains clients that require access to the Ceph storage.
For example, an NGINX application that runs in a cluster without storage
requires a persistent volume to store data. In this case, such a cluster can
connect to a Ceph cluster and use it as a block or file storage.
Limitations
Before MCC 2.24.2 (Cluster release 15.0.1), connection to a shared Ceph
cluster is possible only through the client.admin user.
The producer and consumer clusters must be located in the same
management cluster.
The LCM network of the producer cluster must be available in the
consumer cluster.
Ceph requires a non-admin client to share the producer cluster resources with
the consumer cluster. To connect the consumer cluster with the producer
cluster, the Ceph client requires the following caps (permissions):
Read-write access to Ceph Managers
Read and role-definer access to Ceph Monitors
Read-write access to Ceph Metadata servers if CephFS pools must be shared
Profile access to the shared RBD/CephFS pools for Ceph OSDs
To create a Ceph non-admin client, add the following snippet to the
clients section of the KaaSCephCluster object:
spec:
  cephClusterSpec:
    clients:
    - name: <nonAdminClientName>
      caps:
        mgr: "allow rw"
        mon: "allow r, profile role-definer"
        mds: "allow rw" # if CephFS must be shared
        osd: <poolsProfileCaps>
Substitute <nonAdminClientName> with a Ceph non-admin client name and
<poolsProfileCaps> with a comma-separated profile list of RBD and CephFS
pools in the following format:
profile rbd pool=<rbdPoolName> for each RBD pool
allow rw tag cephfs data=<cephFsName> for each CephFS pool
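For example, a minimal sketch of a non-admin client definition, assuming
hypothetical client, pool, and CephFS names (share-client, kubernetes-hdd,
and cephfs-store):
spec:
  cephClusterSpec:
    clients:
    - name: share-client
      caps:
        mgr: "allow rw"
        mon: "allow r, profile role-definer"
        mds: "allow rw"
        osd: "profile rbd pool=kubernetes-hdd, allow rw tag cephfs data=cephfs-store"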
For backward compatibility, the Ceph client.admin client is
available as <clientName>. However, Mirantis does not recommend
using client.admin for security reasons.
Connect to the producer cluster and generate connectionString.
Proceed according to the MCC version used:
Since MCC 2.25.0 (Cluster release 17.0.0)
Create a KaaSCephOperationRequest resource in a managed cluster
namespace of the management cluster:
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephOperationRequest
metadata:
  name: test-share-request
  namespace: <managedClusterProject>
spec:
  k8sCluster:
    name: <managedClusterName>
    namespace: <managedClusterProject>
  kaasCephCluster:
    name: <managedKaaSCephClusterName>
    namespace: <managedClusterProject>
  share:
    clientName: <clientName>
    clusterID: <namespace/name>
    opts:
      cephFS: true # if the consumer cluster will use the CephFS storage
After KaaSCephOperationRequest is applied, wait until the
Prepared state displays in the status.shareStatus section.
Obtain connectionString from the status.shareStatus section.
The example of the status section:
<consumerClusterProjectName> is the project name of the consumer
managed cluster on the management cluster.
<clusterName> is the consumer managed cluster name.
<generatedConnectionString> is the connection string generated in
the previous step.
<clusterNetCIDR> and <publicNetCIDR> are values that must match
the same values in the producer KaaSCephCluster object.
Note
The spec.cephClusterSpec.network and
spec.cephClusterSpec.nodes parameters are mandatory.
The connectionString parameter is specified in the
spec.cephClusterSpec.external section of the KaaSCephCluster CR.
The parameter contains an encrypted string with all the configurations
needed to connect the consumer cluster to the shared Ceph cluster.
Apply consumer-kcc.yaml on the management cluster:
kubectl apply -f consumer-kcc.yaml
Once the Ceph cluster is specified in the KaaSCephCluster CR of the
consumer cluster, Ceph Controller validates it and requests Rook to connect
the consumer and producer.
Substitute <managedClusterProjectName> with the corresponding value.
In the spec.cephClusterSpec.pools, specify pools from the producer
cluster to be used by the consumer cluster. For example:
Caution
Each name in the pools section must match the
corresponding full pool name of the producer cluster. You can find
full pool names in the KaaSCephCluster CR by the following path:
status.fullClusterInfo.blockStorageStatus.poolsStatus.
After specifying pools in the consumer KaaSCephCluster CR, Ceph Controller
creates a corresponding StorageClass for each specified pool, which can be
used for creating ReadWriteOnce persistent volumes (PVs) in the consumer
cluster.
Substitute <managedClusterProjectName> with the corresponding value.
In the sharedFilesystem section of the consumer cluster, specify
the dataPools to share.
Note
Sharing CephFS also requires specifying the metadataPool
and metadataServer sections similarly to the corresponding sections
of the producer cluster. For details, see CephFS specification.
After specifying CephFS in the KaaSCephCluster CR of the consumer
cluster, Ceph Controller creates a corresponding StorageClass that allows
creating ReadWriteMany (RWX) PVs in the consumer cluster.
If you need to configure the placement of Rook daemons on nodes, you can add
extra values in the Cluster providerSpec section of the
ceph-controller Helm release.
The procedures in this section describe how to specify the placement of
rook-ceph-operator, rook-discover, and csi-rbdplugin.
To specify rook-ceph-operator placement:
On the management cluster, edit the Cluster resource of the target
managed cluster:
kubectl -n <managedClusterProjectName> edit cluster
Add the following parameters to the ceph-controller Helm release values:
Substitute <labelSelectorX> with a valid Kubernetes label selector
expression to place the rook-discover and csi-rbdplugin DaemonSet
pods. For example, "role=storage-node;discover=true".
Wait for some time and verify on the managed cluster that the changes have
applied:
Migrate Ceph pools from one failure domain to another
The document describes how to change the failure domain of an already deployed
Ceph cluster.
Note
This document focuses on changing the failure domain from a smaller
to a wider one, for example, from host to rack. Using the same
instruction, you can also move the failure domain from a wider to a smaller one.
Caution
Data movement implies the Ceph cluster rebalancing that may impact
cluster performance, depending on the cluster size.
High-level overview of the procedure includes the following steps:
Set correct labels on the nodes.
Create the new bucket hierarchy.
Move nodes to new buckets.
Modify the CRUSH rules.
Add the manual changes to the KaaSCephCluster spec.
Verify that the Ceph cluster has enough space for multiple copies of data
to migrate. Mirantis highly recommends that the Ceph cluster has a minimum
of 25% of free space for the procedure to succeed.
Note
The migration procedure implies data movement and optional
modification of CRUSH rules that cause a large amount of data (depending
on the cluster size) to be first copied to a new location in the Ceph
cluster before data removal.
Create a backup of the current KaaSCephCluster object from the managed
namespace of the management cluster:
This procedure contains an example of moving failure domains of all pools from
host to rack. Using the same instruction, you can migrate pools from
other types of failure domains, migrate pools separately, and so on.
To migrate Ceph pools from one failure domain to another:
Set the required CRUSH topology in the KaaSCephCluster object for each
defined node. For details on the crush parameter, see Node parameters.
Setting the CRUSH topology to each node causes the Ceph Controller
to set proper Kubernetes labels on the nodes.
Example of adding the rack CRUSH topology key for each
node in the nodes section
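A possible sketch, assuming that the nodes section of the KaaSCephCluster spec
accepts a crush map with a rack topology key; the node and rack names are
illustrative:
spec:
  cephClusterSpec:
    nodes:
      kaas-node-1:
        crush:
          rack: rack-1
      kaas-node-2:
        crush:
          rack: rack-2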
Erasure-coded pools require a different number of buckets to
store data. Instead of the number of replicas in replicated
pools, erasure-coded pools require the coding chunks + data chunks
number of buckets to exist in the Ceph cluster. For example, if an
erasure-coded pool has 2 coding chunks and 2 data chunks configured,
then the pool requires 4 different buckets, for example, 4 racks,
to store data.
Obtain the current parameters of the erasure-coded profile:
ceph osd erasure-code-profile get <ecProfile>
In the profile, add the new bucket type as the failure domain using
the crush-failure-domain parameter:
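For example, a possible command, assuming a new profile with 2 data and
2 coding chunks:
ceph osd erasure-code-profile set <newEcProfile> k=2 m=2 crush-failure-domain=rack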
Erasure-coded profiles cannot be renamed, so the old profiles will
not be removed automatically during pools cleanup. Remove them
manually, if needed.
Exit the ceph-tools pod.
In the management cluster, update the KaaSCephCluster object by setting
the failureDomain:rack parameter for each pool. The configuration from
the Rook perspective must match the manually created configuration.
For example:
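A possible sketch of a pool definition with the updated failure domain; the
pool name and device class are illustrative:
spec:
  cephClusterSpec:
    pools:
    - name: kubernetes
      deviceClass: hdd
      replicated:
        size: 3
      failureDomain: rack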
Performance testing affects the overall Ceph cluster performance.
Do not run it unless you are sure that user load will not be affected.
This section describes how to configure periodic Ceph performance testing using
Kubernetes batch or cron jobs that execute a
fio process in a
separate container with a connection to the Ceph cluster. The test results can
then be stored in a persistent volume attached to the container.
Ceph performance testing is managed by the KaaSCephOperationRequest CR that
creates separate CephPerfTestRequest requests to handle the test run. Once
you configure the perfTest section of the KaaSCephOperationRequest
spec, it propagates to CephPerfTestRequest on the managed cluster in the
ceph-lcm-mirantis namespace. You can create a performance test for a single
run or for scheduled runs.
Performance testing affects the overall Ceph cluster performance.
Do not run it unless you are sure that user load will not be affected.
This section describes how to create a Ceph performance test request through
the KaaSCephOperationRequest CR.
To create a Ceph performance test request:
Create an RBD image with the required parameters. For example, run the
following command in ceph-tools-container to allow execution of the
perftest example below on a managed cluster:
Substitute <ceph-tools-pod> with the ceph-tools Pod ID,
<pool_name> and <image_name> with pool and image names, and
specify the size. In the example below, mirablock-k8s-block-hdd is used
as pool name and tests as image name:
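A possible command, assuming a 10 GiB image size:
kubectl -n rook-ceph exec -it <ceph-tools-pod> -- rbd create mirablock-k8s-block-hdd/tests --size 10G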
Substitute <managedKubeconfig> with the managed cluster kubeconfig
and <name> with the KaaSCephOperationRequest metadata.name, for
example, test-perf-req.
Optional. Remove the KaaSCephOperationRequest. Removal of
KaaSCephOperationRequest also removes the CephPerfTestRequest CR
propagated to the managed cluster.
This section describes the KaaSCephOperationRequest CR specification used
to automatically create a CephPerfTestRequest request. For the procedure
workflow, see Enable periodic Ceph performance testing.
Spec of the KaaSCephOperationRequest perfTest high-level parameters
Parameter
Description
perfTest
Describes the definition for the CephPerfTestRequest spec. For
details on the perfTest parameters, see the tables below.
kaasCephCluster
Defines the KaaSCephCluster resource on which the KaaSCephOperationRequest
depends. Use the kaasCephCluster parameter if the name or project
of the corresponding Container Cloud cluster differs from the default
one:
Defines the cluster on which the KaaSCephOperationRequest depends.
Use the k8sCluster parameter if the name or project of the
corresponding Container Cloud cluster differs from the default one:
spec:
  k8sCluster:
    name: kaas-mgmt
    namespace: default
If you omit this parameter, ceph-kcc-controller will set it
automatically.
A list of command arguments for a performance test execution. For all
available parameters, see fio documentation.
Note
Performance test results will be saved on a PVC if the test
run parameters contain an argument to save to a file. Otherwise, test
results will be saved only as Pod logs. For example, for the default
fio image, use the --output=/results/<fileName> option to
redirect to a file that will be saved on the attached PVC.
Configuring a mount point is not supported.
command
Optional. Entrypoint command to run performance test in the container.
If the performance image is updated, you may also update the command.
By default, equals the image entry point.
image
Container image to use for jobs. By default, vineethac/fio_image.
Mirantis recommends using the default fio image as it supports
a multitude of I/O engines.
For details, see the fio man page.
periodic
Configuration of the performance test runs as periodic jobs. Leave empty
if a single run is required. For details, see
Ceph performance periodic parameters.
saveResultOnPvc
Option that enables saving of the performance test results on a PVC.
Contains the following fields:
pvcName - PVC name to use. If not specified, a PVC name for the
performance test will be created automatically. Namespace is static
and equals rook-ceph.
pvcStorageClass - StorageClass to use for PVC. If not specified,
the default storage class is used.
pvcSize - PVC size, defaults to 10Gi.
preserveOnDelete - PVC preservation after removal of the performance
test.
This section describes the status.perfTestStatus fields of the
KaaSCephOperationRequest CR that you can use to check the status of a Ceph
performance test request.
Note
Performance test results will be saved on PVC if the test run
parameters contain the saveResultOnPvc option. Otherwise, test
results will be saved only as Pod logs. For details, see
Ceph performance test parameters.
Status of the KaaSCephOperationRequest high-level parameters
Status of the KaaSCephOperationRequest perfTestStatus parameters
Parameter
Description
phase
Describes the current request phase:
Pending - the request is created and placed in the request queue.
Scheduling - the performance test is handled, waiting for a Pod to
be scheduled for the run.
WaitingNextRun - the performance test is waiting for the next run
of the periodic job.
Running - the performance test is executing.
Finished - the performance test executed successfully.
Suspended - the performance test is suspended. Only for periodic
jobs.
Failed - the performance test failed.
LastStartTime
The last start time of the performance test execution.
LastDurationTime
The duration of the last successful performance test.
LastJobStatus
The execution status of the last performance test.
messages
Issues or warnings found during the performance test run.
results
Location of the performance test result. Contains the following fields:
perftestReference - reference to the job or cron job with the
performance test run.
referenceNamespace - namespace of the job or cron job with the
performance test run.
storedOnPvc - location of the performance test results on a PVC with
pvcName in pvcNamespace if the test run parameters contain the
saveResultOnPvc option.
statusHistory
History of statuses and timings for cron jobs:
StartTime - start time of the previous performance test
JobStatus - last status of the performance test
DurationTime - last duration of the performance test
Messages - issues that occurred during the previous performance test
This section describes how to configure StackLight in your Mirantis OpenStack
for Kubernetes deployment and includes the description of StackLight parameters
and their verification.
This section describes the StackLight configuration keys that you can specify
in the values section to change StackLight settings as required. Prior to
making any changes to StackLight configuration, perform the steps described in
StackLight configuration procedure.
After changing StackLight configuration, verify the changes as described in
Verify StackLight after configuration.
Important
Some parameters are marked as mandatory. Failure to specify
values for such parameters causes the Admission Controller to reject cluster
creation.
This section describes the OpenStack-related StackLight configuration keys.
For MOSK cluster configuration keys, see
MOSK cluster configuration parameters.
External FQDN used to communicate with OpenStack services for
certificates monitoring. The option is deprecated, use
openstack.externalFQDNs.enabled instead.
https://os.ssl.mirantis.net/
openstack.externalFQDNs.enabled (bool)
External FQDN used to communicate with OpenStack services. Used for
certificates monitoring. Set to false by default.
true or false
openstack.insecure (string)
Defines whether to verify the trust chain of the OpenStack endpoint SSL
certificates during monitoring.
Specifies the interval of metrics gathering from the OpenStack API. Set
to 1m by default.
1m, 3m
openstack.telegraf.insecure (bool)
Enables or disables the server certificate chain and host name
verification. Set to true by default.
true or false
openstack.telegraf.skipPublicEndpoints (bool)
Enables or disables HTTP probes for public endpoints from the OpenStack
service catalog. Set to false by default, meaning that Telegraf
verifies all endpoints from the OpenStack service catalog, including
the public, admin, and internal endpoints.
Available since MOSK 23.3. Defines the timeout of
the tungstenfabric-exporter client requests. Set to 5s by default.
tungstenFabricMonitoring:
  exportersTimeout: "5s"
tungstenFabricMonitoring.analyticsEnabled (bool)
Available since MOSK 24.1. Enables or disables
monitoring of the Tungsten Fabric analytics services.
In MOSK 24.1, defaults to true.
Since MOSK 24.2, the default value is set
automatically based on the real state of the Tungsten Fabric analytics
services (enabled or disabled) in the Tungsten Fabric cluster.
Defines custom alerts. Also, modifies or disables existing alert
configurations. For the list of predefined alerts, see StackLight alerts.
While adding or modifying alerts, follow the Alerting rules.
customAlerts:
# To add a new alert:
- alert: ExampleAlert
  annotations:
    description: Alert description
    summary: Alert summary
  expr: example_metric > 0
  for: 5m
  labels:
    severity: warning
# To modify an existing alert expression:
- alert: AlertmanagerFailedReload
  expr: alertmanager_config_last_reload_successful == 5
# To disable an existing alert:
- alert: TargetDown
  enabled: false
An optional field enabled is accepted in the alert body to disable
an existing alert by setting to false. All fields specified using
the customAlerts definition override the default predefined
definitions in the charts’ values.
On the managed clusters with limited Internet access, proxy is required for
StackLight components that use HTTP and HTTPS and are disabled by default but
need external access if enabled, for example, for the Salesforce integration
and Alertmanager notifications external rules.
Disables or enables alert inhibition rules. If enabled, Alertmanager
decreases alert noise by suppressing dependent alerts notifications to
provide a clearer view on the cloud status and simplify troubleshooting.
Enabled by default. For details, see Alert dependencies. For
details on inhibition rules, see Prometheus documentation.
On the managed clusters with limited Internet access, proxy is required for
StackLight components that use HTTP and HTTPS and are disabled by default but
need external access if enabled. The Microsoft Teams integration depends on the
Internet access through HTTPS.
Key
Description
Example values
alertmanagerSimpleConfig.msteams.enabled (bool)
Enables or disables Alertmanager integration with Microsoft Teams.
Requires a set up Microsoft Teams channel and a channel connector. Set
to false by default.
true or false
alertmanagerSimpleConfig.msteams.url (string)
Defines the URL of an Incoming Webhook connector of a
Microsoft Teams channel. For details about channel connectors, see
Microsoft documentation.
On the managed clusters with limited Internet access, proxy is required for
StackLight components that use HTTP and HTTPS and are disabled by default but
need external access if enabled. The Salesforce integration depends on the
Internet access through HTTPS.
Key
Description
Example values
clusterId (string)
Unique cluster identifier
clusterId="<ClusterProject>/<ClusterName>/<UID>",
generated for each cluster using Cluster Project,
Cluster Name, and cluster UID, separated by a slash. Used
for both sf-notifier and sf-reporter services.
The clusterId is automatically defined for each cluster.
Do not set or modify it manually.
Prior to configuring the integration with ServiceNow, perform the
following prerequisite steps using the ServiceNow documentation of the
required version.
In a new or existing Incident table, add the Alert ID field
as described in Add fields to a table.
To avoid alerts duplication, select Unique.
Create an Access Control List (ACL) with read/write permissions for the
Incident table as described in Securing table
records.
Enables or disables Alertmanager integration with ServiceNow. Set to
false by default. Requires a set up ServiceNow account and
compliance with the Incident table requirements above.
true or false
alertmanagerSimpleConfig.serviceNow (map)
Defines the ServiceNow parameters and credentials for integration with
Alertmanager:
incident_table - name of the table created in ServiceNow. Do not
confuse with the table label.
api_version - version of the ServiceNow HTTP API. By default,
v1.
alert_id_field - name of the unique string field configured in
ServiceNow to hold Prometheus alert IDs. Do not confuse with the table
label.
auth.instance - URL of the instance.
auth.username - name of the ServiceNow user account with access to
Incident table.
auth.password - password of the ServiceNow user account.
On the managed clusters with limited Internet access, proxy is required for
StackLight components that use HTTP and HTTPS and are disabled by default but
need external access if enabled. The Slack integration depends on the Internet
access through HTTPS.
Enables or disables the Watchdog alert that constantly fires as
long as the entire alerting pipeline is functional. You can use this
alert to verify that Alertmanager notifications properly flow to the
Alertmanager receivers. Set to true by default.
Specifies the approximate expected cluster size. Set to small by
default. Other possible values include medium and large.
Depending on the choice, appropriate resource limits are passed
according to the resources or resourcesPerClusterSize parameter.
Caution
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0), resourcesPerClusterSize is deprecated and is overridden
by the resources parameter. Therefore, use the resources
parameter instead.
The values differ by the OpenSearch and Prometheus resource limits:
small (default) - 2 CPU, 6 Gi RAM for OpenSearch, 1 CPU,
8 Gi RAM for Prometheus. Use small only for testing and evaluation
purposes with no workloads expected.
medium - 4 CPU, 16 Gi RAM for OpenSearch, 3 CPU, 16 Gi RAM
for Prometheus.
large - 8 CPU, 32 Gi RAM for OpenSearch, 6 CPU, 32 Gi RAM
for Prometheus. Set to large only in case of lack of resources for
OpenSearch and Prometheus.
Removed in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0).
Disables Grafana Image Renderer. For example, for resource-limited
environments. Enabled by default.
true or false
grafana.homeDashboard (string)
Defines the home dashboard. Set to kubernetes-cluster by default.
You can define any of the available dashboards.
Available since MCC 2.25.1 (Cluster releases 17.0.1 and 16.0.1)
Key
Description
Example values
networkPolicies.enabled (bool)
Enables or disables the Kubernetes Network Policy resource that allows
controlling network connections to and from Pods deployed in the
stackLight namespace. Enabled by default.
Adds extra namespaces to collect Kubernetes Pod logs from. Requires
logging.enabled and logging.namespaceFiltering.logs.enabled
set to true. Defines a YAML-formatted list of namespaces,
which is empty by default.
Limits the number of namespaces for Kubernetes events collection.
Disabled by default because the sysdig scanner is present on some
MOSK clusters and because cluster-scoped objects
produce events to the default namespace by default, which is not
passed to the StackLight configuration. Requires
logging.enabled set to true.
Adds extra namespaces to collect Kubernetes events from. Requires
logging.enabled and logging.namespaceFiltering.events.enabled
set to true. Defines a YAML-formatted list of namespaces,
which is empty by default.
Defines the log verbosity level for all StackLight components if not
defined using component. To use the component default log verbosity
level, leave the string empty.
trace - most verbose log messages, generates large amounts of data
debug - messages typically of use only for debugging purposes
info - informational messages describing common processes such as
service starting or stopping; can be ignored during normal system
operation but may provide additional input for investigation
warn - messages about conditions that may require attention
error - messages on error conditions that prevent normal system
operation and require action
crit - messages on critical conditions indicating that a service
is not working, working incorrectly or is unusable, requiring
immediate attention
Since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0),
the NO_SEVERITY severity label is automatically added to a log with
no severity label in the message. This enables greater control over determining
which logs Fluentd processes and which ones are skipped by mistake.
stacklightLogLevels.component (map)
Defines (overrides the default value) the log verbosity level for
any StackLight component separately. To use the component default log
verbosity, leave the string empty.
Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Sets the least important level of log messages to send to OpenSearch.
Requires logging.enabled set to true.
The default logging level is INFO, meaning that StackLight will
drop log messages for the lower DEBUG and TRACE levels. Levels
from WARNING to EMERGENCY require attention.
Note
The FLUENTD_ERROR logs are of special type and cannot be
dropped.
TRACE - the most verbose logs. Such level generates large amounts
of data.
DEBUG- messages typically of use only for debugging purposes.
INFO - informational messages describing common processes such as
service starting or stopping. Can be ignored during normal system
operation but may provide additional input for investigation.
NOTICE - normal but significant conditions that may require
special handling.
WARNING - messages on unexpected conditions that may require
attention.
ERROR - messages on error conditions that prevent normal system
operation and require action.
CRITICAL - messages on critical conditions indicating that a
service is not working or working incorrectly.
ALERT - messages on severe events indicating that action is needed
immediately.
EMERGENCY - messages indicating that a service is unusable.
logging.metricQueries (map)
Allows configuring OpenSearch queries for the data present in
OpenSearch. Prometheus Elasticsearch Exporter then queries the
OpenSearch database and exposes such metrics in the
Prometheus format. For details, see Create logs-based metrics.
Includes the following parameters:
indices - specifies the index pattern
interval and timeout - specify in seconds how often to send the
query to OpenSearch and how long it can last before timing out
onError and onMissing - modify the prometheus-es-exporter
behavior on query error and missing index. For details,
see Prometheus Elasticsearch Exporter.
Available since MCC 2.25.0 (Cluster releases 17.0.0 and 16.0.0)
Key
Description
Example values
logging.enforceOopsCompression
Enforces 32 GB of heap size, unless the defined memory limit allows using
50 GB of heap. Requires logging.enabled set to true.
Enabled by default. When disabled, StackLight computes heap as ⅘ of
the set memory limit for any resulting heap value. For more details,
see Tune OpenSearch performance.
Available since MCC 2.23.0 (Cluster release 11.7.0)
Key
Description
Example values
logging.externalOutputs (map)
Specifies external Elasticsearch, OpenSearch, and syslog destinations
as fluentd-logs outputs. Requires logging.enabled:true. For
configuration procedure, see Enable log forwarding to external destinations.
Available since MCC 2.23.0 (Cluster release 11.7.0)
Key
Description
Example values
logging.externalOutputSecretMounts (map)
Specifies authentication secret mounts for external log destinations.
Requires logging.externalOutputs to be enabled and a Kubernetes
secret to be created under the stacklight namespace. Contains the
following values:
secretName
Mandatory. Kubernetes secret name.
mountPath
Mandatory. Mount path of the Kubernetes secret defined in secretName.
defaultMode
Optional. Decimal number defining secret permissions, 420 by default.
Deprecated since MCC 2.23.0 (Cluster release 11.7.0)
Note
Since Container Cloud 2.23.0 (Cluster release 11.7.0),
logging.syslog is deprecated for the sake of
logging.externalOutputs. For details, see
Logging to external outputs.
Key
Description
Example values
logging.syslog.enabled (bool)
Enables or disables remote logging to syslog. Disabled by default.
Requires logging.enabled set to true. For details and
configuration example, see Enable remote logging to syslog.
true or false
logging.syslog.host (string)
Specifies the remote syslog host.
remote-syslog.svc
logging.syslog.level (string)
Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Specifies logging level for the syslog output.
INFO
logging.syslog.port (string)
Specifies the remote syslog port.
514
logging.syslog.packetSize (string)
Defines the packet size in bytes for the syslog logging output. Set to
1024 by default. May be useful for syslog setups allowing packet
size larger than 1 kB. Mirantis recommends that you tune this parameter
to allow sending full log lines.
1024
logging.syslog.protocol (bool)
Specifies the remote syslog protocol. Set to udp by default.
tcp or udp
logging.syslog.tls.enabled (bool)
Optional. Disabled by default. Enables or disables TLS. Use TLS only
for the TCP protocol. TLS will not be enabled if you set a protocol
other than TCP.
true or false
logging.syslog.tls.verify_mode (int)
Optional. Configures TLS verification.
0 for OpenSSL::SSL::VERIFY_NONE
1 for OpenSSL::SSL::VERIFY_PEER
2 for OpenSSL::SSL::VERIFY_FAIL_IF_NO_PEER_CERT
4 for OpenSSL::SSL::VERIFY_CLIENT_ONCE
logging.syslog.tls.certificate (string)
Defines how to pass the certificate. secret takes precedence over
hostPath.
secret - specifies the name of the secret holding the certificate.
hostPath - specifies an absolute host path to the PEM certificate.
Optional. Overrides tag_include. Sets logs by tags to exclude from the
destination output. For example, to exclude all logs with the test tag,
set tag_exclude:'/.*test.*/'.
How to obtain tags for logs
Select from the following options:
In the main OpenSearch output, use the logger field that equals the
tag.
Use logs of a particular Pod or container by following the below order,
with the first match winning:
The value of the app Pod label. For example, for
app=opensearch-master, use opensearch-master as the log tag.
The value of the k8s-app Pod label.
The value of the app.kubernetes.io/name Pod label.
If a release_group Pod label exists and the component Pod label
starts with app, use the value of the component label as the tag.
Otherwise, the tag is the application label joined to the component
label with a -.
The name of the container from which the log is taken.
The values for tag_exclude and tag_include are placed into
<match> directives of Fluentd and only accept regex types that are
supported by the <match> directive of Fluentd. For details, refer to the
Fluentd official documentation.
'{fluentd-logs,systemd}'
tag_include (string)
Since MCC 2.23.0 (11.7.0)
Optional. Is overridden by tag_exclude. Sets logs by tags to include to
the destination output. For example, to include all logs with the auth
tag, set tag_include:'/.*auth.*/'.
Enables or disables HTTP endpoints monitoring. If enabled, the
monitoring tool performs the probes against the defined endpoints every
15 seconds. Set to false by default.
Defines the directory path with external endpoints certificates on host.
/etc/ssl/certs/
externalEndpointMonitoring.domains (slice)
Defines the list of HTTP endpoints to monitor. The endpoints must
successfully respond to a liveness probe. For success, a request to a
specific endpoint must result in a 2xx HTTP response code.
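For example, a minimal sketch; the enabled key name and the endpoint URL are
assumptions for illustration:
externalEndpointMonitoring:
  enabled: true
  domains:
  - https://prometheus.io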
Enables or disables StackLight to monitor and alert on the expiration
date of the TLS certificate of an HTTPS endpoint. If enabled, the
monitoring tool performs the probes against the defined endpoints every
hour. Set to false by default.
true or false
sslCertificateMonitoring.domains (slice)
Defines the list of HTTPS endpoints to monitor the certificates from.
On the clusters that run large-scale workloads, workload monitoring
generates a big amount of resource-consuming metrics. To prevent
generation of excessive metrics, you can disable workload monitoring in
the StackLight metrics and monitor only the infrastructure.
The metricFilter parameter enables the cAdvisor (Container
Advisor) and kubeStateMetrics metric ingestion filters for
Prometheus. Set to false by default. If set to true, you can
define the namespaces to which the filter will apply. The parameter is
designed for managed clusters.
Defines the NodeSelector to use for the most of StackLight pods
(except some pods that refer to DaemonSets) if the NodeSelector
of a component is not defined.
default:
  role: stacklight
nodeSelector.component (map)
Defines the NodeSelector to use for particular StackLight component
pods. Overrides nodeSelector.default.
Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Specifies the retention time per index. Includes the following parameters:
logstash - specifies the logstash-* index retention time.
events - specifies the kubernetes_events-* index retention
time.
notifications - specifies the notification-* index retention
time.
The allowed values include integers (days) and numbers with suffixes:
y, m, w, d, h, including capital letters.
By default, values set in elasticsearch.logstashRetentionTime are
used. However, the elasticsearch.retentionTime parameters, if
defined, take precedence over elasticsearch.logstashRetentionTime.
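For example, a possible sketch of per-index retention settings using suffixed
values; the values are illustrative only:
elasticsearch:
  retentionTime:
    logstash: 5d
    events: 1w
    notifications: 1m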
Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Defines the OpenSearch (Elasticsearch) logstash-* index retention
time in days. The logstash-* index stores all logs gathered from
all nodes and containers. Set to 1 by default.
Note
Due to the known issue 27732-2,
a custom setting for this parameter is dismissed during cluster deployment
and changes to one day (default). Refer to the known issue description
for the affected Cluster releases and available workaround.
Specifies the OpenSearch (Elasticsearch) PVC(s) size. The number of PVCs
depends on the StackLight database mode. For HA, three PVCs will be
created, each of the size specified in this parameter. For non-HA, one
PVC of the specified size.
Important
You cannot modify this parameter after cluster creation.
Note
Due to the known issue 27732-1,
that is fixed in Container Cloud 2.22.0 (Cluster releases 11.6.0 and 12.7.0),
the OpenSearch PVC size configuration is dismissed during a cluster
deployment. Refer to the known issue description for affected
Cluster releases and available workarounds.
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Optional. Specifies the number of gigabytes that is exclusively available
for the OpenSearch data.
Since Container Cloud 2.29.0 (Cluster releases 17.4.0 and 16.4.0),
defines the ceiling for storage-based retention, though only a portion of
this storage is available for indices, depending on the total size
and cluster configuration.
Before Container Cloud 2.29.0 (Cluster releases 17.3.0, 16.3.0, or earlier),
defines the ceiling for storage-based retention where 80% of the defined value
is assumed as available disk space for normal OpenSearch node functioning.
If not set (by default), the number of gigabytes from
elasticsearch.persistentVolumeClaimSize is used.
This parameter is useful in the following cases:
The real storage behind the volume is shared between multiple consumers.
As a result, OpenSearch cannot use all elasticsearch.persistentVolumeClaimSize.
The real volume size is bigger than elasticsearch.persistentVolumeClaimSize.
As a result, OpenSearch can use more than elasticsearch.persistentVolumeClaimSize.
Additional configuration for opensearch.yml that allows setting
various OpenSearch parameters, including logging settings, node watermarks,
and other cluster-level configurations.
Since Container Cloud 2.29.0 and MOSK 25.1, by default,
StackLight manages watermarks efficiently (low/high/flood: 150/100/50 GB).
If .extraConfig sets any watermark, StackLight stops managing them.
In this case, explicitly set all watermarks using absolute values instead of
percentages to prevent issues. While percentages are accepted, they may cause
unexpected behavior, especially in clusters that use LVP as a storage
provisioner, where OpenSearch shares storage with other components.
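For example, a sketch that sets all three watermarks explicitly with absolute values,
mirroring the default 150/100/50 GB thresholds mentioned above; the elasticsearch parent
key for .extraConfig is assumed:
elasticsearch:
  extraConfig:
    cluster.routing.allocation.disk.watermark.low: 150gb
    cluster.routing.allocation.disk.watermark.high: 100gb
    cluster.routing.allocation.disk.watermark.flood_stage: 50gb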
Defines the minimum amount of time for Prometheus to wait before
resending an alert to Alertmanager. Passed to the
--rules.alert.resend-delay flag. Set to 2m by default.
2m, 90s
prometheusServer.alertsCommonLabels (dict)
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Defines the list of labels to be injected to firing alerts while
they are sent to Alertmanager. Empty by default.
The following labels are reserved for internal purposes and cannot
be overridden: cluster_id, service, severity.
Caution
When new labels are injected, Prometheus sends alert updates
with a new set of labels, which can potentially cause Alertmanager
to have duplicated alerts for a short period of time if the cluster
currently has firing alerts.
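For example, to attach a custom label to every firing alert (the label name and value
are hypothetical; do not use the reserved labels listed above):
prometheusServer:
  alertsCommonLabels:
    environment: staging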
Specifies the Prometheus PVC(s) size. The number of PVCs depends on the
StackLight database mode. For HA, three PVCs will be created,
each of the size specified in this parameter. For non-HA, one PVC of the
specified size.
Important
You cannot modify this parameter after cluster creation.
prometheusServer:
  persistentVolumeClaimSize: 16Gi
prometheusServer.queryConcurrency (string)
Available since Container Cloud 2.24.0 (Cluster release 14.0.0).
Defines the number of concurrent queries limit. Passed to the
--query.max-concurrency flag. Set to 20 by default.
25
prometheusServer.retentionSize (string)
Defines the Prometheus database retention size. Passed to the
--storage.tsdb.retention.size flag. Set to 15GB by default.
15GB, 512MB
prometheusServer.retentionTime (string)
Defines the Prometheus database retention period. Passed to the
--storage.tsdb.retention.time flag. Set to 15d by default.
Specifies a set of custom Blackbox Exporter modules. For details, see
Blackbox Exporter configuration: module.
The http_2xx, http_2xx_verify, http_openstack,
http_openstack_insecure, tls, tls_verify names are reserved
for internal usage and any overrides will be discarded.
Specifies the offset to subtract from timeout in seconds
(--timeout-offset), upper bounded by 5.0 to comply with the built-in
StackLight functionality. If nothing is specified, the Blackbox
Exporter default value is used. For example, for Blackbox Exporter
v0.19.0, the default value is 0.5.
Defines custom Prometheus scrape configurations. For details, see
Prometheus documentation: scrape_config.
The names of default StackLight scrape configurations, which you can
view in the Status -> Targets tab of the Prometheus web UI,
are reserved for internal usage and any overrides will be discarded.
Therefore, provide unique names to avoid overrides.
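A sketch of a custom scrape configuration, assuming customScrapeConfigs accepts a map
keyed by a unique scrape job name; the job name and target address are hypothetical:
prometheusServer:
  customScrapeConfigs:
    my-exporter:
      metrics_path: /metrics
      static_configs:
        - targets:
            - my-exporter.example.svc:9100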
Available since Container Cloud 2.24.0 (Cluster release 14.0.0)
Key
Description
Example values
metricsFiltering.enabled (bool)
Configuration for managing Prometheus metrics filtering. When enabled
(default), only actively used and explicitly white-listed metrics get
scraped by Prometheus.
prometheusServer:
  metricsFiltering:
    enabled: true
metricsFiltering.extraMetricsInclude (map)
List of extra metrics to whitelist, which are dropped by default.
Contains the following parameters:
<jobname> - scraping job name as a key for extra white-listed
metrics to add under the key. For the list of job names, see
White list of Prometheus scrape jobs.
If a job name is not present in this list, its target metrics are not
dropped and are collected by Prometheus by default.
You can also use group key names to add metrics to more than one job
using _group-<keyname>.
The following list combines jobs by groups:
The prometheus-coredns job from the
go-collector-metrics and process-collector-metrics groups
is removed in Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
<listofmetricstocollect> - extra metrics of <jobname>
to be white-listed.
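For example, to whitelist an extra metric for one of the scrape jobs; the job name must
be present in the white list referenced above, and the metric shown is illustrative only:
prometheusServer:
  metricsFiltering:
    extraMetricsInclude:
      kubernetes-nodes-cadvisor:
        - container_network_receive_errors_total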
Excludes RegExp-specified network devices from monitoring. The number of
network interface-related metrics is significant and may cause extended
Prometheus RAM usage in big clusters. Therefore, Prometheus Node
Exporter collects information only about a basic set of interfaces (both
host and container) and excludes the following interfaces from monitoring:
veth/cali - the host-side part of the container-host Ethernet
tunnel
o-hm0 - the OpenStack Octavia management interface for
communication with the amphora machine
tap, qg-, qr-, ha- - the Open vSwitch virtual bridge
ports
br-(ex|int|tun) - the Open vSwitch virtual bridges
docker0, br- - the Docker bridge (master for the veth
interfaces)
ovs-system - the Open vSwitch interface (mapping interfaces to
bridges)
To enable the collection of information for the interfaces above, edit the list
of excluded devices as needed.
Enables Node Exporter collectors. For a list of available collectors,
see Node Exporter Collectors. The
following collectors are enabled by default in StackLight:
Prometheus Relay is set up as an endpoint in the Prometheus
datasource in Grafana. Therefore, all requests from Grafana are sent to
Prometheus through Prometheus Relay. If Prometheus Relay reports request
timeouts or exceeds the response size limits, you can configure the
parameters below. In this case, Prometheus Relay resource limits may also
require tuning.
Key
Description
Example values
prometheusRelay.clientTimeout (string)
Specifies the client timeout in seconds. If empty, defaults to a value
determined by the cluster size: 10 for small, 30 for medium,
60 for large.
Note
The cluster size parameters are available since Container
Cloud 2.24.0 (Cluster release 14.0.0).
10
prometheusRelay.responseLimitBytes (string)
Specifies the response size limit in bytes. If empty, defaults to a
value determined by the cluster size: 6291456 for small,
18874368 for medium, 37748736 for large.
Note
The cluster size parameters are available since Container
Cloud 2.24.0 (Cluster release 14.0.0).
Skip this step if your remote server does not have authorization.
Defines additional mounts for remoteWrites secrets. Secret objects
with credentials needed to access the remote endpoint must be
precreated in the stacklight namespace. For details, see
Kubernetes Secrets.
Note
To create more than one file for the same remote write
endpoint, for example, to configure TLS connections,
use a single secret object with multiple keys in the data field.
Using the following example configuration, two files will be created,
cert_file and key_file:
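A sketch of such a configuration, assuming each remoteWriteSecretMounts entry takes a
secretName and a mountPath; the secret name, mount path, and certificate data are
placeholders:
prometheusServer:
  remoteWriteSecretMounts:
    - secretName: remote-write-tls
      mountPath: /etc/config/remote-write
apiVersion: v1
kind: Secret
metadata:
  name: remote-write-tls
  namespace: stacklight
data:
  cert_file: <base64-encoded certificate>
  key_file: <base64-encoded private key>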
Defines the configuration of a custom remote_write
endpoint for sending Prometheus samples.
Note
If the remote server uses authorization, first create
secret(s) in the stacklight namespace and mount them to
Prometheus through prometheusServer.remoteWriteSecretMounts. Then
define the created secret in the authorization field.
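For illustration, a possible remote_write definition using standard Prometheus
remote_write fields and the files mounted in the previous example; the endpoint URL
is hypothetical:
prometheusServer:
  remoteWrites:
    - url: https://metrics-receiver.example.com/api/v1/write
      tls_config:
        cert_file: /etc/config/remote-write/cert_file
        key_file: /etc/config/remote-write/key_file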
Provides the capability to override the default resource requests or
limits for any StackLight component for the predefined cluster sizes.
Caution
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0), resourcesPerClusterSize is deprecated and is overridden
by the resources parameter. Therefore, use the resources
parameter instead.
StackLight components for resource limits customization
Note
The below list has the
componentName:<podNamePrefix>/<containerName> format.
alerta: alerta/alerta
alertmanager: prometheus-alertmanager/prometheus-alertmanager
alertmanagerWebhookServicenow: alertmanager-webhook-servicenow/alertmanager-webhook-servicenow
blackboxExporter: prometheus-blackbox-exporter/blackbox-exporter
elasticsearch: opensearch-master/opensearch  # Deprecated
elasticsearchCurator: elasticsearch-curator/elasticsearch-curator
elasticsearchExporter: elasticsearch-exporter/elasticsearch-exporter
fluentdElasticsearch: fluentd-logs/fluentd-logs  # Deprecated
fluentdLogs: fluentd-logs/fluentd-logs
fluentdNotifications: fluentd-notifications/fluentd
grafana: grafana/grafana
grafanaRenderer: grafana/grafana-renderer  # Removed in MCC 2.27.0 (17.2.0 and 16.2.0)
iamProxy: iam-proxy/iam-proxy  # Deprecated
iamProxyAlerta: iam-proxy-alerta/iam-proxy
iamProxyAlertmanager: iam-proxy-alertmanager/iam-proxy
iamProxyGrafana: iam-proxy-grafana/iam-proxy
iamProxyKibana: iam-proxy-kibana/iam-proxy  # Deprecated
iamProxyOpenSearchDashboards: iam-proxy-kibana/iam-proxy
iamProxyPrometheus: iam-proxy-prometheus/iam-proxy
kibana: opensearch-dashboards/opensearch-dashboards  # Deprecated
kubeStateMetrics: prometheus-kube-state-metrics/prometheus-kube-state-metrics
libvirtExporter: prometheus-libvirt-exporter/prometheus-libvirt-exporter
metricCollector: metric-collector/metric-collector
metricbeat: metricbeat/metricbeat
nodeExporter: prometheus-node-exporter/prometheus-node-exporter
opensearch: opensearch-master/opensearch
opensearchDashboards: opensearch-dashboards/opensearch-dashboards
patroniExporter: patroni/patroni-patroni-exporter
pgsqlExporter: patroni/patroni-pgsql-exporter
postgresql: patroni/patroni
prometheusEsExporter: prometheus-es-exporter/prometheus-es-exporter
prometheusMsTeams: prometheus-msteams/prometheus-msteams
prometheusRelay: prometheus-relay/prometheus-relay
prometheusServer: prometheus-server/prometheus-server
sfNotifier: sf-notifier/sf-notifier
sfReporter: sf-reporter/sf-reporter
stacklightHelmControllerController: stacklight-helm-controller/controller
telegrafDockerSwarm: telegraf-docker-swarm/telegraf-docker-swarm
telegrafDs: telegraf-ds-smart/telegraf-ds-smart  # Deprecated
telegrafDsSmart: telegraf-ds-smart/telegraf-ds-smart
telegrafOpenstack: telegraf-openstack/telegraf-openstack  # replaced with osdpl-exporter in 24.1
telegrafS: telegraf-docker-swarm/telegraf-docker-swarm  # Deprecated
telemeterClient: telemeter-client/telemeter-client
telemeterServer: telemeter-server/telemeter-server
telemeterServerAuthServer: telemeter-server/telemeter-server-authorization-server
tfControllerExporter: prometheus-tf-controller-exporter/prometheus-tungstenfabric-exporter
tfVrouterExporter: prometheus-tf-vrouter-exporter/prometheus-tungstenfabric-exporter
Provides the capability to override the containers resource requests or
limits for any StackLight component.
StackLight components for resource limits customization
Note
The below list has the
componentName:<podNamePrefix>/<containerName> format.
alerta: alerta/alerta
alertmanager: prometheus-alertmanager/prometheus-alertmanager
alertmanagerWebhookServicenow: alertmanager-webhook-servicenow/alertmanager-webhook-servicenow
blackboxExporter: prometheus-blackbox-exporter/blackbox-exporter
elasticsearch: opensearch-master/opensearch  # Deprecated
elasticsearchCurator: elasticsearch-curator/elasticsearch-curator
elasticsearchExporter: elasticsearch-exporter/elasticsearch-exporter
fluentdElasticsearch: fluentd-logs/fluentd-logs  # Deprecated
fluentdLogs: fluentd-logs/fluentd-logs
fluentdNotifications: fluentd-notifications/fluentd
grafana: grafana/grafana
grafanaRenderer: grafana/grafana-renderer  # Removed in MCC 2.27.0 (17.2.0 and 16.2.0)
iamProxy: iam-proxy/iam-proxy  # Deprecated
iamProxyAlerta: iam-proxy-alerta/iam-proxy
iamProxyAlertmanager: iam-proxy-alertmanager/iam-proxy
iamProxyGrafana: iam-proxy-grafana/iam-proxy
iamProxyKibana: iam-proxy-kibana/iam-proxy  # Deprecated
iamProxyOpenSearchDashboards: iam-proxy-kibana/iam-proxy
iamProxyPrometheus: iam-proxy-prometheus/iam-proxy
kibana: opensearch-dashboards/opensearch-dashboards  # Deprecated
kubeStateMetrics: prometheus-kube-state-metrics/prometheus-kube-state-metrics
libvirtExporter: prometheus-libvirt-exporter/prometheus-libvirt-exporter
metricCollector: metric-collector/metric-collector
metricbeat: metricbeat/metricbeat
nodeExporter: prometheus-node-exporter/prometheus-node-exporter
opensearch: opensearch-master/opensearch
opensearchDashboards: opensearch-dashboards/opensearch-dashboards
patroniExporter: patroni/patroni-patroni-exporter
pgsqlExporter: patroni/patroni-pgsql-exporter
postgresql: patroni/patroni
prometheusEsExporter: prometheus-es-exporter/prometheus-es-exporter
prometheusMsTeams: prometheus-msteams/prometheus-msteams
prometheusRelay: prometheus-relay/prometheus-relay
prometheusServer: prometheus-server/prometheus-server
sfNotifier: sf-notifier/sf-notifier
sfReporter: sf-reporter/sf-reporter
stacklightHelmControllerController: stacklight-helm-controller/controller
telegrafDockerSwarm: telegraf-docker-swarm/telegraf-docker-swarm
telegrafDs: telegraf-ds-smart/telegraf-ds-smart  # Deprecated
telegrafDsSmart: telegraf-ds-smart/telegraf-ds-smart
telegrafOpenstack: telegraf-openstack/telegraf-openstack  # replaced with osdpl-exporter in 24.1
telegrafS: telegraf-docker-swarm/telegraf-docker-swarm  # Deprecated
telemeterClient: telemeter-client/telemeter-client
telemeterServer: telemeter-server/telemeter-server
telemeterServerAuthServer: telemeter-server/telemeter-server-authorization-server
tfControllerExporter: prometheus-tf-controller-exporter/prometheus-tungstenfabric-exporter
tfVrouterExporter: prometheus-tf-vrouter-exporter/prometheus-tungstenfabric-exporter
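A sketch of such an override for alerta, using the componentName keys from the list above
and standard Kubernetes requests/limits notation (the values are discussed below):
resources:
  alerta:
    requests:
      cpu: "50m"
      memory: "200Mi"
    limits:
      memory: "500Mi"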
Using the example above, each pod in the alerta service will be
requesting 50 millicores of CPU and 200 MiB of memory, while being
hard-limited to 500 MiB of memory usage. Each configuration key is
optional.
Note
The logging mechanism performance depends on the cluster log
load. If the cluster components send an excessive amount of logs, the
default resource requests and limits for fluentdLogs (or
fluentdElasticsearch) may be insufficient, which may cause its
pods to be OOMKilled and trigger the KubePodCrashLooping alert.
In such a case, increase the default resource requests and limits for
fluentdLogs. For example:
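A sketch with illustrative values only; size the requests and limits according to the
actual log load of your cluster:
resources:
  fluentdLogs:
    requests:
      cpu: "250m"
      memory: "1Gi"
    limits:
      memory: "2Gi"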
On managed clusters with limited Internet access, a proxy is required for the
StackLight components that use HTTP or HTTPS, are disabled by default, and need
external access when enabled. The Salesforce reporter depends on Internet access
through HTTPS.
Key
Description
Example values
clusterId (string)
Unique cluster identifier
clusterId="<ClusterProject>/<ClusterName>/<UID>",
generated for each cluster using Cluster Project,
Cluster Name, and cluster UID, separated by a slash. Used
for both sf-reporter and sf-notifier services.
The clusterId key is automatically defined for each cluster.
Do not set or modify it manually.
In an HA StackLight setup, when highAvailabilityEnabled is set to true,
all StackLight Persistent Volumes (PVs) use the Local Volume Provisioner (LVP)
storage class so that they do not rely on dynamic provisioners such as Ceph, which
are not available in every deployment. In a non-HA StackLight setup, when no storage
class is specified, PVs use the default storage class of the cluster.
Key
Description
Example values
storage.defaultStorageClass (string)
Defines the StorageClass to use for all StackLight Persistent Volume
Claims (PVCs) if a component StorageClass is not defined using the
componentStorageClasses. To use the default storage class,
leave the string empty.
lvp, standard
storage.componentStorageClasses (map)
Defines (overrides the defaultStorageClass value) the storage class
for any StackLight component separately. To use the default storage class,
leave the string empty.
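For example, a sketch that overrides the storage class for two components; the component
keys and class names are illustrative and must match the components and storage classes
available in your cluster:
storage:
  defaultStorageClass: ""
  componentStorageClasses:
    elasticsearch: lvp
    prometheusServer: standard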
Verify StackLight configuration of an OpenStack cluster
Key
Verification procedure
externalFQDNs.enabled
openstack.insecure
In the Prometheus web UI, navigate to Status > Targets.
Verify that the blackbox-external-endpoint target contains the
configured domains (URLs).
openstack.enabled
openstack.namespace
In the Grafana web UI, verify that the OpenStack dashboards are
present and not empty.
In the Prometheus web UI, click Alerts and verify that
the OpenStack alerts are present in the list of alerts.
openstack.gnocchi.enabled
In the Grafana web UI, verify that the Gnocchi dashboard
is present and not empty. Alternatively, verify that the Gnocchi
dashboard ConfigMap is present:
Verify that authentication to ServiceNow was successful. The output
should include ServiceNow authentication successful. In case of an
authentication failure, the ServiceNowAuthFailure alert is raised.
In your ServiceNow instance, verify that the Watchdog
alert appears in the Incident table. Once the incident is
created, the pod logs should include a line similar to
Created Incident: bef260671bdb2010d7b540c6cc4bcbed.
In case of any failure:
Verify that your ServiceNow instance is not in hibernation.
Verify that the service user credentials, table name, and
alert_id_field are correct.
Verify that the ServiceNow user has access to the table with
permission to read, create, and update records.
alertmanagerSimpleConfig.slack.enabled
alertmanagerSimpleConfig.slack.api_url
alertmanagerSimpleConfig.slack.channel
alertmanagerSimpleConfig.slack.route
In the Alertmanager web UI, navigate to Status and verify
that the Config section contains the HTTP-slack receiver
and route.
blackboxExporter.customModules
Verify that your module is present in the list of modules. It can
take up to 10 minutes for the module to appear in the ConfigMap.
Review the configmap-reload container logs to verify that the
reload happened successfully. It can take up to 1 minute for the reload
to happen after the module appears in the ConfigMap.
For example, for blackboxExporter.timeoutOffset set to 0.1, the
output should include
["--config.file=/config/blackbox.yaml","--timeout-offset=0.1"].
It can take up to 10 minutes for the parameter to be populated.
ceph.enabled
In the Grafana web UI, verify that Ceph dashboards are present in the
list of dashboards and are populated with data.
In the Prometheus web UI, click Alerts and verify that
the list of alerts contains Ceph* alerts.
clusterSize
resourcesPerClusterSize (deprecated)
resources
Obtain the list of pods:
kubectl get po -n stacklight
Verify that the desired resource limits or requests are set in the
resources section of every container in the pod:
kubectl get po <pod_name> -n stacklight -o yaml
elasticsearch.logstashRetentionTime
Removed in MCC 2.26.0 (17.1.0, 16.1.0)
Verify that the unit_count parameter contains the desired number of
days:
Verify that OpenSearch, Fluentd, and OpenSearch Dashboards are present
in the list of StackLight resources. An empty output indicates that the
StackLight logging stack is disabled.
[...] 2023-07-25 09:39:33 +0000 [error]: config error file="/etc/fluentd/fluent.conf" error_class=Fluent::ConfigError error="host or host_with_port is required"
In the Prometheus web UI, navigate to
Status > Configuration.
Verify that the following fields in the metric_relabel_configs
section for the kubernetes-nodes-cadvisor and
prometheus-kube-state-metrics scrape jobs have the required
configuration:
action is set to keep or drop
regex contains a regular expression with configured namespaces
delimited by |
source_labels is set to [namespace]
mke.dockerdDataRoot
In the Prometheus web UI, navigate to Alerts and verify that
the MKEAPIDown is not false-positively firing due to the
certificate absence.
mke.enabled
In the Grafana web UI, verify that the MKE Cluster and
MKE Containers dashboards are present and not empty.
In the Prometheus web UI, navigate to Alerts and verify
that the MKE* alerts are present in the list of alerts.
nodeExporter.extraCollectorsEnabled
In the Prometheus web UI, run the following PromQL queries. The result
should not be empty.
In the Prometheus web UI, navigate to Status > Configuration.
Verify that the alerting.alert_relabel_configs section contains
the customization for common labels that you added in
prometheusServer.alertsCommonLabels during StackLight configuration.
prometheusServer.customAlerts
In the Prometheus web UI, navigate to Alerts and verify that
the list of alerts has changed according to your customization.
prometheusServer.customRecordingRules
In the Prometheus web UI, navigate to Status > Rules.
Verify that the list of Prometheus recording rules has changed
according to your customization.
prometheusServer.customScrapeConfigs
In the Prometheus web UI, navigate to Status > Targets.
Verify that the required target has appeared in the list of targets.
It may take up to 10 minutes for the change to apply.
prometheusServer.persistentVolumeClaimSize
Verify that the PVC(s) capacity equals or is higher (in case of
statically provisioned volumes) than specified:
After cron job execution (by default, at midnight server time),
obtain the Salesforce reporter pod name. The output should include
the Salesforce reporter pod name and STATUS must be
Completed.
kubectl get pods -n stacklight
Verify that Salesforce reporter successfully authenticates to
Salesforce and creates records. The output must include the
Salesforce authentication successful, Created record or
Duplicate record and Updated record lines.
kubectl logs -n stacklight <sf-reporter-pod-name>
sslCertificateMonitoring.domains
sslCertificateMonitoring.enabled
In the Prometheus web UI, navigate to Status -> Targets.
Verify that the blackbox target contains the configured domains
(URLs).
storage.componentStorageClasses
storage.defaultStorageClass
Verify that the appropriate components PVCs have been created according
to the configured StorageClass:
The following hardware recommendations and software settings apply for better
OpenSearch performance in a MOSK cluster.
To tune OpenSearch performance:
Depending on your cluster size, set the required disk and CPU size along
with memory limit and heap size.
Heap size is calculated in StackLight as ⅘ of the specified memory limit.
If the calculated heap size slightly exceeds the 32 GB threshold, memory is
wasted significantly due to the loss of Ordinary Object Pointers (OOPS)
compression, which allows storing 64-bit pointers in 32 bits.
Since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0),
to prevent this behavior, for the memory limit in the 31-50 GB range, the
heap size is set to fixed 31 GB using the enforceOopsCompression
parameter, which is enabled by default. For details, see
Logging: Enforce OOPS compression. Exceeding the range causes loss of benefit
of OOPS compression, so the ⅘ formula applies again.
OpenSearch is write-heavy, so SSD is preferable as a disk type.
Increase the vm.max_map_count kernel parameter, which limits the number of
memory map areas available to a process. Extended retention periods, which
depend on open shards, require increasing this value significantly, for
example, to 262144.
Configure swappiness because swapping significantly degrades performance. Lower
swappiness to 1, or to 0 to disable swap. For details, refer to the
Create MOSK host profiles procedure.
Example configuration:
kernelParameters:
  sysctl:
    vm.swappiness: "<value>"
Configure the kernel I/O scheduler to improve timing of disk writing
operations. Change it to one of the following options:
none - applies the FIFO queue.
mq-deadline - applies three queues: FIFO read, FIFO write, and
sorted.
Changing I/O scheduling is also possible through BareMetalHostProfile.
However, the specific implementation highly depends on the disk type used:
This section describes how to export logs from the OpenSearch Dashboards
navigation panel to the CSV format.
Caution
The log limit is set to 10 000 rows and does not take into
account the resulting file size.
Note
The following instruction describes how to export all logs from the
opensearch-master-0 node of an OpenSearch cluster.
To export logs from the OpenSearch Dashboards navigation panel to CSV:
Log in to the OpenSearch Dashboards web UI as described in
Getting access.
Navigate to the Discover page.
In the left navigation panel, select the required log index pattern from
the top drop-down menu. For example, system* for system logs
and audit* for audit logs.
In the middle top menu, click Add filter and add the required
filters. For example:
event.provider matches the opensearch-master logger
orchestrator.pod matches the opensearch-master-0 node
name
In Search field names, search for required fields to be present
in the resulting CSV file. For example:
orchestrator.pod for opensearch-master-0
message for the log message
In the right top menu:
Click Save to save the filter after naming it.
Click Reporting > Generate CSV.
When the report generation completes, download the file depending on your
browser settings.
OpenSearch Dashboards is part of the StackLight logging stack. Using the
OpenSearch Dashboards web UI, you can view the visual representation of your
OpenStack deployment notifications, logs, Kubernetes events, and other cluster
notifications related to your deployment.
Log in to the OpenSearch Dashboards web UI as described in
Getting access.
Click the required dashboard to inspect the visualizations or perform a
search:
Dashboard
Description
Notifications
Provides visualizations on the number of notifications over time per
source and severity, host, and breakdowns. The dashboard includes search.
K8s events
Provides visualizations on the number of Kubernetes events per type,
and top event-producing resources and namespaces by reason and event
type. Includes search.
System Logs
Available for clusters created since Container Cloud 2.26.0
(Cluster releases 17.1.x, 16.1.x, or later).
Provides visualizations on the number of log messages per severity,
source, and top log-producing host, namespaces, containers, and
applications. Includes search.
Caution
Due to a known issue, this dashboard does not exist in
Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
The issue is addressed in Container Cloud 2.26.1 (Cluster releases
17.1.1 and 16.1.1). To work around the issue in 2.26.0, you can map
the fields of the logstash index to the system one and view
logs in the deprecated Logs dashboard.
For mapping details, see System index fields mapped to Logstash index fields.
Logs (deprecated in 2.26.0, 17.1.0 and 16.1.0)
Available only for clusters created before Container Cloud 2.26.0
(Cluster releases 17.0.x, 16.0.x, or earlier).
Analogous to System Logs but contains logs generated only
for the mentioned Cluster releases.
OpenSearch Dashboards provide the following search tools:
Filters
Queries
Full-text search
Filters enable you to organize the output information using the interface
tools. You can search for information by a set of indexed fields using
a variety of logical operators.
Queries enable you to construct search commands using OpenSearch query
domain-specific language (DSL) expressions. These expressions allow you to
search by the fields not included in the index.
In addition to filters and queries, you can use the Search input
field for full-text search.
In the dialog that opens, select the field of search in the
Field drop-down menu.
Select the logical operator in the Operator drop-down menu.
Type or select the filter value from the Value drop-down menu.
Create a filter using the ‘flat object’ field type
Available since MCC 2.23.0 (12.7.0 and 11.7.0)
For the orchestrator.labels field of the system and audit log indices,
you can use the flat_object field type to apply the filtering using
value or valueAndPath. For example:
Using value: to obtain all logs produced by iam-proxy, add
the following filters:
orchestrator.type that matches kubernetes
orchestrator.labels._value that matches iam-proxy
Using valueAndPath: to obtain all logs produced by the OpenSearch
cluster, add the following filters:
orchestrator.type that matches kubernetes
orchestrator.labels._valueAndPath that matches
orchestrator.labels.app=opensearch-master
Using the Grafana web UI, you can view the visual representation of the metric
graphs based on the time series databases.
Most Grafana dashboards include a
View logs in OpenSearch Dashboards link to immediately view
relevant logs in the OpenSearch Dashboards web UI. The OpenSearch Dashboards
web UI displays logs filtered using the Grafana dashboard variables, such as
the drop-downs. Once you amend the variables, wait for Grafana to generate a
new URL.
Note
Due to the known issue, the
View logs in OpenSearch Dashboards link does not work in
Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). The issue is
addressed in Container Cloud 2.26.1 (Cluster releases 17.1.1 and 16.1.1).
Caution
The Grafana dashboards that contain drop-down lists are limited
to 1000 lines. Therefore, if you require data on a specific item, use the
filter by name instead.
Note
Grafana dashboards that present node data have an additional
Node identifier drop-down menu. By default, it is set to
machine to display short names for Kubernetes nodes. To display
Kubernetes node name labels, change this option to node.
To view the Grafana dashboards:
Log in to the Grafana web UI as described in Getting access.
From the drop-down list, select the required dashboard to inspect
the status and statistics of the corresponding service in your
management or MOSK cluster:
Component
Dashboard
Description
Ceph cluster
Ceph Cluster
Provides the overall health status of the Ceph cluster, capacity,
latency, and recovery metrics.
Ceph Nodes
Provides an overview of the host-related metrics, such as the number
of Ceph Monitors, Ceph OSD hosts, average usage of resources across
the cluster, network and hosts load.
This dashboard is deprecated since Container Cloud 2.25.0 (Cluster
releases 17.0.0 and 16.0.0) and is removed in Container Cloud
2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, Mirantis recommends switching to the following dashboards
in the current release:
For Ceph stats, use the Ceph Cluster dashboard.
For resource utilization, use the System dashboard,
which includes filtering by Ceph node labels, such as
ceph_role_osd, ceph_role_mon, and ceph_role_mgr.
Ceph OSDs
Provides metrics for Ceph OSDs, including the Ceph OSD read and write
latencies, distribution of PGs per Ceph OSD, Ceph OSDs and physical
device performance.
Ceph Pools
Provides metrics for Ceph pools, including the client IOPS and
throughput by pool and pools capacity usage.
Ironic
Ironic BM
Provides graphs on Ironic health, HTTP API availability, provisioned
nodes by state and installed ironic-conductor backend drivers.
Container Cloud
Clusters Overview
Represents the main cluster capacity statistics for all clusters
of a Container Cloud deployment where StackLight is installed.
Note
Due to the known issue, the
Prometheus Targets Unavailable panel of the
Clusters Overview dashboard does not display data for
managed clusters of the 11.7.0, 11.7.4, 12.5.0, and 12.7.x series
Cluster releases after update to Container Cloud 2.24.0.
Etcd
Available since Container Cloud 2.21.0 (Cluster release 11.5.0). Provides
graphs on database size, leader elections, requests duration, incoming and
outgoing traffic.
MCC Applications Performance
Available since Container Cloud 2.23.0 (Cluster release 11.7.0). Provides
information on the Container Cloud internals work based on Golang,
controller runtime, and custom metrics. You can use it to verify
performance of applications and for troubleshooting purposes.
Kubernetes resources
Kubernetes Calico
Provides metrics of the entire Calico cluster usage, including the
cluster status, host status, and Felix resources.
Kubernetes Cluster
Provides metrics for the entire Kubernetes cluster, including the
cluster status, host status, and resources consumption.
Kubernetes Containers
Provides charts showing resource consumption per deployed Pod
containers running on Kubernetes nodes.
Kubernetes Deployments
Provides information on the desired and current state of all
service replicas deployed on a Container Cloud cluster.
Kubernetes Namespaces
Provides the Pods state summary and the CPU, MEM, network, and IOPS
resources consumption per namespace.
Kubernetes Nodes
Provides charts showing resources consumption per Container Cloud
cluster node.
Kubernetes Pods
Provides charts showing resources consumption per deployed Pod.
NGINX
NGINX
Provides the overall status of the NGINX cluster and information
about NGINX requests and connections.
OpenStack
OpenStack - Overview
Provides general information on OpenStack services resources
consumption, API errors, deployed OpenStack compute nodes and block
storage usage.
OpenStack Ingress controller
Available since MOSK 23.3. Monitors the number
of requests, response times and statuses, as well as the number of
Ingress SSL certificates including expiration time and resources usage.
OpenStack Instances Availability
Available since MOSK 23.2.
Provides information about the availability of instance floating IPs
per OpenStack compute node and project. Also, enables monitoring
of probe statistics for individual instance floating IPs.
OpenStack Network IP Capacity
Available since MOSK 25.1.
Provides information about the statistics of IP address allocation for
external networks and subnets on non-Tungsten Fabric based
MOSK clusters. For configuration details, see
Start monitoring IP address capacity.
OpenStack PortProber
Available since MOSK 24.2.
Provides information about the availability of Neutron ports
per OpenStack compute node, project, and port owner.
OpenStack PortProber [Deprecated]
Available since MOSK 25.1.
Provides information about the availability of Neutron ports
per OpenStack compute node, project, and port owner. Deprecated in
favor of the OpenStack PortProber dashboard.
Use this deprecated dashboard only to access old data collected before
MOSK 25.1.
OpenStack PowerDNS
Available since MOSK 24.3.
Provides various statistics about OpenStack PowerDNS servers, such as
connections, resources, queries, rings, and errors.
OpenStack Usage Efficiency
Available since MOSK 23.3.
Provides information about requested (allocated) CPU and memory
usage efficiency on a per-project and per-flavor basis. Aims to
identify flavors that specific projects are not effectively using,
with allocations significantly exceeding actual usage. Also,
evaluates per-instance underuse for specific projects.
KPI - Provisioning
Provides provisioning statistics for OpenStack compute instances,
including graphs on VM creation results by day.
Cinder
Provides graphs on the OpenStack Block Storage service health, HTTP
API availability, pool capacity and utilization, number of created
volumes and snapshots.
Glance
Provides graphs on the OpenStack Image service health, HTTP API
availability, number of created images and snapshots.
Gnocchi
Provides panels and graphs on the Gnocchi health and HTTP API
availability.
Heat
Provides graphs on the OpenStack Orchestration service health, HTTP
API availability and usage.
Ironic OpenStack
Provides graphs on the OpenStack Bare Metal Provisioning service
health, HTTP API availability, provisioned nodes by state and
installed ironic-conductor backend drivers.
Keystone
Provides graphs on the OpenStack Identity service health, HTTP API
availability, number of tenants and users by state.
Neutron
Provides graphs on the OpenStack networking service health, HTTP API
availability, agents status and usage of Neutron L2 and L3 resources.
NGINX Ingress controller
Not recommended. Deprecated since MOSK 23.3 and
is removed in MOSK 24.1. Use
OpenStack Ingress controller instead.
Monitors the number of requests, response times and statuses, as
well as the number of Ingress SSL certificates including expiration
time and resources usage.
Nova - Availability Zones
Provides detailed graphs on the OpenStack availability zones and
hypervisor usage.
Nova - Hypervisor Overview
Provides a set of single-stat panels presenting resources usage by
host.
Nova - Instances
Provides graphs on libvirt Prometheus exporter health and resources
usage. Monitors the number of running instances and tasks and allows
sorting the metrics by top instances.
Nova - Overview
Provides graphs on the OpenStack compute services
(nova-scheduler, nova-conductor, and nova-compute)
health, as well as HTTP API availability.
Nova - Tenants
Provides graphs on CPU, RAM, disk throughput, IOPS, and space usage
and allocation and allows sorting the metrics by top tenants.
Nova - Users
Provides graphs on CPU, RAM, disk throughput, IOPS, and space usage
and allocation and allows sorting the metrics by top users.
Nova - Utilization
Provides detailed graphs on Nova hypervisor resources capacity and
consumption.
Memcached
Memcached Prometheus exporter dashboard. Monitors Kubernetes
Memcached pods and displays memory usage, hit rate, evicts and
reclaims rate, items in cache, network statistics, and commands rate.
MySQL
MySQL Prometheus exporter dashboard. Monitors Kubernetes MySQL pods,
resources usage and provides details on current connections and
database performance.
RabbitMQ [Deprecated]
Not recommended. Deprecated since MOSK 25.1.
RabbitMQ Prometheus exporter dashboard. Monitors Kubernetes RabbitMQ
pods, resources usage and provides details on cluster utilization and
performance.
Caution
This dashboard is renamed from RabbitMQ to
RabbitMQ [Deprecated] in MOSK 25.1
and will be removed in one of the following releases in favor
of the RabbitMQ Overview and RabbitMQ Erlang
dashboards.
RabbitMQ Erlang
Available since MOSK 25.1. Monitors RabbitMQ BEAM
performance, memory details, load and distribution metrics using native
Prometheus plugin metrics.
RabbitMQ Overview
Available since MOSK 25.1. Monitors RabbitMQ node
performance, resource usage, message queue, channel, and connection
statistics using native Prometheus plugin metrics.
Cassandra
Provides graphs on Cassandra clusters’ health, ongoing operations,
and resource consumption.
Kafka
Provides graphs on Kafka clusters’ and broker health, as well as
broker and topic usage.
Redis
Provides graphs on Redis clusters’ and pods’ health, connections,
command calls, and resource consumption.
Tungsten Fabric
Tungsten Fabric Controller
Provides graphs on the overall Tungsten Fabric Controller cluster
processes and usage.
Tungsten Fabric vRouter
Provides graphs on the overall Tungsten Fabric vRouter cluster
processes and usage.
ZooKeeper
Provides graphs on ZooKeeper clusters’ quorum health and resource
consumption.
StackLight
Alertmanager
Provides performance metrics on the overall health status of the
Prometheus Alertmanager service, the number of firing and resolved
alerts received for various periods, the rate of successful and
failed notifications, and the resources consumption.
OpenSearch
Provides information about the overall health status of the
OpenSearch cluster, including the resources consumption,
number of operations and their performance.
OpenSearch Indices
Provides detailed information about the state of indices,
including their size, the number and the size of segments.
Grafana
Provides performance metrics for the Grafana service, including the
total number of Grafana entities, CPU and memory consumption.
PostgreSQL
Provides PostgreSQL statistics, including read (DQL) and write (DML)
row operations, transaction and lock, replication lag and conflict,
and checkpoint statistics, as well as PostgreSQL performance metrics.
Prometheus
Provides the availability and performance behavior of the Prometheus
servers, the sample ingestion rate, and system usage statistics per
server. Also, provides statistics about the overall status and uptime
of the Prometheus service, the chunks number of the local storage
memory, target scrapes, and queries duration.
Prometheus Relay
Provides service status and resources consumption metrics.
Telemeter Server
Provides statistics and the overall health status of the Telemeter
service.
Note
Due to the known issue, the
Telemeter Client Status panel of the
Telemeter Server dashboard does not display data for
managed clusters of the 11.7.0, 11.7.4, 12.5.0, and 12.7.x series
Cluster releases after update to Container Cloud 2.24.0.
System
System
Provides a detailed resource consumption and operating system
information per Container Cloud cluster node.
Mirantis Kubernetes Engine (MKE)
MKE Cluster
Provides a global overview of an MKE cluster: statistics about
the number of the worker and manager nodes, containers, images,
Swarm services.
MKE Containers
Provides per container resources consumption metrics
for the MKE containers such as CPU, RAM, network.
Export data from Table panels of Grafana dashboards to CSV
This section describes how to export data from Table panels of
Grafana dashboards to .csv files.
Note
Grafana performs data exports for individual panels on a dashboard,
not the entire dashboard.
To export data from Table panels of Grafana dashboards to CSV:
Log in to the Grafana web UI as described in Getting access.
In the right top corner of the required Table panel, click
the kebab menu icon and select Inspect > Data.
In Data options of the Data tab, configure export
options:
This section provides an overview of the available predefined StackLight
alerts, including OpenStack, Tungsten Fabric, Container Cloud, Ceph,
StackLight, MKE, and other alerts that can contain information about both
OpenStack and MOSK clusters.
To view the alerts, use the Prometheus web UI. To view the firing alerts, use
Alertmanager or Alerta web UI.
The {{ $labels.namespace }}/{{ $labels.pod }}
MariaDB node in the {{ $labels.cluster }} cluster has
high table lock waits of {{ $value }} percentage
(more than 30).
An average of {{ $value }} evictions occurred in the Memcached
database cluster {{ $labels.cluster }} in the
{{ $labels.namespace }} namespace during the last minute.
This section describes the alerts for the OpenStack SSL certificates.
By default, these alerts are disabled. To enable them, set
openstack.externalFQDNs.enabled to true. For details, see
Configuration options for SSL certificates.
SSL certificate for an OpenStack service expires on {{ $value | humanizeTimestamp }}
Description
The SSL certificate for the OpenStack {{ $labels.namespace }}/{{ $labels.service_name }}
service endpoints expires on {{ $value | humanizeTimestamp }},
less than 10 days are left.
SSL certificate for an OpenStack service expires on {{ $value | humanizeTimestamp }}
Description
The SSL certificate for the OpenStack {{ $labels.namespace }}/{{ $labels.service_name }}
service endpoints expires on {{ $value | humanizeTimestamp }},
less than 30 days are left.
{{ $labels.service_name }} RabbitMQ Exporter Prometheus target is
down.
Description
Prometheus fails to scrape metrics from the
{{ $labels.pod }} Pod of the
{{ $labels.namespace }}/{{ $labels.service_name }}
on the {{ $labels.node }} node.
{{ $labels.pod }} RabbitMQ Prometheus target is down.
Description
Prometheus fails to scrape metrics from the {{ $labels.pod }} Pod of
the {{ $labels.namespace }}/{{ $labels.service_name }} on the
{{ $labels.node }} node.
DNS probe failure for {{ $labels.target_name }} {{ $labels.target_type }}
Description
The DNS probe failed at least 3 times for the DNS {{ $labels.target_type }} {{ $labels.target_name }} using the {{ $labels.protocol }} protocol
in the last 20 minutes.
DNS probe target experienced outage for {{ $labels.target_name }} {{ $labels.target_type }}
Description
Prometheus failed to probe the DNS {{ $labels.target_type }} {{ $labels.target_name }}
3 times using the {{ $labels.protocol }} protocol in the last 20 minutes.
High DNS query duration for {{ $labels.target_name }} {{ $labels.target_type }}
Description
The DNS query duration for the DNS {{ $labels.target_type }} {{ $labels.target_name }} using the {{ $labels.protocol }} protocol
exceeded 3 seconds at least 3 times in the last 20 minutes.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports an increased
number of authentication failures.
The average hit rate for the {{ $labels.cache }} cache in the
{{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in the
{{ $labels.cassandra_cluster }} cluster is below 85%.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports an increased
number of {{ $labels.operation }} operation failures. A failure is a
non-timeout exception.
Cassandra client {{ $labels.operation }} request is unavailable.
Description
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports an increased
number of {{ $labels.operation }} operations ending with
UnavailableException. There are not enough replicas alive to perform
the {{ $labels.operation }} query with the requested consistency
level.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports that
{{ $value }} compaction executor tasks are blocked.
The pending compaction tasks in the
{{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in the
{{ $labels.cassandra_cluster }} cluster reached the threshold of 100
on average as measured over 30 minutes. This may occur due to a too low
cluster I/O capacity.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports an increased
number of connection timeouts between nodes.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports that
{{ $value }} flush writer tasks are blocked.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports an increased
number of hints. Replica nodes are not available to accept mutation due
to a failure or maintenance.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports that
{{ $value }} repair tasks are blocked.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports an increased
number of storage exceptions.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster scanned {{ $value }}
tombstones in 99% of read queries.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster scanned {{ $value }}
tombstones in 99% of read queries.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster scanned {{ $value }}
tombstones in 99% of read queries.
The {{ $labels.namespace }}/{{ $labels.pod }} Cassandra Pod in
the {{ $labels.cassandra_cluster }} cluster reports over 1-second
view/write latency for 99% of requests.
Unclean Kafka broker was elected as cluster leader.
Description
A Kafka broker that has not finished the replication state has been
elected as leader in {{ $labels.cluster }} within the
{{ $labels.namespace }} namespace.
The {{ $labels.cluster }} Redis cluster in the
{{ $labels.namespace }} namespace is not replicating to all
replicas. Consider verifying the Redis replication status.
{{ printf"%0.0f"$value }}% throttling of CPU for container(s) in
Pod(s) of
{{ $labels.created_by_name }} {{ $labels.created_by_kind }} in
the {{ $labels.namespace }} namespace.
This alert ensures that the entire alerting pipeline is functional.
This alert should always be firing in Alertmanager against a receiver.
Some integrations with various notification mechanisms can send a
notification when this alert is not firing. For example, the
DeadMansSnitch integration in PagerDuty.
Patroni cluster member is experiencing data page corruption.
Description
The {{ $labels.namespace }}/{{ $labels.pod }} Patroni Pod in the
{{ $labels.cluster }} cluster fails to calculate the data page
checksum due to a possible hardware fault.
The transactions submitted to the {{ $labels.datname }} database in
the {{ $labels.cluster }} Patroni cluster in the
{{ $labels.namespace }} namespace are experiencing deadlocks.
The query data does not fit into working memory of the
{{ $labels.pod }} Pod in the {{ $labels.cluster }} Patroni
cluster in the {{ $labels.namespace }} namespace.
The {{ $labels.namespace }}/{{ $labels.pod }}
Prometheus Pod has write-ahead log (WAL) corruptions in the time
series database (TSDB) for the last 5 minutes.
The {{ $labels.namespace }}/{{ $labels.pod }}
Prometheus Pod has failed evaluations for recording rules. Verify the
rules state in the Status/Rules section of the Prometheus
web UI.
Available since MCC 2.28.0 (17.3.0 and 16.3.0). TechPreview
This section lists alerts for the host-os-modules-controller service,
including alerts for HostOSConfiguration and HostOSConfigurationModules
custom resources. For details about these resources, refer to Container Cloud
API Reference:
HostOSConfiguration and HostOSConfigurationModules.
Deprecated module {{ $labels.module_name }} version
{{ $labels.module_version }} is used by {{ $value }} HostOSConfiguration object(s). It is deprecated by the module
{{ $labels.deprecated_by_module_name }} version
{{ $labels.deprecated_by_module_version }}.
Pod of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in crash loop.
Description
At least one Pod container of
{{ $labels.created_by_name }} {{ $labels.created_by_kind }} in the
{{ $labels.namespace }} namespace was restarted
more than twice during the last 20 minutes.
Pods of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in non-ready state.
Description
{{ $labels.created_by_name }} {{ $labels.created_by_kind }} in the
{{ $labels.namespace }} namespace has Pods in
non-Ready state for longer than 12 minutes.
{{ $labels.created_by_name }} {{ $labels.created_by_kind }} Pod
restarted regularly.
Description
The Pod of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in the {{ $labels.namespace }}
namespace has a container that was restarted at least once a day during
the last 2 days.
Deployment {{ $labels.deployment }} generation does not match the
metadata.
Description
The {{ $labels.namespace }}/{{ $labels.deployment }} Deployment
generation does not match the metadata, indicating that the Deployment
has failed but has not been rolled back.
StatefulSet {{ $labels.statefulset }} generation does not match the
metadata.
Description
The {{ $labels.namespace }}/{{ $labels.statefulset }}
StatefulSet generation does not match the metadata, indicating that the
StatefulSet has failed but has not been rolled back.
Due to the upstream bug in Kubernetes,
metrics for the KubePersistentVolumeUsageCritical and
KubePersistentVolumeFullInFourDays alerts that are collected for
persistent volumes provisioned by cinder-csi-plugin are not available.
PersistentVolume {{ $labels.persistentvolumeclaim }} is expected to
fill up in 4 days.
Description
The PersistentVolume claimed by {{ $labels.persistentvolumeclaim }}
in the {{ $labels.namespace }} namespace is expected to fill up
within four days. Currently, {{ printf"%0.2f"$value }}% of free
space is available.
The {{ $labels.release_namespace }}/{{ $labels.release_name }}
release of the {{ $labels.namespace }}/{{ $labels.name }}
HelmBundle reconciled by the {{ $labels.controller_namespace }}/
{{ $labels.controller_name }} Controller is not in the deployed
status for the last 15 minutes.
Prometheus fails to scrape metrics from the
{{ $labels.controller_pod }} of the
{{ $labels.controller_namespace }}/{{ $labels.controller_name }}
on the {{ $labels.node }} node.
This section describes the alerts for Mirantis Container Cloud. These alerts
are based on metrics from the Mirantis Container Cloud Metric Exporter
(MCC Exporter) service.
Available since MCC 2.28.0 (17.3.0 and 16.3.0). TechPreview
Note
Before Container Cloud 2.29.2 (Cluster releases 17.3.7, 16.4.2, and
16.3.7), this alert was named ClusterUpdateStepInProggress, which
contained a typo.
Severity
Informational
Summary
Step {{ $labels.step_id }} of the Container Cloud cluster update
is in progress
Description
Step {{ $labels.step_id }} of the {{ $labels.cluster_namespace }}/{{ $labels.cluster_name }}
({{ $labels.cluster_uid }}) cluster update to {{ $labels.target }}
is in progress.
The Container Cloud update from {{ $labels.active_kaasrelease_version }}
to {{ $labels.pending_kaasrelease_version }} is available but blocked.
For details, see Troubleshoot Mirantis Container Cloud Exporter alerts.
The Container Cloud update from {{ $labels.active_kaasrelease_version }} to
{{ $labels.pending_kaasrelease_version }} is available and scheduled for
{{ $value|humanizeTimestamp }}. For details, see Schedule Mirantis Container Cloud updates.
SSL certificate for a Mirantis Container Cloud service expires on
{{ $value | humanizeTimestamp }}.
Description
The SSL certificate for the Mirantis Container Cloud
{{ $labels.namespace }}/{{ $labels.service_name }} service endpoints
expires on {{ $value | humanizeTimestamp }}, less than 10 days are left.
SSL certificate for a Mirantis Container Cloud service expires in less
than 30 days.
Description
The SSL certificate for the Mirantis Container Cloud
{{ $labels.namespace }}/{{ $labels.service_name }} service endpoints
expires on {{ $value | humanizeTimestamp }}, less than 30 days are left.
The qLen size and NetMsg showed unexpected output
for the last 10 minutes. Verify the NetworkDbStats output
for the qLen size and NetMsg using
journalctl -d docker.
Note
For the DockerNetworkUnhealthy alert, StackLight collects
metrics from logs. Therefore, this alert is available only if logging
is enabled.
The {{ $labels.node }} node is down. During the last
2 minutes Kubernetes treated the node as NotReady or Unknown
and kubelet was not accessible from Prometheus.
{{ $value|printf"%.2f" }} packets transmitted by the
{{ $labels.device }} interface on the {{ $labels.node }} node
were dropped during the last minute.
Using alert inhibition rules, Alertmanager decreases alert noise by suppressing
dependent alerts notifications to provide a clearer view on the cloud status
and simplify troubleshooting. Alert inhibition rules are enabled by default.
The following tables describe the dependencies between the OpenStack-related
and MOSK cluster alerts.
Once an alert from the Alert column is raised, the alert from the
Inhibits and rules column is suppressed with the
Inhibited status in the Alertmanager web UI.
The Inhibits and rules column lists the labels and conditions, if
any, for the inhibition to apply.
CephOSDDown with the same rook_cluster label
Before MCC 2.25.0 (17.0.0 and 16.0.0)
CephOSDDiskUnavailable
CephOSDDown with the same rook_cluster label
Before MCC 2.25.0 (17.0.0 and 16.0.0)
CephOSDNodeDown Since MCC 2.25.0 (17.0.0 and 16.0.0)
With the same node label:
CephOSDDiskNotResponding
CephOSDDiskUnavailable
CephOSDPgNumTooHighCritical
CephOSDPgNumTooHighWarning
DockerSwarmServiceReplicasFlapping
DockerSwarmServiceReplicasDown with the same service_id,
service_mode, and service_name labels
DockerSwarmServiceReplicasOutage
DockerSwarmServiceReplicasDown with the same service_id,
service_mode, and service_name labels
etcdDbSizeCritical
etcdDbSizeMajor with the same job and instance labels
etcdHighNumberOfFailedGRPCRequestsCritical
etcdHighNumberOfFailedGRPCRequestsWarning with the same
grpc_method, grpc_service, job, and instance labels
ExternalEndpointDown
ExternalEndpointTCPFailure with the same instance and job
labels
FileDescriptorUsageMajor
FileDescriptorUsageWarning with the same node label
FluentdTargetsOutage
FluentdTargetDown
KubeAPICertExpirationHigh
KubeAPICertExpirationMedium
KubeAPIErrorsHighMajor
KubeAPIErrorsHighWarning with the same instance label
KubeAPIOutage
KubeAPIDown
KubeAPIResourceErrorsHighMajor
KubeAPIResourceErrorsHighWarning with the same instance,
resource, and subresource labels
KubeClientCertificateExpirationInOneDay Removed in MCC 2.28.0 (17.3.0 and 16.3.0)
KubeClientCertificateExpirationInSevenDays with the same
instance label
KubeDaemonSetOutage
CalicoTargetsOutage
KubeDaemonSetRolloutStuck with the same daemonset and
namespace labels
FluentdTargetsOutage
NodeExporterTargetsOutage
TelegrafSMARTTargetsOutage
KubeDeploymentOutage
KubeDeploymentReplicasMismatch with the same deployment and
namespace labels
GrafanaTargetDown
KubeDNSTargetsOutage
Removed in MCC 2.25.0 (17.0.0 and 16.0.0)
KubernetesMasterAPITargetsOutage
KubeStateMetricsTargetDown
PrometheusEsExporterTargetDown
PrometheusMsTeamsTargetDown
PrometheusRelayTargetDown
ServiceNowWebhookReceiverTargetDown
SfNotifierTargetDown
TelegrafDockerSwarmTargetDown
TelegrafOpenstackTargetDown
KubeJobFailed
KubePodsNotReady for created_by_kind=Job and with the same
created_by_name label (removed in Container Cloud 2.25.0, Cluster releases 17.0.0 and 16.0.0)
KubeletTargetsOutage
KubeletTargetDown
KubePersistentVolumeUsageCritical
With the same namespace and persistentvolumeclaim labels:
KubePersistentVolumeFullInFourDays
OpenSearchStorageUsageCritical
Since MCC 2.26.0 (17.1.0 and 16.1.0)
OpenSearchStorageUsageMajor
Since MCC 2.26.0 (17.1.0 and 16.1.0)
KubePodsCrashLooping
KubePodsRegularLongTermRestarts with the same created_by_name,
created_by_kind, and namespace labels
KubeStatefulSetOutage
Alerts with the same namespace and statefulset labels:
KubeStatefulSetUpdateNotRolledOut
KubeStatefulSetReplicasMismatch
AlertmanagerTargetDown
Since MCC 2.25.0 (17.0.0 and 16.0.0)
AlertmanagerClusterTargetDown
Before MCC 2.25.0 (17.0.0 and 16.0.0)
ElasticsearchExporterTargetDown
FluentdTargetsOutage
OpenSearchClusterStatusCritical
PostgresqlReplicaDown
PostgresqlTargetDown
Since MCC 2.25.0 (17.0.0 and 16.0.0)
PostgresqlTargetsOutage
Before MCC 2.25.0 (17.0.0 and 16.0.0)
PrometheusEsExporterTargetDown
PrometheusServerTargetDown
Since MCC 2.25.0 (17.0.0 and 16.0.0)
PrometheusServerTargetsOutage
Before MCC 2.25.0 (17.0.0 and 16.0.0)
MCCLicenseExpirationHigh
MCCLicenseExpirationMedium
MCCSSLCertExpirationHigh
MCCSSLCertExpirationMedium with the same namespace and
service_name labels
MCCSSLProbesServiceTargetOutage
MCCSSLProbesEndpointTargetOutage with the same namespace and
service_name labels
MKEAPICertExpirationHigh
MKEAPICertExpirationMedium
MKEAPIOutage
MKEAPIDown
MKEMetricsEngineTargetsOutage
MKEMetricsEngineTargetDown
MKENodeDiskFullCritical
MKENodeDiskFullWarning with the same node label
NodeDown
KubeDaemonSetMisScheduled for the following DaemonSets
(removed in Container Cloud 2.27.0, Cluster releases 17.2.0 and 16.2.0):
cadvisor
csi-cephfsplugin
csi-cinder-nodeplugin
csi-rbdplugin
fluentd-logs
local-volume-provisioner
metallb-speaker
openstack-ccm
prometheus-libvirt-exporter
prometheus-node-exporter
rook-discover
telegraf-ds-smart
ucp-metrics
KubeDaemonSetRolloutStuck for the calico-node,
ucp-node-feature-discovery (since Container Cloud 2.29.0, Cluster
releases 17.4.0 and 16.4.0), and ucp-nvidia-device-plugin DaemonSets
For resource=nodes:
KubeAPIResourceErrorsHighMajor
KubeAPIResourceErrorsHighWarning
Alerts with the same node label:
cAdvisorTargetDown
CalicoTargetDown
FluentdTargetDown
KubeletDown
KubeletTargetDown
KubeNodeNotReady
LibvirtExporterTargetDown
MKEMetricsEngineTargetDown
MKENodeDown
NodeExporterTargetDown
TelegrafSMARTTargetDown
Since MCC 2.25.0 (Cluster releases 17.0.0 and 16.0.0):
AlertmanagerTargetDown
CephClusterTargetDown
etcdTargetDown
GrafanaTargetDown
HelmControllerTargetDown
KubeAPIDown
MCCCacheTargetDown
MCCControllerTargetDown
MCCProviderTargetDown
MKEAPIDown
PostgresqlTargetDown
PrometheusMsTeamsTargetDown
PrometheusRelayTargetDown
PrometheusServerTargetDown
ServiceNowWebhookReceiverTargetDown
SfNotifierTargetDown
TelegrafDockerSwarmTargetDown
TelemeterClientTargetDown
TelemeterServerFederationTargetDown
TelemeterServerTargetDown
NodeExporterTargetsOutage
NodeExporterTargetDown
OpenSearchClusterStatusCritical
OpenSearchClusterStatusWarning and
OpenSearchNumberOfUnassignedShards (removed in Container Cloud 2.27.0,
Cluster releases 17.2.0 and 16.2.0) with the same cluster label
For created_by_name=~"elasticsearch-curator-.*":
KubeJobFailed
KubePodsNotReady (removed in Container Cloud 2.25.0, Cluster releases
17.0.0 and 16.0.0)
OpenSearchClusterStatusWarning
Since MCC 2.26.0 (17.1.0 and 16.1.0)
OpenSearchNumberOfUnassignedShards with the same cluster label
(removed in Container Cloud 2.27.0, Cluster releases 17.2.0 and 16.2.0)
OpenSearchHeapUsageCritical
OpenSearchHeapUsageWarning with the same cluster and name
labels
OpenSearchStorageUsageCritical
Since MCC 2.26.0 (17.1.0 and 16.1.0)
KubePersistentVolumeFullInFourDays and OpenSearchStorageUsageMajor
with the same namespace and persistentvolumeclaim labels
OpenSearchStorageUsageMajor
Since MCC 2.26.0 (17.1.0 and 16.1.0)
KubePersistentVolumeFullInFourDays with the same namespace
and persistentvolumeclaim labels
PostgresqlPatroniClusterUnlocked
With the same cluster and namespace labels:
PostgresqlReplicationNonStreamingReplicas
PostgresqlReplicationPaused
PostgresqlReplicaDown
Alerts with the same cluster and namespace labels:
PostgresqlReplicationNonStreamingReplicas
PostgresqlReplicationPaused
PostgresqlReplicationSlowWalApplication
PostgresqlReplicationSlowWalDownload
PostgresqlReplicationWalArchiveWriteFailing
PrometheusErrorSendingAlertsMajor
PrometheusErrorSendingAlertsWarning with the same alertmanager
and pod labels
SystemDiskFullMajor
SystemDiskFullWarning with the same device, mountpoint, and
node labels
SystemDiskInodesFullMajor
SystemDiskInodesFullWarning with the same device,
mountpoint, and node labels
SystemLoadTooHighCritical
SystemLoadTooHighWarning with the same node label
SystemMemoryFullMajor
SystemMemoryFullWarning with the same node label
SSLCertExpirationHigh
SSLCertExpirationMedium with the same instance label
Due to a known Alertmanager issue, silences with
regexp matchers do not mute all notifications for all alerts matched by the
specified regular expression.
If you need to mute multiple alerts, for example, for maintenance or before
cluster update, Mirantis recommends using a set of fixed-matcher silences
instead. As an example, this section describes how to silence all alerts for a
specified period through the Alertmanager web UI or CLI without using the
regexp matchers. You can also manually force silence expiration before the
specified period ends.
To silence all alerts:
Silence alerts through the Alertmanager web UI:
Log in to the Alertmanager web UI as described in Getting access.
Click New Silence.
Create four Prometheus Alertmanager silences. In Matchers,
set Name to severity and Value to
warning, minor, major, and
critical, one for each silence.
Note
To silence the Watchdog alert, create an additional silence
with Name set to severity and Value set to
informational.
Silence alerts through CLI:
Log in to the host where your management cluster kubeconfig is located
and where kubectl is installed.
Run the following command setting the required duration:
kubectl exec -it -n stacklight prometheus-alertmanager-1 -c prometheus-alertmanager -- sh -c 'rm -f /tmp/all_silences; \
touch /tmp/all_silences; \
for severity in warning minor major critical; do \
echo $severity; \
amtool silence add severity=${severity} \
--alertmanager.url=http://prometheus-alertmanager \
--comment="silence them all" \
--duration="2h" | tee -a /tmp/all_silences; \
done'
Note
To silence the Watchdog alert, add informational to the
list of severities.
To expire alert silences:
To expire alert silences through the Alertmanager web UI, click
Expire next to each silence.
To expire alert silences through CLI, run the following command:
kubectl exec -it -n stacklight prometheus-alertmanager-1 -c prometheus-alertmanager -- sh -c 'for silence in $(cat /tmp/all_silences); do \
echo $silence; \
amtool silence expire $silence \
--alertmanager.url=http://prometheus-alertmanager; \
done'
Available since Cluster releases 17.0.1 and 16.0.1
The Kubernetes NetworkPolicy resource allows controlling network connections
to and from Pods within a cluster. This enhances security by restricting
communication from compromised Pod applications and provides transparency
into how applications communicate with each other.
Network Policies are enabled by default in StackLight using the
networkPolicies parameter. For configuration details, see
Kubernetes network policies.
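For reference, a Kubernetes NetworkPolicy resource of the kind StackLight applies has the following general shape. The sketch is illustrative only: the names, labels, and port do not reproduce the exact policies that StackLight deploys.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape     # illustrative name
  namespace: stacklight
spec:
  # Applies to Pods matching this selector
  podSelector:
    matchLabels:
      app: example-exporter         # illustrative label
  policyTypes:
    - Ingress
  ingress:
    # Allow ingress only from Prometheus server Pods on the metrics port
    - from:
        - podSelector:
            matchLabels:
              app: prometheus       # illustrative label
      ports:
        - protocol: TCP
          port: 9090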
The following table contains general network policy rules applied to
StackLight components:
Obsolete since MCC 2.26.0 (17.1.0, 16.1.0) for OpenSearch
Caution
In Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0),
the storage-based log retention together with the updated proportion of
available disk space replaces the estimated storage retention management
in OpenSearch. For details, see Storage-based log retention strategy.
The logging.retentionTime parameter is removed from the StackLight
configuration. While the Estimated Retention panel of the
OpenSearch dashboard in Grafana can provide some information,
it does not provide any guarantees. The panel is removed in Container Cloud
2.26.1 (Cluster releases 17.1.1 and 16.1.1). Therefore, consider this section
as obsolete for OpenSearch.
Using the following panels in the OpenSearch and Prometheus dashboards,
you can view details about the storage usage on managed clusters. These
details allow you to calculate the possible retention time based on
provisioned storage and its average usage:
OpenSearch dashboard:
Shards > Estimated Retention
Resources > Disk
Resources > File System Used Space by Percentage
Resources > Stored Indices Disk Usage
Resources > Age of Logs
Prometheus dashboard:
General > Estimated Retention
Resources > Storage
Resources > Storage by Percentage
To calculate the storage retention time:
Log in to the Grafana web UI. For details, see Getting access.
Assess the OpenSearch and Prometheus dashboards.
For details on Grafana dashboards, see View Grafana dashboards.
On each dashboard, select the required period for calculation.
Tip
Mirantis recommends analyzing at least one day of data collected
in the respective component to benefit from results presented on the
Estimated Retention panels.
Assess the Cluster > Estimated Retention panel of each
dashboard.
The panel displays the maximum possible retention in days, while other
panels provide details on utilized and available storage.
If persistent volumes of some StackLight components share storage,
partition the storage logically to separate components before estimating
the retention threshold. This is required since the
Estimated Retention panel uses the entire provisioned storage
as the calculation base.
For example, if StackLight is deployed in the default HA mode, it uses
the Local Volume Provisioner, which provides shared storage unless two
separate partitions are configured on each cluster node for the exclusive
use of Prometheus and OpenSearch.
The two main storage consumers are OpenSearch and Prometheus. The level of
storage usage by other StackLight components is relatively low.
For example, you can share storage logically as follows:
35% for Prometheus
35% for OpenSearch
30% for other components
In this case, take 35% of the calculated maximum retention value
and set it as the threshold.
In the Prometheus dashboard, navigate to
Resources (Row) > Storage (Panel) > total provisioned disk
per pod (Metric) to verify the retention size for the Prometheus storage.
If both the retention time and size are set, Prometheus applies retention
based on whichever threshold is reached first.
Caution
Mirantis does not recommend setting the retention size to
0 and relying on the retention time only.
You can change the retention settings through either the web UI or the API, as shown in the sketch after this list:
Using the Container Cloud web UI, navigate to the
Configure cluster menu and use the StackLight tab
Using the Container Cloud API:
For OpenSearch, use the logging.retentionTime parameter
For Prometheus, use the prometheusServer.retentionTime and
prometheusServer.retentionSize parameters
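For the API option, these parameters live in the stacklight.values section of the Cluster manifest. The following is a minimal sketch; the values are illustrative examples, not recommendations.

stacklight:
  values:
    prometheusServer:
      retentionTime: 15d     # illustrative value
      retentionSize: 15GB    # illustrative value
    logging:
      retentionTime: 3       # illustrative value; parameter removed in MCC 2.26.0 (17.1.0, 16.1.0)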
Available since MCC 2.24.0 (Cluster release 14.0.0)
If you plan to switch to a long log retention period (months), tune StackLight
by increasing the cluster.max_shards_per_node limit. This configuration
enables OpenSearch to successfully accept new logs and prevents the
maximum open shards error.
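For context, cluster.max_shards_per_node is a standard OpenSearch cluster setting. As a sketch only, you can inspect it and, if needed, raise it directly through the OpenSearch API in Dev Tools; the value below is illustrative, and a change made this way may be reverted by StackLight, so prefer the StackLight-managed configuration for a persistent change.

# Inspect the current limit (defaults are included in the response)
GET _cluster/settings?include_defaults=true&filter_path=*.cluster.max_shards_per_node

# Raise the limit; 3000 is an illustrative value
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 3000
  }
}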
Available since MCC 2.23.0 (Cluster release 11.7.0)
By default, StackLight sends logs to OpenSearch. However, you can
configure StackLight to add external Elasticsearch, OpenSearch, and syslog
destinations as the fluentd-logs output. In this case, StackLight
sends logs both to the external server(s) and to OpenSearch.
Since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0), you can also
enable sending of Container Cloud service logs to Splunk using the syslog
external output configuration. The feature is available in the
Technology Preview scope.
Warning
Sending logs to Splunk implies that the target Splunk instance is
available from the MOSK cluster. If proxy is enabled, the
feature is not supported.
Prior to enabling the functionality, complete the following prerequisites:
Enable StackLight logging
Deploy an external server outside MOSK
Make sure that Container Cloud proxy is not enabled since it only supports
the HTTP(S) traffic
For Splunk, configure the server to accept logs:
Create an index and set its type to Event
Configure data input:
Open the required port
Configure the required protocol (TCP/UDP)
Configure connection to the created index
To enable log forwarding to external destinations:
In the stacklight.values section of the opened manifest, configure the
logging.externalOutputs parameters using the following table.
Key
Description
Example values
disabled (bool)
Optional. Disables the output destination using disabled: true.
If not set, defaults to disabled: false.
true or false
type (string)
Required. Specifies the type of log destination. The following values
are accepted: elasticsearch, opensearch, remote_syslog,
and opensearch_data_stream (since Container Cloud 2.26.0,
Cluster releases 17.1.0 and 16.1.0).
remote_syslog
level (string)
Removed in MCC 2.26.0 (17.1.0, 16.1.0)
Optional. Sets the least important level of log messages to send. For
the available values, which are defined using the severity_label field,
see the logging.level description in Logging.
warning
plugin_log_level (string)
Optional. Defaults to info. Sets the value of @log_level of the
output plugin for a particular backend. For other available values,
refer to the logging.level description in Logging.
notice
tag_exclude (string)
Optional. Overrides tag_include. Specifies, by tag, the logs to exclude from the
destination output. For example, to exclude all logs with the test tag,
set tag_exclude: '/.*test.*/'.
How to obtain tags for logs
Select from the following options:
In the main OpenSearch output, use the logger field that equals the
tag.
For logs of a particular Pod or container, derive the tag in the following
order, with the first match winning:
The value of the app Pod label. For example, for
app=opensearch-master, use opensearch-master as the log tag.
The value of the k8s-app Pod label.
The value of the app.kubernetes.io/name Pod label.
If a release_group Pod label exists and the component Pod label
starts with app, use the value of the component label as the tag.
Otherwise, the tag is the application label joined to the component
label with a -.
The name of the container from which the log is taken.
The values for tag_exclude and tag_include are placed into
<match> directives of Fluentd and only accept regex types that are
supported by the <match> directive of Fluentd. For details, refer to the
Fluentd official documentation.
'{fluentd-logs,systemd}'
tag_include (string)
Optional. Overridden by tag_exclude. Specifies, by tag, the logs to include in
the destination output. For example, to include all logs with the auth
tag, set tag_include: '/.*auth.*/'.
'/.*auth.*/'
<pluginConfigOptions> (map)
Configures plugin settings. Has a hierarchical structure. The
first-level configuration parameters are dynamic except type,
id, and log_level that are reserved by StackLight. For
available options, refer to the required plugin documentation.
Mirantis does not set any default values for plugin configuration
settings except the reserved ones.
The second-level configuration options are predefined and limited to
buffer (for any type of log destination) and format (for
remote_syslog only). Inside the second-level configuration, the
parameters are dynamic.
For available configuration options, refer to the following
documentation:
Configures buffering of events using the second-level configuration
options. Applies to any type of log destination. Parameters are
dynamic except the following mandatory ones, which should not be
modified:
type: file, which sets the default buffer type
path: <pathToBufferFile>, which sets the path to the buffer
destination file
overflow_action: block, which prevents Fluentd from crashing if
the output destination is down
For details about other mandatory and optional buffer
parameters, see the Fluentd: Output Plugins
documentation.
Note
To disable buffer without deleting it, use
buffer.disabled:true.
output_kind (string)
Since MCC 2.26.0 (17.1.0, 16.1.0)
Configures the type of logs to forward. If set to audit,
only audit logs are forwarded. If unset, only system logs
are forwarded.
opensearch:
  output_kind: audit
Example configuration for logging.externalOutputs
logging:
  externalOutputs:
    elasticsearch:
      # disabled: false
      type: elasticsearch
      level: info  # Removed in MCC 2.26.0 (17.1.0, 16.1.0)
      plugin_log_level: info
      tag_exclude: '{fluentd-logs,systemd}'
      host: elasticsearch-host
      port: 9200
      logstash_date_format: '%Y.%m.%d'
      logstash_format: true
      logstash_prefix: logstash
      ...
      buffer:
        # disabled: false
        chunk_limit_size: 16m
        flush_interval: 15s
        flush_mode: interval
        overflow_action: block
        ...
    opensearch:
      disabled: true
      type: opensearch
      level: info  # Removed in MCC 2.26.0 (17.1.0, 16.1.0)
      plugin_log_level: info
      tag_include: '/.*auth.*/'
      host: opensearch-host
      port: 9200
      logstash_date_format: '%Y.%m.%d'
      logstash_format: true
      logstash_prefix: logstash
      output_kind: audit  # Since MCC 2.26.0 (17.1.0, 16.1.0)
      ...
      buffer:
        chunk_limit_size: 16m
        flush_interval: 15s
        flush_mode: interval
        overflow_action: block
        ...
    syslog:
      type: remote_syslog
      plugin_log_level: info
      level: info  # Removed in MCC 2.26.0 (17.1.0, 16.1.0)
      tag_include: '{iam-proxy,systemd}'
      host: remote-syslog.svc
      port: 514
      hostname: example-hostname
      packet_size: 1024
      protocol: udp
      tls: false
      buffer:
        disabled: true
      format:
        "@type": single_value
        message_key: message
      ...
    splunk_syslog_output:
      type: remote_syslog
      host: remote-splunk-syslog.svc
      port: 514
      protocol: tcp
      tls: true
      ca_file: /etc/ssl/certs/splunk-syslog.pem
      verify_mode: 0
      buffer:
        chunk_limit: 16MB
        total_limit: 128MB
  externalOutputSecretMounts:
  - secretName: syslog-pem
    mountPath: /etc/ssl/certs/splunk-syslog.pem
Note
Mirantis recommends that you tune the packet_size parameter
value to allow sending full log lines.
This parameter defines the packet size in bytes for the syslog logging
output. It is useful for syslog setups that allow a packet size larger than
1 kB.
Optional. Mount authentication secrets for the required external
destination to Fluentd using logging.externalOutputSecretMounts. For
the parameter options, see Logging to external outputs: secrets.
If Fluentd cannot flush logs and the buffer of the external output
starts to fill, then, depending on the resources and configuration of the
external Elasticsearch or OpenSearch server, the
Data too large, circuit_breaking_exception error may occur even after
you resolve the external output issues.
This error indicates that the output destination cannot accept log data sent
in bulk because of its size. To mitigate the issue, select from the
following options, as shown in the sketch after this list:
Set bulk_message_request_threshold to 10MB or lower. It is
unlimited by default. For details, see the Fluentd plugin documentation
for Elasticsearch.
Adjust output destinations to accept a large amount of data at once. For
details, refer to the official documentation of the required external
system.
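Because first-level keys of an external output are passed through to the Fluentd output plugin, the threshold can be set directly in the output definition. The following is a minimal sketch with an illustrative output name and host.

logging:
  externalOutputs:
    elasticsearch:                         # illustrative output name
      type: elasticsearch
      host: elasticsearch-host             # illustrative host
      port: 9200
      # Passed through to the Fluentd Elasticsearch output plugin
      bulk_message_request_threshold: 10MB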
Deprecated since MCC 2.23.0 (Cluster release 11.7.0)
Caution
Since Container Cloud 2.23.0 (Cluster release 11.7.0), this
procedure and the logging.syslog parameter are deprecated. For a new
configuration of remote logging to syslog, follow the
Enable log forwarding to external destinations procedure instead.
By default, StackLight sends logs to OpenSearch. However, you can
configure StackLight to forward all logs to an external syslog server. In this
case, StackLight will send logs both to the syslog server and to OpenSearch.
Prior to enabling the functionality, consider the following requirements:
StackLight logging must be enabled
A remote syslog server must be deployed outside MOSK
Container Cloud proxy must not be enabled since it only supports the
HTTP(S) traffic
Mirantis recommends that you tune the packetSize parameter value
to allow sending full log lines.
The hostname field in the remote syslog database will be set based
on clusterId specified in the StackLight chart values. For example,
if clusterId is ns/cluster/example-uid, the hostname will
transform to ns_cluster_example-uid. For details, see clusterId
in StackLight configuration parameters.
StackLight provides a vast variety of metrics for MOSK
components. However, you may need to create a custom log-based metric to use it
for alert notifications, for example, in the following cases:
If a component producing logs does not expose scraping targets, in which
case component-specific metrics may be missing.
If a scraping target lacks information that can be collected by aggregating
the log messages.
If alerting reasons are more explicitly presented in log messages.
For example, you want to receive alert notifications when more than 10 cases
are created in Salesforce within an hour. The sf-notifier scraping
endpoint does not expose such information. However, sf-notifier logs are
stored in OpenSearch and using prometheus-es-exporter you can perform the
following:
Configure a query using Query DSL (Domain Specific Language) and test it in
Dev Tools in OpenSearch Dashboards.
Configure Prometheus Elasticsearch Exporter to expose the result as a
Prometheus metric showing the total amount of Salesforce cases created
daily, for example, salesforce_cases_daily_total_value.
Configure StackLight to send a notification once the value of this metric
increases by 10 or more within an hour.
Caution
StackLight logging must be enabled and functional.
Prometheus-es-exporter uses OpenSearch Search API. Therefore,
configured queries must be tuned for this specific API and must include:
The query part to filter documents
The aggregation part to combine filtered documents into a
metric-oriented result
In the manifest that opens, verify that StackLight logging is enabled:
logging:
  enabled: true
Create a query using Query DSL:
Select one of the following options:
Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0)
In the OpenSearch Dashboards web UI, select an index to query.
StackLight stores logs in hourly OpenSearch indices.
Note
Optimize the query time by limiting the number of results.
For example, we will use the OpenSearch event.provider field
set to sf-notifier to limit the number of logs to search.
For example:
GET system/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"event.provider": {"value": "sf-notifier"}}},
        {"range": {"@timestamp": {"gte": "now/d"}}}
      ]
    }
  }
}
Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0)
In the OpenSearch Dashboards web UI, select an index to query.
StackLight stores logs in hourly OpenSearch indices. To select all
indices for a day, use the <logstash-{now/d}*> index pattern,
which stands for %3Clogstash-%7Bnow%2Fd%7D*%3E when URL-encoded.
Note
Optimize the query time by limiting the number of results.
For example, we will use the OpenSearch logger field set to
sf-notifier to limit the number of logs to search.
For example:
GET /%3Clogstash-%7Bnow%2Fd%7D*%3E/_search
{
  "query": {
    "bool": {
      "must": {"term": {"logger": {"value": "sf-notifier"}}}
    }
  }
}
Test the query in Dev Tools in OpenSearch Dashboards.
Select the log lines that include information about Salesforce cases
creation. For the info logging level, to indicate case creation,
sf-notifier produces log messages similar to the following one:
[2021-07-02 12:35:28,596] INFO in client: Created case: OrderedDict([('id', '5007h000007iqmKAAQ'), ('success', True), ('errors', [])]).
Such log messages include the Created case phrase. Use it in the query
to filter log messages for created cases:
Combine the query result to a single value that
prometheus-es-exporter will expose as a metric. Use the
value_count aggregation:
Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0)
GET system/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"event.provider": {"value": "sf-notifier"}}},
        {"range": {"@timestamp": {"gte": "now/d"}}},
        {"match_phrase_prefix": {"message": "Created case"}}
      ]
    }
  },
  "aggs": {
    "daily_total": {
      "value_count": {"field": "event.provider"}
    }
  }
}
Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0)
GET /%3Clogstash-%7Bnow%2Fd%7D*%3E/_search
{
  "query": {
    "bool": {
      "must": {"term": {"logger": {"value": "sf-notifier"}}},
      "filter": {"match_phrase_prefix": {"message": "Created case"}}
    }
  },
  "aggs": {
    "daily_total": {
      "value_count": {"field": "logger"}
    }
  }
}
The aggregation result in Dev Tools should look as follows:
"aggregations":{"daily_total":{"value":19}}
Note
The metric name is suffixed with the aggregation name and
the result field name: salesforce_cases_daily_total_value.
In the example below, salesforce_cases is the query name. The final
metric name can be generalized using the
<query_name>_<aggregation_name>_<aggregation_result_field_name>
template.
Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0)
prometheusServer:
  customAlerts:
  - alert: SalesforceCasesDailyWarning
    annotations:
      description: The number of cases created today in Salesforce increased by 10 within the last hour.
      summary: Too many cases in Salesforce
    expr: increase(salesforce_cases_daily_total_value[1h]) >= 10
    labels:
      severity: warning
      service: custom
StackLight can scrape metrics from any service that exposes Prometheus metrics
and is running on the Kubernetes cluster. Such metrics appear in Prometheus
under the
{job="stacklight-generic",service="<service_name>",namespace="<service_namespace>"}
set of labels. If the Kubernetes service is backed by Kubernetes pods, the set
of labels also includes {pod="<pod_name>"}.
To enable the functionality, define at least one of the following annotations
in the service metadata, as shown in the example after this list:
"generic.stacklight.mirantis.com/scrape-path" - the HTTP endpoint path,
related to the Prometheus scrape_config.metrics_path option. By default,
/metrics.
"generic.stacklight.mirantis.com/scrape-scheme" - the HTTP endpoint
scheme between HTTP and HTTPS, related to the Prometheus
scrape_config.scheme option. By default, http.
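As a sketch, a Service annotated for generic scraping may look as follows; the service name, namespace, selector, and port are illustrative.

apiVersion: v1
kind: Service
metadata:
  name: example-app-metrics              # illustrative name
  namespace: example-namespace           # illustrative namespace
  annotations:
    generic.stacklight.mirantis.com/scrape-path: "/metrics"
    generic.stacklight.mirantis.com/scrape-scheme: "http"
spec:
  selector:
    app: example-app                     # illustrative selector
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080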
Available since MCC 2.24.0 (Cluster release 14.0.0)
By default, StackLight drops unused metrics to increase Prometheus performance,
providing better resource utilization and faster query response.
The following list contains white-listed scrape jobs grouped by the job name.
Prometheus collects metrics from this list by default.
White list of Prometheus scrape jobs
_group-blackbox-metrics:-probe_dns_lookup_time_seconds-probe_duration_seconds-probe_http_content_length-probe_http_duration_seconds-probe_http_ssl-probe_http_uncompressed_body_length-probe_ssl_earliest_cert_expiry-probe_success_group-controller-runtime-metrics:-workqueue_adds_total-workqueue_depth-workqueue_queue_duration_seconds_count-workqueue_queue_duration_seconds_sum-workqueue_retries_total-workqueue_work_duration_seconds_count-workqueue_work_duration_seconds_sum_group-etcd-metrics:-etcd_cluster_version-etcd_debugging_snap_save_total_duration_seconds_sum-etcd_disk_backend_commit_duration_seconds_bucket-etcd_disk_backend_commit_duration_seconds_count-etcd_disk_backend_commit_duration_seconds_sum-etcd_disk_backend_snapshot_duration_seconds_count-etcd_disk_backend_snapshot_duration_seconds_sum-etcd_disk_wal_fsync_duration_seconds_bucket-etcd_disk_wal_fsync_duration_seconds_count-etcd_disk_wal_fsync_duration_seconds_sum-etcd_mvcc_db_total_size_in_bytes-etcd_network_client_grpc_received_bytes_total-etcd_network_client_grpc_sent_bytes_total-etcd_network_peer_received_bytes_total-etcd_network_peer_sent_bytes_total-etcd_server_go_version-etcd_server_has_leader-etcd_server_leader_changes_seen_total-etcd_server_proposals_applied_total-etcd_server_proposals_committed_total-etcd_server_proposals_failed_total-etcd_server_proposals_pending-etcd_server_quota_backend_bytes-etcd_server_version-grpc_server_handled_total-grpc_server_started_total_group-go-collector-metrics:-go_gc_duration_seconds-go_gc_duration_seconds_count-go_gc_duration_seconds_sum-go_goroutines-go_info-go_memstats_alloc_bytes-go_memstats_alloc_bytes_total-go_memstats_buck_hash_sys_bytes-go_memstats_frees_total-go_memstats_gc_sys_bytes-go_memstats_heap_alloc_bytes-go_memstats_heap_idle_bytes-go_memstats_heap_inuse_bytes-go_memstats_heap_released_bytes-go_memstats_heap_sys_bytes-go_memstats_lookups_total-go_memstats_mallocs_total-go_memstats_mcache_inuse_bytes-go_memstats_mcache_sys_bytes-go_memstats_mspan_inuse_bytes-go_memstats_mspan_sys_bytes-go_memstats_next_gc_bytes-go_memstats_other_sys_bytes-go_memstats_stack_inuse_bytes-go_memstats_stack_sys_bytes-go_memstats_sys_bytes-go_threads_group-process-collector-metrics:-process_cpu_seconds_total-process_max_fds-process_open_fds-process_resident_memory_bytes-process_start_time_seconds-process_virtual_memory_bytes_group-rest-client-metrics:-rest_client_request_latency_seconds_count-rest_client_request_latency_seconds_sum_group-service-handler-metrics:-service_handler_count-service_handler_sum_group-service-http-metrics:-service_http_count-service_http_sum_group-service-reconciler-metrics:-service_reconciler_count-service_reconciler_sumalertmanager-webhook-servicenow:-servicenow_auth_okblackbox:[]blackbox-external-endpoint:[]cadvisor:-cadvisor_version_info-container_cpu_cfs_periods_total-container_cpu_cfs_throttled_periods_total-container_cpu_usage_seconds_total-container_fs_reads_bytes_total-container_fs_reads_total-container_fs_writes_bytes_total-container_fs_writes_total-container_memory_usage_bytes-container_memory_working_set_bytes-container_network_receive_bytes_total-container_network_transmit_bytes_total-container_scrape_error-machine_cpu_corescalico:-felix_active_local_endpoints-felix_active_local_policies-felix_active_local_selectors-felix_active_local_tags-felix_cluster_num_host_endpoints-felix_cluster_num_hosts-felix_cluster_num_workload_endpoints-felix_host-felix_int_dataplane_addr_msg_batch_size_count-felix_int_dataplane_addr_msg_batch_size_sum-felix_int_dataplane_failures-f
elix_int_dataplane_iface_msg_batch_size_count-felix_int_dataplane_iface_msg_batch_size_sum-felix_ipset_errors-felix_ipsets_calico-felix_iptables_chains-felix_iptables_restore_errors-felix_iptables_save_errors-felix_resyncs_startedetcd-server:[]fluentd:-apache_http_request_duration_seconds_bucket-apache_http_request_duration_seconds_count-docker_networkdb_stats_netmsg-docker_networkdb_stats_qlen-kernel_io_errors_total# Since MCC 2.27.0 (17.2.0 and 16.2.0)helm-controller:-helmbundle_reconcile_up-helmbundle_release_ready-helmbundle_release_status-helmbundle_release_success-rest_client_requests_totalhost-os-modules-controller:-hostos_module_deprecation_info# Since MCC 2.28.0 (17.3.0 and 16.3.0)-hostos_module_usage# Since MCC 2.28.0 (17.3.0 and 16.3.0)ironic:-ironic_driver_metadata-ironic_drivers_total-ironic_nodes-ironic_upkaas-exporter:-kaas_cluster_info-kaas_cluster_lcm_healthy# Since MCC 2.28.0 (17.3.0 and 16.3.0)-kaas_cluster_ready# Since MCC 2.28.0 (17.3.0 and 16.3.0)-kaas_cluster_updating-kaas_clusters-kaas_info-kaas_license_expiry-kaas_machine_ready-kaas_machines_ready-kaas_machines_requested-mcc_cluster_pending_update_schedule_time# Since MCC 2.28.0 (17.3.0 and 16.3.0)-mcc_cluster_pending_update_status# Since MCC 2.28.0 (17.3.0 and 16.3.0)-mcc_cluster_update_plan_status# Since MCC 2.28.0 (17.3.0 and 16.3.0) as TechPreview-mcc_cluster_update_plan_step_status# Since MCC 2.28.0 (17.3.0 and 16.3.0) as TechPreview-rest_client_requests_totalkubelet:-kubelet_running_containers-kubelet_running_pods-kubelet_volume_stats_available_bytes-kubelet_volume_stats_capacity_bytes-kubelet_volume_stats_used_bytes# Since MCC 2.26.0 (17.1.0 and 16.1.0)-kubernetes_build_info-rest_client_requests_totalkubernetes-apiservers:-apiserver_client_certificate_expiration_seconds_bucket-apiserver_client_certificate_expiration_seconds_count-apiserver_request_total-kubernetes_build_info-rest_client_requests_totalkubernetes-master-api:[]mcc-blackbox:[]mcc-cache:[]mcc-controllers:-rest_client_requests_totalmcc-providers:-rest_client_requests_totalmke-manager-api:[]mke-metrics-controller:-ucp_controller_services-ucp_engine_node_healthmke-metrics-engine:-ucp_engine_container_cpu_percent-ucp_engine_container_cpu_total_time_nanoseconds-ucp_engine_container_health-ucp_engine_container_memory_usage_bytes-ucp_engine_container_network_rx_bytes_total-ucp_engine_container_network_tx_bytes_total-ucp_engine_container_unhealth-ucp_engine_containers-ucp_engine_disk_free_bytes-ucp_engine_disk_total_bytes-ucp_engine_images-ucp_engine_memory_total_bytes-ucp_engine_num_cpu_coresmsr-api:[]openstack-blackbox-ext:[]openstack-cloudprober:# Since MOSK 24.2-cloudprober_success-cloudprober_totalopenstack-dns-probes:# Since MOSK 24.3-probe_dns_duration_secondsopenstack-ingress-controller:-nginx_ingress_controller_build_info-nginx_ingress_controller_config_hash-nginx_ingress_controller_config_last_reload_successful-nginx_ingress_controller_nginx_process_connections-nginx_ingress_controller_nginx_process_cpu_seconds_total-nginx_ingress_controller_nginx_process_resident_memory_bytes-nginx_ingress_controller_request_duration_seconds_bucket-nginx_ingress_controller_request_size_sum-nginx_ingress_controller_requests-nginx_ingress_controller_response_size_sum-nginx_ingress_controller_ssl_expire_time_seconds-nginx_ingress_controller_successopenstack-portprober:-portprober_arping_target_success-portprober_arping_target_totalopenstack-powerdns:# Since MOSK 
24.3-pdns_auth_backend_latency-pdns_auth_backend_queries-pdns_auth_cache_latency-pdns_auth_corrupt_packets-pdns_auth_cpu_iowait-pdns_auth_cpu_steal-pdns_auth_dnsupdate_answers-pdns_auth_dnsupdate_queries-pdns_auth_dnsupdate_refused-pdns_auth_fd_usage-pdns_auth_incoming_notifications-pdns_auth_open_tcp_connections-pdns_auth_qsize_q-pdns_auth_receive_latency-pdns_auth_ring_logmessages_capacity-pdns_auth_ring_logmessages_size-pdns_auth_ring_noerror_queries_capacity-pdns_auth_ring_noerror_queries_size-pdns_auth_ring_nxdomain_queries_capacity-pdns_auth_ring_nxdomain_queries_size-pdns_auth_ring_queries_capacity-pdns_auth_ring_queries_size-pdns_auth_ring_remotes_capacity-pdns_auth_ring_remotes_size-pdns_auth_ring_remotes_unauth_capacity-pdns_auth_ring_remotes_unauth_size-pdns_auth_ring_servfail_queries_capacity-pdns_auth_ring_servfail_queries_size-pdns_auth_ring_unauth_queries_capacity-pdns_auth_ring_unauth_queries_size-pdns_auth_signatures-pdns_auth_sys_msec-pdns_auth_tcp4_answers-pdns_auth_tcp4_answers_bytes-pdns_auth_tcp4_queries-pdns_auth_tcp6_answers-pdns_auth_tcp6_answers_bytes-pdns_auth_tcp6_queries-pdns_auth_timedout_packets-pdns_auth_udp4_answers-pdns_auth_udp4_answers_bytes-pdns_auth_udp4_queries-pdns_auth_udp6_answers-pdns_auth_udp6_answers_bytes-pdns_auth_udp6_queries-pdns_auth_udp_in_csum_errors-pdns_auth_udp_in_errors-pdns_auth_udp_noport_errors-pdns_auth_udp_recvbuf_errors-pdns_auth_udp_sndbuf_errors-pdns_auth_uptime-pdns_auth_user_msecosdpl-exporter:# Removed in MOSK 24.1-osdpl_aodh_alarms-osdpl_certificate_expiry-osdpl_cinder_zone_volumes-osdpl_neutron_availability_zone_info-osdpl_neutron_zone_routers-osdpl_nova_aggregate_hosts-osdpl_nova_availability_zone_info-osdpl_nova_availability_zone_instances-osdpl_nova_availability_zone_hosts-osdpl_version_infopatroni:-patroni_patroni_cluster_unlocked-patroni_patroni_info-patroni_postgresql_info-patroni_replication_info-patroni_xlog_location-patroni_xlog_paused-patroni_xlog_received_location-patroni_xlog_replayed_location-python_infopostgresql:-pg_database_size-pg_locks_count-pg_stat_activity_count-pg_stat_activity_max_tx_duration-pg_stat_archiver_failed_count-pg_stat_bgwriter_buffers_alloc_total-pg_stat_bgwriter_buffers_backend_fsync_total-pg_stat_bgwriter_buffers_backend_total-pg_stat_bgwriter_buffers_checkpoint_total-pg_stat_bgwriter_buffers_clean_total-pg_stat_bgwriter_checkpoint_sync_time_total-pg_stat_bgwriter_checkpoint_write_time_total-pg_stat_database_blks_hit-pg_stat_database_blks_read-pg_stat_database_checksum_failures-pg_stat_database_conflicts-pg_stat_database_conflicts_confl_bufferpin-pg_stat_database_conflicts_confl_deadlock-pg_stat_database_conflicts_confl_lock-pg_stat_database_conflicts_confl_snapshot-pg_stat_database_conflicts_confl_tablespace-pg_stat_database_deadlocks-pg_stat_database_temp_bytes-pg_stat_database_tup_deleted-pg_stat_database_tup_fetched-pg_stat_database_tup_inserted-pg_stat_database_tup_returned-pg_stat_database_tup_updated-pg_stat_database_xact_commit-pg_stat_database_xact_rollback-postgres_exporter_build_infoprometheus-alertmanager:-alertmanager_active_alerts-alertmanager_active_silences-alertmanager_alerts-alertmanager_alerts_invalid_total-alertmanager_alerts_received_total-alertmanager_build_info-alertmanager_cluster_failed_peers-alertmanager_cluster_health_score-alertmanager_cluster_members-alertmanager_cluster_messages_pruned_total-alertmanager_cluster_messages_queued-alertmanager_cluster_messages_received_size_total-alertmanager_cluster_messages_received_total-alertmanager_cluster_messages_sent_si
ze_total-alertmanager_cluster_messages_sent_total-alertmanager_cluster_peer_info-alertmanager_cluster_peers_joined_total-alertmanager_cluster_peers_left_total-alertmanager_cluster_reconnections_failed_total-alertmanager_cluster_reconnections_total-alertmanager_config_last_reload_success_timestamp_seconds-alertmanager_config_last_reload_successful-alertmanager_nflog_gc_duration_seconds_count-alertmanager_nflog_gc_duration_seconds_sum-alertmanager_nflog_gossip_messages_propagated_total-alertmanager_nflog_queries_total-alertmanager_nflog_query_duration_seconds_bucket-alertmanager_nflog_query_errors_total-alertmanager_nflog_snapshot_duration_seconds_count-alertmanager_nflog_snapshot_duration_seconds_sum-alertmanager_nflog_snapshot_size_bytes-alertmanager_notification_latency_seconds_bucket-alertmanager_notifications_failed_total-alertmanager_notifications_total-alertmanager_oversize_gossip_message_duration_seconds_bucket-alertmanager_oversized_gossip_message_dropped_total-alertmanager_oversized_gossip_message_failure_total-alertmanager_oversized_gossip_message_sent_total-alertmanager_partial_state_merges_failed_total-alertmanager_partial_state_merges_total-alertmanager_silences-alertmanager_silences_gc_duration_seconds_count-alertmanager_silences_gc_duration_seconds_sum-alertmanager_silences_gossip_messages_propagated_total-alertmanager_silences_queries_total-alertmanager_silences_query_duration_seconds_bucket-alertmanager_silences_query_errors_total-alertmanager_silences_snapshot_duration_seconds_count-alertmanager_silences_snapshot_duration_seconds_sum-alertmanager_silences_snapshot_size_bytes-alertmanager_state_replication_failed_total-alertmanager_state_replication_totalprometheus-elasticsearch-exporter:-elasticsearch_breakers_estimated_size_bytes-elasticsearch_breakers_limit_size_bytes-elasticsearch_breakers_tripped-elasticsearch_cluster_health_active_primary_shards-elasticsearch_cluster_health_active_shards-elasticsearch_cluster_health_delayed_unassigned_shards-elasticsearch_cluster_health_initializing_shards-elasticsearch_cluster_health_number_of_data_nodes-elasticsearch_cluster_health_number_of_nodes-elasticsearch_cluster_health_number_of_pending_tasks-elasticsearch_cluster_health_relocating_shards-elasticsearch_cluster_health_status-elasticsearch_cluster_health_unassigned_shards-elasticsearch_exporter_build_info-elasticsearch_indices_docs-elasticsearch_indices_docs_deleted-elasticsearch_indices_docs_primary-elasticsearch_indices_fielddata_evictions-elasticsearch_indices_fielddata_memory_size_bytes-elasticsearch_indices_filter_cache_evictions-elasticsearch_indices_flush_time_seconds-elasticsearch_indices_flush_total-elasticsearch_indices_get_exists_time_seconds-elasticsearch_indices_get_exists_total-elasticsearch_indices_get_missing_time_seconds-elasticsearch_indices_get_missing_total-elasticsearch_indices_get_time_seconds-elasticsearch_indices_get_total-elasticsearch_indices_indexing_delete_time_seconds_total-elasticsearch_indices_indexing_delete_total-elasticsearch_indices_indexing_index_time_seconds_total-elasticsearch_indices_indexing_index_total-elasticsearch_indices_merges_docs_total-elasticsearch_indices_merges_total-elasticsearch_indices_merges_total_size_bytes_total-elasticsearch_indices_merges_total_time_seconds_total-elasticsearch_indices_query_cache_evictions-elasticsearch_indices_query_cache_memory_size_bytes-elasticsearch_indices_refresh_time_seconds_total-elasticsearch_indices_refresh_total-elasticsearch_indices_search_fetch_time_seconds-elasticsearch_indices_search_fetch
_total-elasticsearch_indices_search_query_time_seconds-elasticsearch_indices_search_query_total-elasticsearch_indices_segment_count_primary-elasticsearch_indices_segment_count_total-elasticsearch_indices_segment_memory_bytes_primary-elasticsearch_indices_segment_memory_bytes_total-elasticsearch_indices_segments_count-elasticsearch_indices_segments_memory_bytes-elasticsearch_indices_store_size_bytes-elasticsearch_indices_store_size_bytes_primary-elasticsearch_indices_store_size_bytes_total-elasticsearch_indices_store_throttle_time_seconds_total-elasticsearch_indices_translog_operations-elasticsearch_indices_translog_size_in_bytes-elasticsearch_jvm_gc_collection_seconds_count-elasticsearch_jvm_gc_collection_seconds_sum-elasticsearch_jvm_memory_committed_bytes-elasticsearch_jvm_memory_max_bytes-elasticsearch_jvm_memory_pool_peak_used_bytes-elasticsearch_jvm_memory_used_bytes-elasticsearch_os_load1-elasticsearch_os_load15-elasticsearch_os_load5-elasticsearch_process_cpu_percent-elasticsearch_process_cpu_seconds_total-elasticsearch_process_cpu_time_seconds_sum-elasticsearch_process_open_files_count-elasticsearch_thread_pool_active_count-elasticsearch_thread_pool_completed_count-elasticsearch_thread_pool_queue_count-elasticsearch_thread_pool_rejected_count-elasticsearch_transport_rx_size_bytes_total-elasticsearch_transport_tx_size_bytes_totalprometheus-grafana:-grafana_api_dashboard_get_milliseconds-grafana_api_dashboard_get_milliseconds_count-grafana_api_dashboard_get_milliseconds_sum-grafana_api_dashboard_save_milliseconds-grafana_api_dashboard_save_milliseconds_count-grafana_api_dashboard_save_milliseconds_sum-grafana_api_dashboard_search_milliseconds-grafana_api_dashboard_search_milliseconds_count-grafana_api_dashboard_search_milliseconds_sum-grafana_api_dataproxy_request_all_milliseconds-grafana_api_dataproxy_request_all_milliseconds_count-grafana_api_dataproxy_request_all_milliseconds_sum-grafana_api_login_oauth_total-grafana_api_login_post_total-grafana_api_response_status_total-grafana_build_info-grafana_feature_toggles_info-grafana_http_request_duration_seconds_count-grafana_page_response_status_total-grafana_plugin_build_info-grafana_proxy_response_status_total-grafana_stat_total_orgs-grafana_stat_total_users-grafana_stat_totals_dashboardprometheus-kube-state-metrics:-kube_cronjob_next_schedule_time-kube_daemonset_created-kube_daemonset_status_current_number_scheduled-kube_daemonset_status_desired_number_scheduled-kube_daemonset_status_number_available-kube_daemonset_status_number_misscheduled-kube_daemonset_status_number_ready-kube_daemonset_status_number_unavailable-kube_daemonset_status_observed_generation-kube_daemonset_status_updated_number_scheduled-kube_deployment_created-kube_deployment_metadata_generation-kube_deployment_spec_replicas-kube_deployment_status_observed_generation-kube_deployment_status_replicas-kube_deployment_status_replicas_available-kube_deployment_status_replicas_unavailable-kube_deployment_status_replicas_updated-kube_endpoint_address# Since MOSK 25.1-kube_endpoint_address_available# Deprecated since MOSK 
25.1-kube_job_status_active-kube_job_status_failed-kube_job_status_succeeded-kube_namespace_created-kube_namespace_status_phase-kube_node_info-kube_node_labels-kube_node_role-kube_node_spec_taint-kube_node_spec_unschedulable-kube_node_status_allocatable-kube_node_status_capacity-kube_node_status_condition-kube_persistentvolume_capacity_bytes-kube_persistentvolume_status_phase-kube_persistentvolumeclaim_resource_requests_storage_bytes-kube_pod_container_info-kube_pod_container_resource_limits-kube_pod_container_resource_requests-kube_pod_container_status_restarts_total-kube_pod_container_status_running-kube_pod_container_status_terminated-kube_pod_container_status_waiting-kube_pod_info-kube_pod_init_container_status_running-kube_pod_status_phase-kube_service_status_load_balancer_ingress-kube_statefulset_created-kube_statefulset_metadata_generation-kube_statefulset_replicas-kube_statefulset_status_current_revision-kube_statefulset_status_observed_generation-kube_statefulset_status_replicas-kube_statefulset_status_replicas_available-kube_statefulset_status_replicas_current-kube_statefulset_status_replicas_ready-kube_statefulset_status_replicas_updated-kube_statefulset_status_update_revisionprometheus-libvirt-exporter:-libvirt_domain_block_stats_allocation-libvirt_domain_block_stats_capacity-libvirt_domain_block_stats_physical-libvirt_domain_block_stats_read_bytes_total-libvirt_domain_block_stats_read_requests_total-libvirt_domain_block_stats_write_bytes_total-libvirt_domain_block_stats_write_requests_total-libvirt_domain_info_cpu_time_seconds_total-libvirt_domain_info_maximum_memory_bytes-libvirt_domain_info_memory_usage_bytes-libvirt_domain_info_state-libvirt_domain_info_virtual_cpus-libvirt_domain_interface_stats_receive_bytes_total-libvirt_domain_interface_stats_receive_drops_total-libvirt_domain_interface_stats_receive_errors_total-libvirt_domain_interface_stats_receive_packets_total-libvirt_domain_interface_stats_transmit_bytes_total-libvirt_domain_interface_stats_transmit_drops_total-libvirt_domain_interface_stats_transmit_errors_total-libvirt_domain_interface_stats_transmit_packets_total-libvirt_domain_memory_actual_balloon_bytes-libvirt_domain_memory_available_bytes-libvirt_domain_memory_rss_bytes-libvirt_domain_memory_unused_bytes-libvirt_domain_memory_usable_bytes-libvirt_upprometheus-memcached-exporter:-memcached_commands_total-memcached_current_bytes-memcached_current_connections-memcached_current_items-memcached_exporter_build_info-memcached_items_evicted_total-memcached_items_reclaimed_total-memcached_limit_bytes-memcached_read_bytes_total-memcached_up-memcached_version-memcached_written_bytes_totalprometheus-msteams:[]prometheus-mysql-exporter:-mysql_global_status_aborted_clients-mysql_global_status_aborted_connects-mysql_global_status_buffer_pool_pages-mysql_global_status_bytes_received-mysql_global_status_bytes_sent-mysql_global_status_commands_total-mysql_global_status_created_tmp_disk_tables-mysql_global_status_created_tmp_files-mysql_global_status_created_tmp_tables-mysql_global_status_handlers_total-mysql_global_status_innodb_log_waits-mysql_global_status_innodb_num_open_files-mysql_global_status_innodb_page_size-mysql_global_status_max_used_connections-mysql_global_status_open_files-mysql_global_status_open_table_definitions-mysql_global_status_open_tables-mysql_global_status_opened_files-mysql_global_status_opened_table_definitions-mysql_global_status_opened_tables-mysql_global_status_qcache_free_memory-mysql_global_status_qcache_hits-mysql_global_status_qcache_inserts-m
ysql_global_status_qcache_lowmem_prunes-mysql_global_status_qcache_not_cached-mysql_global_status_qcache_queries_in_cache-mysql_global_status_queries-mysql_global_status_questions-mysql_global_status_select_full_join-mysql_global_status_select_full_range_join-mysql_global_status_select_range-mysql_global_status_select_range_check-mysql_global_status_select_scan-mysql_global_status_slow_queries-mysql_global_status_sort_merge_passes-mysql_global_status_sort_range-mysql_global_status_sort_rows-mysql_global_status_sort_scan-mysql_global_status_table_locks_immediate-mysql_global_status_table_locks_waited-mysql_global_status_threads_cached-mysql_global_status_threads_connected-mysql_global_status_threads_created-mysql_global_status_threads_running-mysql_global_status_wsrep_flow_control_paused-mysql_global_status_wsrep_local_recv_queue-mysql_global_status_wsrep_local_state-mysql_global_status_wsrep_ready-mysql_global_variables_innodb_buffer_pool_size-mysql_global_variables_innodb_log_buffer_size-mysql_global_variables_key_buffer_size-mysql_global_variables_max_connections-mysql_global_variables_open_files_limit-mysql_global_variables_query_cache_size-mysql_global_variables_table_definition_cache-mysql_global_variables_table_open_cache-mysql_global_variables_thread_cache_size-mysql_global_variables_wsrep_desync-mysql_up-mysql_version_info-mysqld_exporter_build_infoprometheus-node-exporter:-node_arp_entries-node_bonding_active-node_bonding_slaves-node_boot_time_seconds-node_context_switches_total-node_cpu_seconds_total-node_disk_io_now-node_disk_io_time_seconds_total-node_disk_io_time_weighted_seconds_total-node_disk_read_bytes_total-node_disk_read_time_seconds_total-node_disk_reads_completed_total-node_disk_reads_merged_total-node_disk_write_time_seconds_total-node_disk_writes_completed_total-node_disk_writes_merged_total-node_disk_written_bytes_total-node_entropy_available_bits-node_exporter_build_info-node_filefd_allocated-node_filefd_maximum-node_filesystem_avail_bytes-node_filesystem_files-node_filesystem_files_free-node_filesystem_free_bytes-node_filesystem_readonly-node_filesystem_size_bytes-node_forks_total-node_hwmon_temp_celsius-node_hwmon_temp_crit_alarm_celsius-node_hwmon_temp_crit_celsius-node_hwmon_temp_crit_hyst_celsius-node_hwmon_temp_max_celsius-node_intr_total-node_load1-node_load15-node_load5-node_memory_Active_anon_bytes-node_memory_Active_bytes-node_memory_Active_file_bytes-node_memory_AnonHugePages_bytes-node_memory_AnonPages_bytes-node_memory_Bounce_bytes-node_memory_Buffers_bytes-node_memory_Cached_bytes-node_memory_CommitLimit_bytes-node_memory_Committed_AS_bytes-node_memory_DirectMap1G-node_memory_DirectMap2M_bytes-node_memory_DirectMap4k_bytes-node_memory_Dirty_bytes-node_memory_HardwareCorrupted_bytes-node_memory_HugePages_Free-node_memory_HugePages_Rsvd-node_memory_HugePages_Surp-node_memory_HugePages_Total-node_memory_Hugepagesize_bytes-node_memory_Inactive_anon_bytes-node_memory_Inactive_bytes-node_memory_Inactive_file_bytes-node_memory_KernelStack_bytes-node_memory_Mapped_bytes-node_memory_MemAvailable_bytes-node_memory_MemFree_bytes-node_memory_MemTotal_bytes-node_memory_Mlocked_bytes-node_memory_NFS_Unstable_bytes-node_memory_PageTables_bytes-node_memory_SReclaimable_bytes-node_memory_SUnreclaim_bytes-node_memory_Shmem_bytes-node_memory_Slab_bytes-node_memory_SwapCached_bytes-node_memory_SwapFree_bytes-node_memory_SwapTotal_bytes-node_memory_Unevictable_bytes-node_memory_VmallocChunk_bytes-node_memory_VmallocTotal_bytes-node_memory_VmallocUsed_bytes-node_memory_Writ
ebackTmp_bytes-node_memory_Writeback_bytes-node_netstat_TcpExt_TCPSynRetrans-node_netstat_Tcp_ActiveOpens-node_netstat_Tcp_AttemptFails-node_netstat_Tcp_CurrEstab-node_netstat_Tcp_EstabResets-node_netstat_Tcp_InCsumErrors-node_netstat_Tcp_InErrs-node_netstat_Tcp_InSegs-node_netstat_Tcp_MaxConn-node_netstat_Tcp_OutRsts-node_netstat_Tcp_OutSegs-node_netstat_Tcp_PassiveOpens-node_netstat_Tcp_RetransSegs-node_netstat_Udp_InCsumErrors-node_netstat_Udp_InDatagrams-node_netstat_Udp_InErrors-node_netstat_Udp_NoPorts-node_netstat_Udp_OutDatagrams-node_netstat_Udp_RcvbufErrors-node_netstat_Udp_SndbufErrors-node_network_mtu_bytes-node_network_receive_bytes_total-node_network_receive_compressed_total-node_network_receive_drop_total-node_network_receive_errs_total-node_network_receive_fifo_total-node_network_receive_frame_total-node_network_receive_multicast_total-node_network_receive_packets_total-node_network_transmit_bytes_total-node_network_transmit_carrier_total-node_network_transmit_colls_total-node_network_transmit_compressed_total-node_network_transmit_drop_total-node_network_transmit_errs_total-node_network_transmit_fifo_total-node_network_transmit_packets_total-node_network_up-node_nf_conntrack_entries-node_nf_conntrack_entries_limit-node_procs_blocked-node_procs_running-node_scrape_collector_duration_seconds-node_scrape_collector_success-node_sockstat_FRAG_inuse-node_sockstat_FRAG_memory-node_sockstat_RAW_inuse-node_sockstat_TCP_alloc-node_sockstat_TCP_inuse-node_sockstat_TCP_mem-node_sockstat_TCP_mem_bytes-node_sockstat_TCP_orphan-node_sockstat_TCP_tw-node_sockstat_UDPLITE_inuse-node_sockstat_UDP_inuse-node_sockstat_UDP_mem-node_sockstat_UDP_mem_bytes-node_sockstat_sockets_used-node_textfile_scrape_error-node_time_seconds-node_timex_estimated_error_seconds-node_timex_frequency_adjustment_ratio-node_timex_maxerror_seconds-node_timex_offset_seconds-node_timex_sync_status-node_uname_infoprometheus-rabbitmq-exporter:# Deprecated since MOSK 25.1, use rabbitmq-prometheus-plugin 
instead-rabbitmq_channels-rabbitmq_connections-rabbitmq_consumers-rabbitmq_exchanges-rabbitmq_exporter_build_info-rabbitmq_fd_available-rabbitmq_fd_used-rabbitmq_node_disk_free-rabbitmq_node_disk_free_alarm-rabbitmq_node_mem_alarm-rabbitmq_node_mem_used-rabbitmq_partitions-rabbitmq_queue_messages_global-rabbitmq_queue_messages_ready_global-rabbitmq_queue_messages_unacknowledged_global-rabbitmq_queues-rabbitmq_sockets_available-rabbitmq_sockets_used-rabbitmq_up-rabbitmq_uptime-rabbitmq_version_infoprometheus-relay:[]prometheus-server:-prometheus_build_info-prometheus_config_last_reload_success_timestamp_seconds-prometheus_config_last_reload_successful-prometheus_engine_query_duration_seconds-prometheus_engine_query_duration_seconds_sum-prometheus_http_request_duration_seconds_count-prometheus_notifications_alertmanagers_discovered-prometheus_notifications_errors_total-prometheus_notifications_queue_capacity-prometheus_notifications_queue_length-prometheus_notifications_sent_total-prometheus_rule_evaluation_failures_total-prometheus_target_interval_length_seconds-prometheus_target_interval_length_seconds_count-prometheus_target_scrapes_sample_duplicate_timestamp_total-prometheus_tsdb_blocks_loaded-prometheus_tsdb_compaction_chunk_range_seconds_count-prometheus_tsdb_compaction_chunk_range_seconds_sum-prometheus_tsdb_compaction_chunk_samples_count-prometheus_tsdb_compaction_chunk_samples_sum-prometheus_tsdb_compaction_chunk_size_bytes_sum-prometheus_tsdb_compaction_duration_seconds_bucket-prometheus_tsdb_compaction_duration_seconds_count-prometheus_tsdb_compaction_duration_seconds_sum-prometheus_tsdb_compactions_failed_total-prometheus_tsdb_compactions_total-prometheus_tsdb_compactions_triggered_total-prometheus_tsdb_head_active_appenders-prometheus_tsdb_head_chunks-prometheus_tsdb_head_chunks_created_total-prometheus_tsdb_head_chunks_removed_total-prometheus_tsdb_head_gc_duration_seconds_sum-prometheus_tsdb_head_samples_appended_total-prometheus_tsdb_head_series-prometheus_tsdb_head_series_created_total-prometheus_tsdb_head_series_removed_total-prometheus_tsdb_reloads_failures_total-prometheus_tsdb_reloads_total-prometheus_tsdb_storage_blocks_bytes-prometheus_tsdb_wal_corruptions_total-prometheus_tsdb_wal_fsync_duration_seconds_count-prometheus_tsdb_wal_fsync_duration_seconds_sum-prometheus_tsdb_wal_truncations_failed_total-prometheus_tsdb_wal_truncations_totalrabbitmq-operator-metrics:-rest_client_requests_totalrabbitmq-prometheus-plugin:# Since MOSK 25.1 to replace 
prometheus-rabbitmq-exporter-erlang_vm_allocators-erlang_vm_dist_node_queue_size_bytes-erlang_vm_dist_node_state-erlang_vm_dist_recv_bytes-erlang_vm_dist_recv_cnt-erlang_vm_dist_send_bytes-erlang_vm_dist_send_cnt-erlang_vm_ets_limit-erlang_vm_memory_bytes_total-erlang_vm_memory_dets_tables-erlang_vm_memory_ets_tables-erlang_vm_memory_system_bytes_total-erlang_vm_port_count-erlang_vm_port_limit-erlang_vm_process_count-erlang_vm_process_limit-erlang_vm_statistics_bytes_output_total-erlang_vm_statistics_bytes_received_total-erlang_vm_statistics_context_switches-erlang_vm_statistics_dirty_cpu_run_queue_length-erlang_vm_statistics_dirty_io_run_queue_length-erlang_vm_statistics_garbage_collection_bytes_reclaimed-erlang_vm_statistics_garbage_collection_number_of_gcs-erlang_vm_statistics_reductions_total-erlang_vm_statistics_run_queues_length-erlang_vm_statistics_run_queues_length_total-erlang_vm_statistics_runtime_milliseconds-rabbitmq_alarms_free_disk_space_watermark-rabbitmq_alarms_memory_used_watermark-rabbitmq_build_info-rabbitmq_channels-rabbitmq_channels_closed_total-rabbitmq_channels_opened_total-rabbitmq_connections-rabbitmq_connections_closed_total-rabbitmq_connections_opened_total-rabbitmq_consumers-rabbitmq_disk_space_available_bytes-rabbitmq_erlang_uptime_seconds-rabbitmq_global_messages_acknowledged_total-rabbitmq_global_messages_confirmed_total-rabbitmq_global_messages_delivered_consume_auto_ack_total-rabbitmq_global_messages_delivered_consume_manual_ack_total-rabbitmq_global_messages_delivered_get_auto_ack_total-rabbitmq_global_messages_delivered_get_manual_ack_total-rabbitmq_global_messages_get_empty_total-rabbitmq_global_messages_received_confirm_total-rabbitmq_global_messages_received_total-rabbitmq_global_messages_redelivered_total-rabbitmq_global_messages_routed_total-rabbitmq_global_messages_unroutable_dropped_total-rabbitmq_global_messages_unroutable_returned_total-rabbitmq_global_publishers-rabbitmq_identity_info-rabbitmq_process_max_fds-rabbitmq_process_max_tcp_sockets-rabbitmq_process_open_fds-rabbitmq_process_open_tcp_sockets-rabbitmq_process_resident_memory_bytes-rabbitmq_queue_messages-rabbitmq_queue_messages_ready-rabbitmq_queue_messages_unacked-rabbitmq_queues-rabbitmq_queues_created_total-rabbitmq_queues_declared_total-rabbitmq_queues_deleted_total-rabbitmq_resident_memory_limit_bytes-rabbitmq_unreachable_cluster_peers_countsf-notifier:-sf_auth_ok-sf_error_count_created-sf_error_count_total-sf_request_count_created-sf_request_count_totaltelegraf-docker-swarm:-docker_n_containers-docker_n_containers_paused-docker_n_containers_running-docker_n_containers_stopped-docker_swarm_node_ready# Removed in MOSK 
25.1-docker_swarm_tasks_desired-docker_swarm_tasks_running-internal_agent_gather_errorstelemeter-client:-federate_errors-federate_filtered_samples-federate_samplestelemeter-server:-telemeter_cleanups_total-telemeter_partitions-telemeter_samples_totaltf-cassandra-jmx-exporter:-cassandra_cache_entries-cassandra_cache_estimated_size_bytes-cassandra_cache_hits_total-cassandra_cache_requests_total-cassandra_client_authentication_failures_total-cassandra_client_native_connections-cassandra_client_request_failures_total-cassandra_client_request_latency_seconds_count-cassandra_client_request_latency_seconds_sum-cassandra_client_request_timeouts_total-cassandra_client_request_unavailable_exceptions_total-cassandra_client_request_view_write_latency_seconds-cassandra_commit_log_pending_tasks-cassandra_compaction_bytes_compacted_total-cassandra_compaction_completed_total-cassandra_dropped_messages_total-cassandra_endpoint_connection_timeouts_total-cassandra_storage_exceptions_total-cassandra_storage_hints_total-cassandra_storage_load_bytes-cassandra_table_estimated_pending_compactions-cassandra_table_repaired_ratio-cassandra_table_sstables_per_read_count-cassandra_table_tombstones_scanned-cassandra_thread_pool_active_tasks-cassandra_thread_pool_blocked_taskstf-control:-tf_controller_sessions-tf_controller_uptf-kafka-jmx:-jmx_exporter_build_info-kafka_controller_controllerstats_count-kafka_controller_controllerstats_oneminuterate-kafka_controller_kafkacontroller_value-kafka_log_log_value-kafka_network_processor_value-kafka_network_requestmetrics_99thpercentile-kafka_network_requestmetrics_mean-kafka_network_requestmetrics_oneminuterate-kafka_network_socketserver_value-kafka_server_brokertopicmetrics_count-kafka_server_brokertopicmetrics_oneminuterate-kafka_server_delayedoperationpurgatory_value-kafka_server_kafkarequesthandlerpool_oneminuterate-kafka_server_replicamanager_oneminuterate-kafka_server_replicamanager_valuetf-operator:-tf_operator_info# Since MOSK 
23.3tf-redis:-redis_commands_duration_seconds_total-redis_commands_processed_total-redis_commands_total-redis_connected_clients-redis_connected_slaves-redis_db_keys-redis_db_keys_expiring-redis_evicted_keys_total-redis_expired_keys_total-redis_exporter_build_info-redis_instance_info-redis_keyspace_hits_total-redis_keyspace_misses_total-redis_memory_max_bytes-redis_memory_used_bytes-redis_net_input_bytes_total-redis_net_output_bytes_total-redis_rejected_connections_total-redis_slave_info-redis_up-redis_uptime_in_secondstf-vrouter:-tf_vrouter_ds_discard-tf_vrouter_ds_flow_action_drop-tf_vrouter_ds_flow_queue_limit_exceeded-tf_vrouter_ds_flow_table_full-tf_vrouter_ds_frag_err-tf_vrouter_ds_invalid_if-tf_vrouter_ds_invalid_label-tf_vrouter_ds_invalid_nh-tf_vrouter_flow_active-tf_vrouter_flow_aged-tf_vrouter_flow_created-tf_vrouter_lls_session_info-tf_vrouter_up-tf_vrouter_xmpp_connection_statetf-zookeeper:-approximate_data_size-bytes_received_count-commit_count-connection_drop_count-connection_rejected-connection_request_count-dead_watchers_cleaner_latency_sum-dead_watchers_cleared-dead_watchers_queued-digest_mismatches_count-election_time_sum-ephemerals_count-follower_sync_time_count-follower_sync_time_sum-fsynctime_sum-global_sessions-jvm_classes_loaded-jvm_gc_collection_seconds_sum-jvm_info-jvm_memory_pool_bytes_used-jvm_threads_current-jvm_threads_deadlocked-jvm_threads_state-leader_uptime-learner_commit_received_count-learner_proposal_received_count-learners-local_sessions-max_file_descriptor_count-node_changed_watch_count_sum-node_children_watch_count_sum-node_created_watch_count_sum-node_deleted_watch_count_sum-num_alive_connections-om_commit_process_time_ms_sum-om_proposal_process_time_ms_sum-open_file_descriptor_count-outstanding_requests-packets_received-packets_sent-pending_syncs-proposal_count-quorum_size-response_packet_cache_hits-response_packet_cache_misses-response_packet_get_children_cache_hits-response_packet_get_children_cache_misses-revalidate_count-snapshottime_sum-stale_sessions_expired-synced_followers-synced_non_voting_followers-synced_observers-unrecoverable_error_count-uptime-watch_count-znode_countucp-kv:[]
Note
The following Prometheus metrics are removed from the list of
white-listed scrape jobs in Container Cloud 2.25.0 (Cluster releases
17.0.0 and 16.0.0):
The prometheus-coredns job from the go-collector-metrics
and process-collector-metrics groups
You can add necessary metrics that are dropped to this white list as described
below. It is also possible to disable the filtering feature. However, Mirantis
does not recommend disabling the feature to prevent direct impact on the
Prometheus index size, which affects query speed. For clusters with extended
retention period, performance degradation will be the most noticeable.
You can expand the default white list of Prometheus
metrics using the prometheusServer.metricsFiltering.extraMetricsInclude
parameter to enable metrics that are dropped by default. For the
parameter description, see Prometheus metrics filtering. For configuration
steps, see Configure StackLight.
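For illustration, a values override along the following lines re-enables a dropped metric. Both the exact structure of extraMetricsInclude and the metric grouping are described in Prometheus metrics filtering, so treat the layout and names below as placeholders:
prometheusServer:
  metricsFiltering:
    extraMetricsInclude:
      prometheus-coredns:
        - go_goroutines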
Mirantis does not recommend disabling metrics filtering to prevent direct
impact on the Prometheus index size, which affects query speed. In clusters
with an extended retention period, performance degradation will be the most
noticeable. Therefore, the best option is to keep the feature enabled and add
the required dropped metrics to the white list as described in
Add dropped metrics to the white list.
If disabling of metrics filtering is absolutely necessary, set the
prometheusServer.metricsFiltering.enabled parameter to false:
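A minimal sketch of the corresponding StackLight values override, following the parameter path given above:
prometheusServer:
  metricsFiltering:
    enabled: false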
Available since MCC 2.27.0 (Cluster releases 17.2.0 and 16.2.0)
The StackLight telegraf-ds-smart exporter uses the
S.M.A.R.T. plugin to
obtain detailed disk information and export it as metrics on a
MOSK cluster. S.M.A.R.T. is a system commonly used across vendors that
exposes drive health and performance data as attributes, though attribute
names can differ between vendors. Each attribute contains the following
values:
Raw value
The actual value of the attribute at the moment of reporting. Units may
differ across vendors.
Current value
Health valuation where values can range from 1 to 253 (1
represents the worst case and 253 represents the best one). Depending
on the manufacturer, a value of 100 or 200 will often be selected
as the normal value.
Worst value
The worst value ever observed as a current one for a particular device.
Threshold value
Lower threshold for the current value. If the current value drops below the
lower threshold, it requires attention.
The following table provides examples for alert rules based on S.M.A.R.T.
metrics. These examples may not work for all clusters depending on vendor or
disk types.
Caution
Before creating alert rules, manually test these expressions to
verify whether they are valid for the cluster. You can also implement any
other alerts based on S.M.A.R.T. metrics.
To create custom alert rules in StackLight, use the customAlerts
parameter described in Alert configuration; see the example after the
following table.
Expression
Description
expr:smart_device_exit_status>0
Alerts when a device exit status signals potential issues.
expr:smart_device_health_ok==0
Indicates disk health failure.
expr:smart_attribute_threshold>=smart_attribute
Targets any S.M.A.R.T. attribute reaching its predefined threshold,
indicating a potential risk or imminent failure of the disk. Utilizing
this alert might eliminate the need for more specific attribute alerts
by relying on the vendor’s established thresholds, streamlining the
monitoring process. Implementing inhibition rules may be necessary to
manage overlaps with other alerts effectively.
expr:smart_device_temp_c>60
Is triggered when disk temperature exceeds 60°C, indicating potential
overheating issues.
expr:increase(smart_device_udma_crc_errors[2m])>0
Identifies an increase in UDMA CRC errors, indicating data transmission
issues between the disk and controller.
expr:increase(smart_device_read_error_rate[2m])>0
Is triggered during a noticeable increase in the rate of read errors on the
disk. This is a strong indicator of issues with the disk surface or
read/write heads that can affect data integrity and accessibility.
expr:increase(smart_device_spin_retry_count[2m])>0
Is triggered when the disk experiences an increase in attempts to spin up
to its operational speed, indicating potential issues with the disk motor,
bearings, or power supply, which can lead to drive failure.
expr:increase(smart_device_uncorrectable_sector_count[2m])>0
Is triggered during an increase in the number of disk sectors that cannot
be corrected by the error correction algorithms of the drive, pointing
towards serious disk surface or read/write head issues.
expr:increase(smart_device_pending_sector_count[2m])>0
Is triggered on a rise in sectors that are marked as pending for remapping
due to read errors. Persistent increases can indicate deteriorating
disk health and impending failure.
expr:increase(smart_device_end_to_end_error[2m])>0
Detects an upsurge in errors during the process of data transmission
from the host to the disk and vice versa, highlighting potential issues
in data integrity during transfer operations.
expr:increase(smart_device_reallocated_sectors_count[2m])>0
Is triggered during an increase in sectors that have been reallocated due to
being deemed defective. A rising count is a critical sign of ongoing wear
and tear, or damage to the disk surface.
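For instance, one of the expressions from the table above can be wrapped into a custom alert similar to the following sketch. The alert name, duration, and labels are illustrative, and the exact customAlerts structure is described in Alert configuration:
customAlerts:
- alert: SmartDeviceHealthFailed
  expr: smart_device_health_ok == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: S.M.A.R.T. reports a failed disk health check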
The following table describes S.M.A.R.T. metrics provided by StackLight that
you can use for creating alert rules depending on your cluster requirements:
Metric
Description
smart_attribute
Reports current S.M.A.R.T. attribute values with labels for detailed context.
smart_attribute_exit_status
Indicates the fetching status of individual attributes. A non-zero code
indicates monitoring issues.
smart_attribute_raw_value
Reports raw S.M.A.R.T. attribute values with labels for detailed context.
smart_attribute_threshold
Reports S.M.A.R.T. attribute threshold values with labels for detailed context.
smart_attribute_worst
Reports the worst recorded values of S.M.A.R.T. attributes with labels
for detailed context.
smart_device_command_timeout
Counts timeouts when a drive fails to respond to a command, indicating
responsiveness issues.
smart_device_exit_status
Reflects the overall device status post-checks, where values other than
0 indicate issues.
smart_device_health_ok
Indicates overall device health, where values other than 1 indicate
issues. Relates to the --health attribute of the smartctl
tool.
The following table describes metrics derived from various S.M.A.R.T.
attributes that are also exposed through the smart_attribute* metrics above,
but with a different value representation, such as unified units or counter
semantics. While vendors may name attributes differently, the following
metrics are standardized across vendors. Depending on the disk or vendor type,
a cluster may miss some of the following metrics or have extra ones.
Metric
Description
smart_device_end_to_end_error
Monitors data transmission errors, where an increase suggests potential
transfer issues.
smart_device_pending_sector_count
Counts sectors awaiting remapping due to unrecoverable errors, with decreases
over time indicating successful remapping.
smart_device_read_error_rate
Tracks errors occurring during disk data reads.
smart_device_reallocated_sectors_count
Counts defective sectors that have been remapped, with increases indicating
drive degradation.
smart_device_seek_error_rate
Measures the error frequency of the drive positioning mechanism, with
high values indicating mechanical issues.
smart_device_spin_retry_count
Tracks the drive attempts to spin up to operational speed, with increases
indicating mechanical issues.
smart_device_temp_c
Reports the drive temperature in Celsius.
smart_device_udma_crc_errors
Counts errors in data communication between the drive and host.
smart_device_uncorrectable_errors
Records total uncorrectable read/write errors.
smart_device_uncorrectable_sector_count
Counts sectors that cannot be corrected indicating potentially damaged sectors.
On an existing managed cluster, when you add a worker machine to replace the
one that carries the StackLight node label, you must migrate the label to the
new machine and manually remove the StackLight Pods from the old machine from
which you remove the label.
Caution
In this procedure, replace <machine-name> with the name of
the machine from which you remove the StackLight node label.
To deschedule StackLight Pods from a worker machine:
Remove the stacklight=enabled node label from the spec section of
the target Machine object.
Connect to the required cluster using its kubeconfig.
Verify that the stacklight=enabled label was removed successfully:
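For example, using standard kubectl syntax against the cluster kubeconfig:
kubectl get nodes -l stacklight=enabled
The output should no longer include the node of the machine from which you removed the label.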
The Tungsten Fabric cluster update is performed during the
MOSK cluster release update.
The control plane update is performed automatically. To complete
the data plane update, you will need to manually remove the vRouter pods.
See Cluster update for details.
MOSK enables you to perform the automatic TF
data backup using the tf-dbbackup-job cron job. Also, you can configure
a remote NFS storage for TF data backups. For configuration details,
refer to the Tungsten Fabric database section in Reference Architecture.
This section provides instructions on how to back up the TF data manually
if needed.
This section describes how to restore the Cassandra and ZooKeeper databases
from the backups created either automatically or manually as described in
Back up TF databases.
Caution
The data backup must be consistent across all systems
because the state of the Tungsten Fabric databases is associated with
other system databases, such as OpenStack databases.
When restoring the data, MOSK stops the
Tungsten Fabric services and recreates the database backends that
include Cassandra, Kafka, and ZooKeeper.
Note
The automated restoration process relies on automated database
backups configured by the Tungsten Fabric Operator. The Tungsten Fabric
data is restored from the backup type specified in the tf-dbBackup
section of the Tungsten Fabric Operator custom resource, or the default
pvc type if not specified. For the configuration details, refer to
Periodic Tungsten Fabric database backups.
Optional. Specify the name of the backup to be used for the
dbDumpName parameter. By default, the latest db-dump is used.
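For illustration only, assuming the v2 TFOperator layout used elsewhere in this guide (spec.features), a restore request with an explicit dump name might look similar to the following. Verify the exact field names against Periodic Tungsten Fabric database backups before applying it:
spec:
  features:
    dbRestoreMode:
      enabled: true
      dbDumpName: <DB-DUMP-NAME>
After the restoration is triggered, you can track its progress in the Status and Events fields of the tf-dbrestore object, as in the following example output: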
...
Status:
  Health:  Ready
Events:
  Type    Reason                       Age                From          Message
  ----    ------                       ----               ----          -------
  Normal  TfDaemonSetsDeleted          18m (x4 over 18m)  tf-dbrestore  TF DaemonSets were deleted
  Normal  zookeeperOperatorScaledDown  18m                tf-dbrestore  zookeeper operator scaled to 0
  Normal  zookeeperStsScaledDown       18m                tf-dbrestore  tf-zookeeper statefulset scaled to 0
  Normal  cassandraOperatorScaledDown  17m                tf-dbrestore  cassandra operator scaled to 0
  Normal  cassandraStsScaledDown       17m                tf-dbrestore  tf-cassandra-config-dc1-rack1 statefulset scaled to 0
  Normal  cassandraStsPodsDeleted      16m                tf-dbrestore  tf-cassandra-config-dc1-rack1 statefulset pods deleted
  Normal  cassandraPVCDeleted          16m                tf-dbrestore  tf-cassandra-config-dc1-rack1 PVC deleted
  Normal  zookeeperStsPodsDeleted      16m                tf-dbrestore  tf-zookeeper statefulset pods deleted
  Normal  zookeeperPVCDeleted          16m                tf-dbrestore  tf-zookeeper PVC deleted
  Normal  kafkaOperatorScaledDown      16m                tf-dbrestore  kafka operator scaled to 0
  Normal  kafkaStsScaledDown           16m                tf-dbrestore  tf-kafka statefulset scaled to 0
  Normal  kafkaStsPodsDeleted          16m                tf-dbrestore  tf-kafka statefulset pods deleted
  Normal  AllOperatorsStopped          16m                tf-dbrestore  All 3rd party operator's stopped
  Normal  CassandraOperatorScaledUP    16m                tf-dbrestore  CassandraOperator scaled to 1
  Normal  CassandraStsScaledUP         16m                tf-dbrestore  Cassandra statefulset scaled to 3
  Normal  CassandraPodsActive          12m                tf-dbrestore  Cassandra pods active
  Normal  ZookeeperOperatorScaledUP    12m                tf-dbrestore  Zookeeper Operator scaled to 1
  Normal  ZookeeperStsScaledUP         12m                tf-dbrestore  Zookeeper Operator scaled to 3
  Normal  ZookeeperPodsActive          12m                tf-dbrestore  Zookeeper pods active
  Normal  DBRestoreFinished            12m                tf-dbrestore  TF db restore finished
  Normal  TFRestoreDisabled            12m                tf-dbrestore  TF Restore disabled
Note
If the restoration was completed several hours ago, events may not be
shown with kubectl describe. If so, verify the Status
field and get events using the following command:
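A command along the following lines can list the restore-related events. The object name below is a placeholder that depends on your deployment:
kubectl -n tf get events --field-selector involvedObject.name=<TFDBRESTORE-OBJECT-NAME>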
After the job completes, it can take around 15 minutes to stabilize
tf-control services. If some pods are still in the CrashLoopBackOff
status, restart these pods manually one by one:
List the tf-control pods:
kubectl -n tf get pods -l app=tf-control
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected only to the tf-control
pod that will be restarted.
Restart the tf-control pods sequentially:
kubectl -n tf delete pod tf-control-<hash>
When the restoration completes, MOSK automatically sets
dbRestoreMode to false in the Tungsten Fabric Operator custom
resource.
Delete the tfdbrestore object from the cluster to be able to perform
the next restoration:
Terminate the configuration and analytics services and stop the database
changes associated with northbound APIs on all systems.
Note
The Tungsten Fabric Operator watches related resources and keeps
them updated and healthy. If any resource is deleted or changed, the
Tungsten Fabric Operator automatically runs reconciling to create
a resource or change the configuration back to the required state.
Therefore, the Tungsten Fabric Operator must not be running during
the data restoration.
Scale the tungstenfabric-operator deployment to 0 replicas:
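For example, assuming the operator runs as a Deployment in the tf namespace:
kubectl -n tf scale deployment tungstenfabric-operator --replicas=0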
Do not use the Tungsten Fabric API container used for the backup
file creation. In this case, a session with the Cassandra and ZooKeeper
databases is created once the Tungsten Fabric API service starts but
the Tungsten Fabric configuration services are stopped. The tools for
the data backup and restore are available only in the Tungsten Fabric
configuration API container. Using the steps below, start a blind
container based on the config-api image.
Deploy a pod using the configuration API image obtained in the first
step:
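A minimal sketch of such a pod definition. The pod name is hypothetical, the image is the one obtained in the first step, and the command simply keeps the container running so that you can exec into it:
apiVersion: v1
kind: Pod
metadata:
  name: tf-config-api-tools
  namespace: tf
spec:
  containers:
  - name: config-api
    image: <TF_CONFIG_API_IMAGE>
    command: ["sleep", "infinity"]
    env:
    - name: CONFIGDB_CASSANDRA_DRIVER
      value: <DRIVER>   # set to cql if your deployment uses the cql driver, see the note below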
Note
Since MOSK 24.1, if your deployment uses the
cql Cassandra driver, update the value of the
CONFIGDB_CASSANDRA_DRIVER environment variable to cql.
To avoid network downtime, do not restart all pods
simultaneously.
List the tf-control pods:
kubectl -n tf get pods -l app=tf-control
Restart the tf-control pods one by one.
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected only to the tf-control
pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
Convert v1alpha1 TFOperator custom resource to v2
Available since MOSK 24.2
In 24.1, MOSK introduces the API v2 for Tungsten Fabric.
Since 24.2, Tungsten Fabric API v2 becomes default for new deployments and
includes the ability to convert the existing v1alpha1 TFOperator to v2.
During the update to the 24.3 series, the old Tungsten Fabric cluster
configuration API v1alpha1 is automatically converted and replaced with the v2
version. Since MOSK 25.1, Tungsten Fabric API v1alpha1 is no longer present in
the product.
During cluster update to MOSK 24.3, the automatic
conversion of the TFOperator v1alpha1 to the v2 version takes place.
Therefore, there is no need to perform any manual conversion.
Warning
Since MOSK 24.3, start using the v2
TFOperator custom resource for any updates.
The v1alpha1 TFOperator custom resource remains in the cluster
but is no longer reconciled and will be automatically removed in
MOSK 25.1.
MOSK 24.2
Caution
Conversion of TFOperator causes recreation of the Tungsten
Fabric service pods. Therefore, Mirantis recommends performing
the conversion during a maintenance window.
Update the tungstenfabric-operator Helm release values in
the corresponding ClusterRelease resource:
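Based on the convertToV2 flag shown later in this section for the reverse operation, the values update is expected to look similar to the following:
values:
  operator:
    convertToV2: true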
When the chart changes apply, the tungstenfabric-operator-convert-to-v2
job performs the following:
Saves the existing v1alpha1 TFOperator specification to the
tfoperator-v1alpha1-copy ConfigMap
Creates the v2 TFOperator custom resource
Removes the redundant v1alpha1 TFOperator custom resource
While the conversion is being performed, monitor the recreation of the
Tungsten Fabric service pods. Verify that TFOperator v2 has been
created successfully:
kubectl -n tf describe tf.mirantis.com openstack-tf
Reverse the conversion of v1alpha1 TFOperator to v2
MOSK 24.3
Caution
During the reverse conversion, the Tungsten Fabric service
pods will get updated. Therefore, Mirantis recommends performing
the procedure during the maintenance window.
Caution
Reverse conversion is not possible since
MOSK 25.1 because v1alpha1 TFOperator is removed
from the product.
Update Helm release values in the corresponding ClusterRelease
resource:
values:
  operator:
    convertToV2: false
When the controller starts, it should use the v1alpha1 TFOperator
custom resource for reconciliation.
MOSK 24.2
Caution
During the reverse conversion, the Tungsten Fabric service
pods will get recreated. Therefore, Mirantis recommends performing
the conversion during a maintenance window.
Update the TFOperator HelmBundle:
values:
  operator:
    convertToV2: false
Manually delete the v2 TFOperator custom resource:
kubectl -n tf delete tf.mirantis.com openstack-tf
Manually create the v1alpha1 TFOperator custom resource using data from
the tfoperator-v1alpha1-copy ConfigMap.
If one of the Tungsten Fabric (TF) controller nodes has failed, follow this
procedure to replace it with a new node.
To replace a TF controller node:
Note
Pods that belong to the failed node can stay in the Terminating
state.
If a failed node has tfconfigdb=enabled or tfanalyticsdb=enabled,
or both labels assigned to it, get and note down the IP addresses of
the Cassandra pods that run on the node to be replaced:
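For example, you can list the pods on that node with a simple filter; adjust the filter if your Cassandra pods are named differently:
kubectl -n tf get pods -o wide | grep cassandra | grep <NODE-NAME>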
Delete the failed TF controller node from the Kubernetes cluster using
the Mirantis Container Cloud web UI or CLI. For the procedure, refer
to Delete a cluster machine.
Note
Once the failed node has been removed from the cluster, all pods
that hanged in the Terminating state should be removed.
Remove the control (with the BGP router), config, and config-db
nodes from the TF configuration database:
Obtain the list of config nodes:
curl -s -H "X-Auth-Token: $(openstack token issue | awk '/ id / {print $4}')" http://tf-config-api.tf.svc:8082/config-nodes | jq
Remove the failed config node:
curl -s -H "X-Auth-Token: $(openstack token issue | awk '/ id / {print $4}')" -X "DELETE" <LINK_FROM_HREF_WITH_NODE_UUID>
Obtain the list of config-database and control nodes:
curl -s -H "X-Auth-Token: $(openstack token issue | awk '/ id / {print $4}')" http://tf-config-api.tf.svc:8082/config-database-nodes | jq
curl -s -H "X-Auth-Token: $(openstack token issue | awk '/ id / {print $4}')" http://tf-config-api.tf.svc:8082/control-nodes | jq
Identify the config-database and control nodes to be deleted
using the href field from the system output from the previous step.
Delete the nodes as required:
curl -s -H "X-Auth-Token: $(openstack token issue | awk '/ id / {print $4}')" -X "DELETE" <LINK_FROM_HREF_WITH_NODE_UUID>
Hosts the TF control plane services such as database,
messaging, api, svc, config.
tfconfig=enabled
tfcontrol=enabled
tfwebui=enabled
tfconfigdb=enabled
3
TF analytics
Hosts the TF analytics services.
tfanalytics=enabled
tfanalyticsdb=enabled
3
TF vRouter
Hosts the TF vRouter module and vRouter Agent.
tfvrouter=enabled
Varies
TF vRouter DPDK Technical Preview
Hosts the TF vRouter Agent in DPDK mode.
tfvrouter-dpdk=enabled
Varies
Note
TF supports only Kubernetes OpenStack workloads.
Therefore, you should label OpenStack compute nodes with
the tfvrouter=enabled label.
Note
Do not specify the openvswitch=enabled label for the
OpenStack deployments with TF as a networking backend.
Once you label the new Kubernetes node, new pods start scheduling on the
node. However, pods that use Persistent Volume Claims are stuck in the
Pending state because their volume claims stay bound to the local volumes
from the deleted node. To resolve the issue:
Delete the PersistentVolumeClaim (PVC) bound to the local volume
from the failed node:
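For example, assuming the PVC resides in the tf namespace:
kubectl -n tf delete pvc <PVC-NAME>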
Clustered services that use PVC, such as Cassandra, Kafka,
and ZooKeeper, start the replication process when new pods move
to the Ready state.
Check the PersistentVolumes (PVs) claimed by the deleted PVCs.
If a PV is stuck in the Released state, delete it manually:
kubectl -n tf delete pv <PV>
Delete the pod that is using the removed PVC:
kubectl -n tf delete pod <POD-NAME>
Verify that the pods have successfully started on the replaced controller
node and stay in the Ready state.
If the failed controller node had tfconfigdb=enabled or
tfanalyticsdb=enabled, or both labels assigned to it, remove old
Cassandra hosts from the config and analytics cluster configuration:
Get the host ID of the removed Cassandra host using the pod IP addresses
saved during Step 1:
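One way to obtain it is to run nodetool status inside one of the remaining Cassandra pods and match the saved IP addresses; the pod name below is a placeholder:
kubectl -n tf exec -it <CASSANDRA-POD> -- nodetool status | grep <POD-IP>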
Tungsten Fabric vRouter collocates with the OpenStack compute node. Therefore,
to delete the vRouter node from your cluster, follow the node deletion
procedure in Delete a compute node.
Additionally, you need to remove the vhost0 OpenStack port and
the Node object of the deleted node from the Tungsten Fabric database:
Log in to the keystone-client pod in the openstack namespace through
the command line.
Obtain the OpenStack token required to authenticate with the Tungsten Fabric
API service:
TOKEN=$(openstack token issue | awk '/ id / {print $4}')
Obtain the list of vRouter nodes to retrieve the link for the deleted node:
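Following the pattern of the curl calls earlier in this guide, the list can presumably be retrieved from the Tungsten Fabric configuration API. Treat the resource path as an assumption and adjust it if it differs in your environment:
curl -s -H "X-Auth-Token: ${TOKEN}" http://tf-config-api.tf.svc:8082/virtual-routers | jq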
This section explains how to use the tungsten-pytest test set to
verify your Tungsten Fabric (TF) deployment. The tungsten-pytest test
set is part of the TF operator and allows for prompt verification of the
Kubernetes objects related to TF and basic verification of the TF services.
To verify the TF deployment using tungsten-pytest:
Enable the tf-test controller in the TF Operator resource for the
Operator to start the pod with the test set:
This section describes a simple load balancing configuration. As an example, we
use a topology for balancing the traffic between two HTTP servers listening on
port 80. The example topology includes the following parameters:
Backend servers 10.10.0.4 and 10.10.0.3 in the private-subnet
subnet run an HTTP application that listens on the TCP port 80.
The public-subnet subnet is a shared external subnet created by the cloud
operator and accessible from the Internet.
The created load balancer is accessible through an IP address from the public
subnet that will distribute web requests between the backend servers.
By default, MOSK uses the Octavia Tungsten
Fabric load balancing. Since 23.1, you can explicitly specify amphorav2
as a provider when creating a load balancer using the provider
argument:
openstack loadbalancer create --provider amphorav2
Octavia Amphora load balancing is available as a Technology Preview
feature. For details, refer to Octavia Amphora load balancing.
MOSK enables you to activate automatic Tungsten Fabric
database repairs using the tf-dbrepair-job CronJob. Running this repair
job is essential for maintaining the health and consistency of a Cassandra
cluster.
Below are scenarios where running the repair job is recommended:
Node recovery
If a node has been down for an extended period or replaced, run a repair
after bringing it back online to ensure it has all the latest data
Major changes
After significant changes, such as cluster update or schema modifications,
run a repair to ensure data consistency
Cluster modifications
When nodes are added or removed, data may not immediately be consistent
across replicas. Run a repair to reconcile the data across the cluster
To enable the repair job:
Edit the TFOperator custom resource to enable the database repair job:
spec:
  features:
    dbRepair:
      enabled: true
Optional. Specify the job schedule. By default, the job will run weekly.
For example, to schedule the job to run daily:
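A sketch under the assumption that the schedule is expressed as a standard cron string next to the enabled flag; verify the exact field name against the TFOperator custom resource reference:
spec:
  features:
    dbRepair:
      enabled: true
      schedule: "0 0 * * *"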
The contrail-tools container provides a centralized location for all
available Tungsten Fabric tools and CLI commands. The container includes such
utilities as vif, flow, nh, and other tools
to debug network issues. MOSK deploys contrail-tools
using the Tungsten Fabric Operator through the TFOperator custom resource.
To enable the Tungsten Fabric contrail-tools Deployment:
Enable the tools Deployment in the TFOperator resource for the
operator to start the Pods with utilities to debug Tungsten Fabric on
nodes with the tfvrouter=enabled label:
Use the labels section to specify target nodes for
the contrail-tools Deployment. If the labels section
is not specified, the tf-tool-ctools-<xxxxx> Pods
will be scheduled on all available nodes in the current Deployment.
Wait until the tf-tool-ctools-<xxxxx> Pods are ready in the tf
namespace.
Note
The <xxxxx> string in a Pod name consists of random
alpha-numeric symbols generated by Kubernetes to differentiate the
tf-tool-ctools Pods.
Use the interactive shell in the tf-tool-ctools-<xxxxx> Pod to debug
the current Deployment or run commands through kubectl, for example:
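For instance, to list vRouter interfaces from one of the tool Pods (the Pod name suffix is a placeholder):
kubectl -n tf exec -it tf-tool-ctools-<xxxxx> -- vif --list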
The tf-api-cli container provides access to the Tungsten Fabric
API through the command-line interface (CLI). See the
contrail-api-cli documentation
for details.
Note
The tf-api-cli tool was initially called contrail-api-cli.
To enable the Tungsten Fabric API CLI Deployment:
Enable the tf-cli Deployment in the TFOperator custom resource
to start the Pod with utilities to access the Tungsten Fabric API CLI:
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users are
fully responsible for ensuring the compatibility of such custom components
with the latest supported Ubuntu version.
Distribution upgrade of an operating system (OS) is implemented for management
and MOSK clusters.
For management clusters, an OS distribution upgrade occurs automatically
since Container Cloud 2.24.0 (Cluster release 14.0.0) as part of cluster update
and requires machines reboot. The upgrade workflow is as follows:
The distribution ID value is taken from the id field of the
distribution from the allowedDistributions list in the spec of the
ClusterRelease object.
The distribution that has the default:true value is used during
update. This distribution ID is set in the
spec:providerSpec:value:distribution field of the Machine object
during cluster update.
For MOSK clusters, an in-place OS distribution upgrade
should be performed between cluster updates. This scenario implies a machine
cordoning, draining, and reboot.
The table below illustrates the correlation between the cluster updates and
upgrade to Ubuntu 22.04 to help you effectively plan and perform the upgrade.
Correlation between cluster updates and upgrade to Ubuntu 22.04

Management cluster version: 2.28.5
MOSK cluster version: 24.3.0, 24.3.1, 24.3.2
Default Ubuntu version: 22.04
Key impact: No impact. Management cluster nodes are automatically upgraded to
Ubuntu 22.04 during cluster upgrade to Container Cloud 2.27.0 (Cluster release
16.2.0).
Action required: Strongly recommended to upgrade Ubuntu on all MOSK
cluster nodes. [0]

Management cluster version: 2.29
MOSK cluster version: 25.1
Default Ubuntu version: 22.04
Key impact: The MOSK cluster requires Ubuntu 22.04. Upgrade to 25.1
is blocked unless all cluster nodes are running on Ubuntu 22.04.
Action required: Upgrade all MOSK cluster nodes to Ubuntu 22.04
to unblock the MOSK cluster update to 25.1. [0]

Management cluster version: 2.29.1
MOSK cluster version: 24.3.3
Default Ubuntu version: 22.04
Key impact: Management cluster update to 2.29.1 is blocked unless all nodes
present in your deployment are running on Ubuntu 22.04.
Action required: Upgrade all nodes to Ubuntu 22.04 to unblock your management
cluster update. [0]

[0] Upgrading all nodes at once is not mandatory. You can upgrade them
individually or in small batches, depending on time constraints in
the maintenance window.
Caution
After the major cluster update, make sure to change the
postponeDistributionUpdate parameter back to false unless you want
to postpone new OS distribution upgrades.
Note
If you want to migrate container runtime on cluster machines from
Docker to containerd and have not upgraded the OS distribution to Jammy yet,
Mirantis recommends combining both procedures to minimize the maintenance
window. In this case, ensure that all cluster machines are updated during
one maintenance window to prevent machines from running different
container runtimes.
The machine reboot occurs automatically after completion of deployment
phases.
Once the distribution upgrade completes, verify that currentDistribution
matches the distribution value previously set in the object spec.
For description of the status fields, see Container Cloud documentation:
API Reference.
Repeat the procedure with the remaining machines.
Optional. Available since Container Cloud 2.28.4 (Cluster releases 17.3.4
and 16.3.4). Upgrade container runtime from Docker to containerd together
with distribution upgrade as described in Migrate container runtime from Docker to containerd
to minimize the size of maintenance window.
Note
Container runtime migration becomes mandatory in the scope of
Container Cloud 2.29.x. Otherwise, the management cluster update to
Container Cloud 2.30.0 will be blocked.
During a management or managed cluster update with Ubuntu package updates,
MOSK automatically removes unnecessary kernel and system
packages.
During cleanup, MOSK keeps a number of kernel versions
following the default behavior of the Ubuntu apt autoremove command:
Booted kernel
The currently booted kernel is always kept.
Latest kernel
If there are any kernel packages with versions higher than the booted
kernel version, then the kernel package with the highest version is
also kept.
Previous kernel
If there are any installed kernel packages with versions lower than the one
of the latest kernel that equals the booted kernel, then the kernel with
version previous to the latest kernel is kept.
Note
Previous kernel does not equal the previously booted kernel.
Previous kernel is an N-1 kernel in a sorted list of all kernels
installed in the system where N is the kernel with the highest version.
Caution
If a kernel package is a dependency of another package, it will
not be automatically removed. The rules above do not apply to such a case.
The number of kernel packages may be more than two if the
apt autoremove command has never been used or if the cluster is
affected by the known issue 46808.
Mirantis recommends keeping previous kernel version for fallback in case the
current kernel becomes unstable. However, if you absolutely require leaving
only the booted version of kernel packages, you can use the script described
below after considering all possible risks.
To remove all kernel packages of the previous version:
Verify that the cluster is successfully updated and is in the Ready
state.
Log in as root to the required node using SSH.
Run the following script that calls an Ansible module targeted at local
host. The module outputs a list of packages to remove, if any, without
actually removing them.
cleanup-kernel-packages
The script workflow includes the following tasks:
Task order
Task name
Description
1
Get kernels to cleanup
Collect installed kernel packages and detect the candidates for removal.
2
Get kernels to cleanup (LOG)
Print the log from the first task.
3
Kernel packages to remove
Print the list of packages collected by the first task.
4
Remove kernel packages
Remove packages that are detected as candidates for removal if the
following conditions are met:
The script detects at least one candidate for removal
You add the --cleanup flag to the
cleanup-kernel-packages command
If the system outputs any packages to remove, carefully assess the list
from the output of the Kernel packages to remove task.
Caution
The script removes all detected packages. There is no
possibility to modify the list of candidates for removal.
Example of system response when no packages are found for removal
PLAY [localhost]
TASK [Get kernels to cleanup]
ok: [localhost]
TASK [Get kernels to cleanup (LOG)]
ok: [localhost] => {"cleanup_kernels.log": ["2023-09-27 12:49:31,925 [INFO] Logging enabled",
"2023-09-27 12:49:31,937 [DEBUG] Found kernel package linux-headers-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-27 12:49:31,938 [DEBUG] Found kernel package linux-image-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-27 12:49:31,938 [DEBUG] Found kernel package linux-modules-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-27 12:49:31,938 [DEBUG] Found kernel package linux-modules-extra-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-27 12:49:31,944 [DEBUG] Current kernel is 5.15.0.post83-generic",
"2023-09-27 12:49:31,944 [INFO] No kernel packages prior version '5.15.0.post83' found, nothing to remove.",
"2023-09-27 12:49:31,944 [INFO] Exiting successfully"]}
TASK [Kernel packages to remove]
ok: [localhost] => {"cleanup_kernels.packages": []}
TASK [Remove kernel packages]
skipping: [localhost]
Example of system response with several packages to remove
TASK [Get kernels to cleanup]
ok: [localhost]
TASK [Get kernels to cleanup (LOG)]
ok: [localhost] => {"cleanup_kernels.log": ["2023-09-28 10:08:42,849 [INFO] Logging enabled",
"2023-09-28 10:08:42,865 [DEBUG] Found kernel package linux-headers-5.15.0-79-generic, version 5.15.0.post79-generic",
"2023-09-28 10:08:42,865 [DEBUG] Found kernel package linux-headers-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-28 10:08:42,865 [DEBUG] Found kernel package linux-hwe-5.15-headers-5.15.0-79, version 5.15.0.post79",
"2023-09-28 10:08:42,865 [DEBUG] Found kernel package linux-hwe-5.15-headers-5.15.0-83, version 5.15.0.post83",
"2023-09-28 10:08:42,866 [DEBUG] Found kernel package linux-image-5.15.0-79-generic, version 5.15.0.post79-generic",
"2023-09-28 10:08:42,866 [DEBUG] Found kernel package linux-image-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-28 10:08:42,866 [DEBUG] Found kernel package linux-modules-5.15.0-79-generic, version 5.15.0.post79-generic",
"2023-09-28 10:08:42,866 [DEBUG] Found kernel package linux-modules-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-28 10:08:42,866 [DEBUG] Found kernel package linux-modules-extra-5.15.0-79-generic, version 5.15.0.post79-generic",
"2023-09-28 10:08:42,866 [DEBUG] Found kernel package linux-modules-extra-5.15.0-83-generic, version 5.15.0.post83-generic",
"2023-09-28 10:08:42,871 [DEBUG] Current kernel is 5.15.0.post83-generic",
"2023-09-28 10:08:42,871 [INFO] Kernel package version prior '5.15.0.post83': 5.15.0.post79",
"2023-09-28 10:08:42,872 [INFO] No kernel packages after version '5.15.0.post83' found.",
"2023-09-28 10:08:42,872 [INFO] Kernel package versions to remove: 5.15.0.post79",
"2023-09-28 10:08:42,872 [DEBUG] The following packages are candidates for autoremoval: linux-headers-5.15.0-79-generic, linux-hwe-5.15-headers-5.15.0-79,linux-image-5.15.0-79-generic, linux-modules-5.15.0-79-generic, linux-modules-extra-5.15.0-79-generic",
"2023-09-28 10:08:45,338 [DEBUG] The following packages are resolved reverse dependencies for autoremove candidates: linux-modules-5.15.0-79-generic, linux-modules-extra-5.15.0-79-generic, linux-hwe-5.15-headers-5.15.0-79, linux-headers-5.15.0-79-generic, linux-image-5.15.0-79-generic",
"2023-09-28 10:08:45,338 [INFO] No protected packages found",
"2023-09-28 10:08:45,339 [INFO] Exiting successfully"]}
TASK [Kernel packages to remove]
ok: [localhost] => {"cleanup_kernels.packages": ["linux-headers-5.15.0-79-generic",
"linux-hwe-5.15-headers-5.15.0-79",
"linux-image-5.15.0-79-generic",
"linux-modules-5.15.0-79-generic",
"linux-modules-extra-5.15.0-79-generic"]}
TASK [Remove kernel packages] ****************
skipping: [localhost]
If you decide to proceed with removal of package candidates, rerun
the script with the --cleanup flag:
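For example, combining the script name and flag mentioned above:
cleanup-kernel-packages --cleanup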
This section describes operations required during configuration of the
operating system installed on bare metal hosts of a MOSK
cluster.
Caution
Due to the known issue 49678
addressed in MOSK 25.1, the HostOSConfiguration object may not work
as expected after migration to containerd. For details, see the issue
description.
Available since MCC 2.26.0 (17.1.0 and 16.1.0) TechPreview
Important
The cloud operator takes all risks and responsibility for module
execution on cluster machines. For any questions, contact Mirantis support.
Caution
Due to the known issue 49678
addressed in MOSK 25.1, the HostOSConfiguration object may not work
as expected after migration to containerd. For details, see the issue
description.
The day-2 operations API extends configuration management of baremetal-based
clusters and machines after initial deployment. The feature allows managing
the operating system of a bare metal host granularly using modules without
rebuilding the node from scratch. Such approach prevents workload evacuation
and significantly reduces configuration time.
The day-2 operations API does not limit the cloud operator’s ability to
configure machines in any way, making the operator responsible for day-2
adjustments.
This section provides guidelines for Container Cloud or custom modules that
are used by the HostOSConfiguration and HostOSConfigurationModules
custom resources designed for baremetal-based management and managed clusters.
Add the configuration of the Container Cloud or custom module to
an existing HostOSConfiguration (hoc) object or create a new
hoc object with the following details:
Add the required configuration details of the module.
Set the selector for machines to apply the configuration.
Publish the module in a repository from which the cloud
operator can fetch the module.
Share the module details with the cloud operator.
The following diagram illustrates the high-level overview of the day-2
operations API:
Global recommendations for implementation of custom modules
The following global recommendations are intended to help creators of modules
and cloud operators to work with the day-2 operations API for module
implementation and execution, in order to keep the cluster and machines
healthy and ensure safe and reliable cluster operability.
Module functionality is limited only by Ansible itself and by the playbook
rules of a particular Ansible version. However, Mirantis highly recommends
paying special attention to critical components of Container Cloud, some of
which are mentioned below, and not managing them by means of day-2 modules.
Important
The cloud operator takes all risks and responsibility for module
execution on cluster machines. For any questions, contact Mirantis support.
Do not restart Docker, containerd, and Kubernetes-related services.
Do not configure Docker and Kubernetes node labels.
Do not reconfigure or upgrade MKE.
Do not change the MKE bundle.
Do not reboot nodes using a day-2 module.
Do not change network configuration, especially on critical LCM and external
networks, so that they remain consistent with kaas-ipam objects.
Do not change iptables, especially for Docker, Kubernetes, and Calico rules.
Do not change partitions on the fly, especially the / and
/var/lib/docker ones.
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0), the
following Ansible versions are supported for Ubuntu 20.04 and 22.04:
Ansible 2.12.10 and Ansible 5.10.0-collection. Therefore, your custom modules
must be compatible with the corresponding Ansible versions provided for a
specific Cluster release, on which your cluster is based.
To verify the Ansible version in a specific Cluster release, refer to
Container Cloud Release notes: Cluster releases.
Use the Artifacts > System and MCR artifacts section of the corresponding
Cluster release. For example, for 17.3.0.
By default, any Ansible execution in Container Cloud uses
/etc/ansible/mcc.cfg as Ansible configuration. A custom module may require
a specific Ansible configuration that you can add using ansible.cfg in the
root folder of the module and package it in the module archive. Such
configuration has higher priority than the default one.
Treat a day-2 module as an Ansible module to control a limited set of system
resources related to one component, for example, a service or driver, so that
a module contains a very limited amount of tasks to set up that component.
For example, if you need to configure a service on a host, the module must
manage only package installation, related configuration files, and service
enablement. Do not implement the module so that it manages all tasks
required for the day-2 configuration of a host. Instead, split such
functionality into tasks (modules), each responsible for the management of a
single component. This makes it possible to re-apply (re-run) every module
separately in case of any changes.
Mirantis highly recommends using the following key principles during module
implementation:
Idempotency
Any module re-run with the same configuration values must lead to the same
result.
Granularity
The module must manage only one specific component on a host.
Reset action
The module must be able to revert changes introduced by the module, or
at least the module must be able to disable the component controller.
The Container Cloud LCM does not provide a way to revert a day-2 change due to
unpredictability of potential functionality of any module. Therefore, the
reset action must be implemented on the module level. For example,
the package or file state can be present or absent, a service can be
enabled or disabled. And these states must be controlled by the
configuration values.
Mirantis highly recommends verifying any Container Cloud or custom module on
one machine before applying it to all target machines. For the testing
procedure, see Test a custom or MOSK module after creation.
A custom module may require node reboot after execution.
Implement a custom module using the following options, so that it can notify
lcm-agent and Container Cloud controllers about the required reboot:
If a module installs a package that requires a host reboot, then the
/run/reboot-required and /var/run/reboot-required.pkgs files
are created automatically by the package manager. LCM Agent detects these
files and places information about the reboot reason in the LCMMachine
status.
A module can create the /run/reboot-required file on the node. You can
add the reason for reboot in one of the following files as plain text:
/run/day2/reboot-required (since Container Cloud 2.28.0, Cluster
releases 17.3.0 and 16.3.0)
/run/lcm/reboot-required (deprecated since Container Cloud 2.28.0)
This text is passed to the reboot reason in the LCMMachine status.
If the name field is absent, then the deprecation logic is applied to the
module with the same name, meaning that the example above effectively equals
to the following one:
Archive the file with the module package in the GZIP format.
Implement all playbooks for Ansible version used by a specific Cluster
release of your Container Cloud cluster. For example, in Cluster releases
16.2.0 and 17.2.0, Ansible collection 5.10.0 and Ansible core 2.12.10
are used.
To verify the Ansible version in a specific Cluster release, refer to
Container Cloud Release notes: Cluster releases.
Use the Artifacts > System and MCR artifacts section of the corresponding
Cluster release. For example, for 17.3.0.
Note
Mirantis recommends implementing each module in modular approach
avoiding a single module for everything. This ensures maintainability and
readability, as well as improves testing and debugging. For details, refer
to Global recommendations for implementation of custom modules.
The common structure of metadata.yaml is as follows:
name
Required. Name of the module.
version
Required. Version of the module.
docURL
Optional. URL to the module documentation.
description
Optional. Brief summary of the module, useful if the complete documentation
is too detailed.
playbook
Required. Path to the module playbook. Path must be related to the archive
root that is directory/playbook.yaml if directory is a directory in
the root of the archive.
valuesJsonSchema
Optional. Path to the JSON-validation schema of the module. The path must be
relative to the archive root, that is, directory/schema.json if
directory is a directory in the root of the archive.
deprecates
Optional. Available since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). List of modules that are deprecated by the module.
For details, see Module deprecation.
supportedDistributions
Optional. Available since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). List of operating system distributions that are supported by
the current module. An empty list means support of any distribution by the
current module.
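For illustration, a hypothetical metadata.yaml combining the required and optional fields above might look as follows; all names and paths are placeholders:
name: sample-module
version: 1.0.0
description: Configures a sample service on a host
docURL: https://example.com/sample-module
playbook: sample-module/playbook.yaml
valuesJsonSchema: sample-module/schema.json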
An archive of a module does not require an inventory because the inventory
is generated by lcm-controller while processing configurations.
The format of the generated inventory is as follows:
all:
  hosts:
    localhost:
      ansible_connection: local
  vars:
    values:
      {{- range $key, $value := .Values }}
      {{ $key }}: {{ $value }}
      {{- end }}
Available since MCC 2.27.0 (17.2.0 and 16.2.0) TechPreview
MOSK provides several configuration modules that use the
designated hocm object named mcc-modules. All other hocm objects
contain custom modules. For configuration modules provided by Mirantis, refer
to host-os-modules documentation.
Warning
Do not modify the mcc-modules object that contains only
Mirantis-provided modules. Any changes to this object will be overwritten
with data from an external source.
HostOSConfiguration and HostOSConfigurationModules concepts
Available since MCC 2.26.0 (17.1.0 and 16.1.0) TechPreview
This section outlines fundamental concepts of the HostOSConfiguration, aka
hoc, and HostOSConfigurationModules, aka hocm, custom resources
as well as provides usage guidelines for these resources. For detailed
descriptions of these resources, see Container Cloud API Reference: Bare metal
resources.
MOSK provides modules, which are described in
host-os-modules documentation, using the designated hocm object named
mcc-modules. All other hocm objects contain custom modules.
Warning
Do not modify the mcc-modules object that contains only
Mirantis-provided modules. Any changes to this object will be overwritten
with data from an external source.
When the value of the machineSelector field in a hoc object is empty
(by default), no machines are selected. Therefore, no actions are triggered
until you provide a non-empty machineSelector.
This approach differs from the default behavior of Kubernetes selectors
to ensure that none of configurations are applied to all machines in a cluster
accidentally.
It is crucial to ensure that the namespace of a hoc object is the same as
the namespace of the associated Machine objects defined in the
machineSelector field.
For example, the following machines are located in two separate namespaces,
default and other-ns, and the hoc object is located in
other-ns:
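A schematic of such a layout, using hypothetical machine names consistent with the explanation below:
# Namespace default: worker-3, worker-4, both labeled example-label: "1"
# Namespace other-ns: worker-0, worker-1, worker-2, all labeled example-label: "1",
# plus the hoc object with machineSelector matching example-label: "1"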
Although machineSelector in the hoc object contains
example-label="1", which is set for machines in both namespaces, only
worker-0, worker-1, and worker-2 will be selected because the hoc
object is located in the other-ns namespace.
You may use arbitrary types for primitive (non-nested) values. But for optimal
compatibility and clarity, Mirantis recommends using string values for
primitives in the values section of a hoc object. This practice helps
maintain consistency and simplifies the interpretation of configurations.
Under the hood, all primitive values are converted to strings.
You can pass the values of any day-2 module to the HostOSConfiguration
object using both the values and secretValues fields simultaneously.
But if a key is present in both fields, the value from secretValues
is applied.
The values field supports the YAML format for values with any nesting
level. The HostOSConfiguration controller and provider use the YAML parser
underneath to manage the values. The following examples illustrate simple and
nested configuration formats:
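For example, both of the following forms are valid; all parameter names below are hypothetical:
# Simple (flat) values
values:
  maxConnections: "1024"
  enableFeature: "true"
# Nested values
values:
  sysctl:
    vm:
      swappiness: "10"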
The secretValues field is a reference (namespace and name) to the
Secret object.
Warning
The referenced Secret object must contain only
primitive non-nested values. Otherwise, the values will not be applied
correctly. Therefore, implement your custom modules in a way that secret
parameters are on the top level and not used within nested module
parameters.
You can create a Secret object in the YAML format. For example:
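A minimal sketch with a hypothetical name and key and a base64-encoded value:
apiVersion: v1
kind: Secret
metadata:
  name: day2-secret-values
  namespace: default
type: Opaque
data:
  password: c2VjcmV0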
Manually encode secret values using the base64 format and ensure
that the value does not contain trailing whitespaces or newline characters
such as the \n symbol. For example:
echo -n "secret" | base64
You can also create the Secret object using the kubectl command.
This way, the secret values are automatically base64-encoded:
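For example, using standard kubectl syntax with placeholder names:
kubectl -n default create secret generic day2-secret-values --from-literal=password=secret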
Available since MCC 2.26.0 (17.1.0 and 16.1.0) TechPreview
This section describes integrations between the HostOSConfiguration
custom resource, aka hoc, the HostOSConfigurationModules custom resource,
aka hocm, LCMCluster, and LCMMachine.
The implementation of the internal API used by day-2 operations utilizes
the current approach of StateItems, including the way how they are
processed and passed to lcm-agent.
The workflow of the internal API implementation is as follows:
Create a set of StateItem entries in LCMCluster taking into account
all hoc objects in the namespace of LCMCluster.
Fill out StateItems for each LCMMachine that was selected by the
machineSelector field value of a hoc object.
Pass StateItems to lcm-agent that is responsible for their execution
on nodes.
The machineSelector field selects Machine objects, but they map to
LCMMachine objects in 1-1 relation. This way, each selected Machine
exactly maps to a relevant LCMMachine object.
LCMCluster utilizes empty StateItem to establish
a baseline connection between the hoc, LCMMachine objects
and lcm-agent on nodes. These empty items have no parameters and
serve as placeholders, providing a template for further processing.
To identify items added from hoc objects, these StateItems along with
other state items of an LCMCluster object are located in the
.spec.machinesTypes.control and .spec.machinesTypes.worker blocks
with the following fields in an LCMCluster object:
params is absent
phase is reconfigure as the only supported value
version is v1 as the only supported value
runner can be either downloader or ansible:
downloader downloads the
package of a module of the provided version
into machine.
ansible executes the module on the machine with provided values.
name has the following patterns:
host-os-<hocObjectName>-<moduleName>-<moduleVersion>-<modulePhase>
if the runner field has the ansible value set
host-os-download-<hocObjectName>-<moduleName>-<moduleVersion>-<modulePhase> if the runner field has the downloader value set.
The following example of an LCMCluster object illustrates empty
StateItems for the following configuration:
Machine type - worker
hoc object name - test with a single entry in the configs field
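A sketch of how such empty items might look, assembled from the field descriptions above; sample-module and its version are placeholders, and <modulePhase> is left as is:
spec:
  machinesTypes:
    worker:
    - name: host-os-download-test-sample-module-1.0.0-<modulePhase>
      runner: downloader
      phase: reconfigure
      version: v1
    - name: host-os-test-sample-module-1.0.0-<modulePhase>
      runner: ansible
      phase: reconfigure
      version: v1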
To properly execute the StateItem list according to given configurations
from a hoc object, the implementation utilizes the
.spec.stateItemsOverwrites field in an LCMMachine object.
For each state item that corresponds to a hoc object selected for current
machine, each entry of the stateItemsOverwrites field dictionary is filled
in with key-value pairs:
Key is a StateItem name
Value is a set of parameters from the module configuration values that will
be passed as parameters to StateItem.
After the stateItemsOverwrites field is updated, the corresponding
StateItem entries are filled out with values from the
stateItemsOverwrites.
Once the StateItem list is updated, it is passed to lcm-agent to be
finally applied on nodes.
The following example of an LCMMachine object illustrates the
stateItemsOverwrites field having a hoc object with a single entry
in the configs field, configuring a module named sample-module with
version 1.0.0:
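A sketch based on the description above; the parameter names under the item key are hypothetical module values:
spec:
  stateItemsOverwrites:
    host-os-test-sample-module-1.0.0-<modulePhase>:
      param1: "10"
      param2: "enabled"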
HostOSConfiguration processing by baremetal-provider
While processing the hoc object, baremetal-provider verifies the
hoc resource for both controlled LCMCluster and LCMMachine
resources.
Each change to a hoc object immediately triggers its resources if
host-os-modules-controller has successfully validated changes.
This behavior enables updates to existing LCMCluster and LCMMachine
objects described in the sections above. Thus, all empty StateItems,
overwrites, and filled out StateItems appear almost instantly.
This behavior also applies when removing a hoc object, thereby cleaning
everything related to the object. The object deletion is suspended until the
corresponding StateItems of a particular LCMMachine object are cleaned
up from the object status field.
Warning
A configuration that is already applied using the deleted hoc
object will not be reverted from nodes, because the feature does not provide
rollback mechanism. For module implementation details, refer to
Global recommendations for implementation of custom modules.
Do not modify the mcc-modules object that contains only
Mirantis-provided modules. Any changes to this object will be overwritten
with data from an external source.
To add a custom module to a MOSK deployment:
If you use a proxy on the management and/or managed cluster, ensure that
the custom module can be downloaded through that proxy, or that the domain
address of the module URL is added to the NO_PROXY value of the related
Proxy objects.
This way, the HostOSConfiguration Controller can download and verify
the module and its input parameters on the management cluster. After that,
the LCM Agent can download the module to any cluster machines for execution.
In the hocm object, set the name and version fields to the same
values as the corresponding fields in metadata.yaml of the module
archive. For details, see Metadata file format.
After you add a custom module to a Container Cloud deployment, the process of
fetching a module archive involves the following automatic steps:
Retrieve the .tgz archive of the module and unpack it into a temporary
directory.
Retrieve the metadata.yaml file and
validate its contents. Once done, the status of the module in the hocm
object reflects whether the archive fetching and validating succeeded or
failed.
The validation process includes the following verifications:
Validate that the SHA256 hash sum of the archive equals the value defined
in the sha256sum field.
Validate that the playbook key is present.
Validate that the file defined in the playbook key value exists in the
archive and has a non-zero length.
Validate that the name and version values from metadata.yaml
equal the corresponding fields in the hocm object.
If the valuesJsonSchema key is defined, validate that the file from the
key value exists in the archive and has a non-zero length.
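For reference, a hypothetical metadata.yaml that satisfies the above verifications might look as follows; all names and file paths are illustrative:

name: sample-module
version: 1.0.0
playbook: main.yaml            # must exist in the archive and have a non-zero length
valuesJsonSchema: schema.json  # optional; if defined, must exist in the archive and have a non-zero length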
Available since MCC 2.26.0 (17.1.0 and 16.1.0). TechPreview
Important
The cloud operator takes all risks and responsibility for module
execution on cluster machines. For any questions, contact Mirantis support.
After you create a custom module or configure an existing MOSK module,
verify it on one machine before applying it to all target machines.
This approach ensures safe and reliable cluster operability.
Verify that the status field of modules execution is healthy, validate
logs, and verify that the machine is in the ready state.
If the execution result meets your expectations, continue applying
HostOSConfiguration on other machines using one of the following
options:
Use the same HostOSConfiguration object:
Change the matchLabels value in the machineSelector field to
match all target machines.
Assign the labels from the matchLabels value to other target
machines.
Create a new HostOSConfiguration object.
Note
Mirantis highly recommends using specific custom labels on machines
and in the HostOSConfiguration selector, so that HostOSConfiguration
is applied only to the machines with the specific custom label.
There is no API to reexecute the same successfully applied module configuration
upon user request. Once executed, the same configuration is never executed again
until one of the following actions is taken on the hoc object:
Change the module-related values of the configs field list
Change the data of the Secret object referenced by the module-related
secretValues of the configs field list
To retrigger exactly the same configuration for a module, select one of the
following options:
Reapply machineSelector:
Save the current selector value.
Update the selector to match no machines (empty value) or those machines
where configuration should not be reapplied.
Update the selector to the previously saved value.
Re-create the hoc object:
Dump the whole hoc object.
Remove the hoc object.
Reapply the hoc object from the dump.
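A possible sequence of commands for the re-creation option, assuming the HostOSConfiguration resource is exposed as hostosconfigurations and <hoc-name> is the object name:

kubectl get hostosconfigurations <hoc-name> -o yaml > hoc-dump.yaml
kubectl delete hostosconfigurations <hoc-name>
kubectl create -f hoc-dump.yaml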
Caution
The above steps retrigger all configuration from the configs
field of the hoc object. To avoid such behavior, Mirantis recommends
the following procedure:
Copy a particular module configuration to a new hoc object and remove
the previous machineSelector field.
Remove this configuration from the original hoc object.
Add the required values to the machineSelector field in the new
object.
This section describes possible issues you may encounter while working with
day-2 operations, as well as approaches to addressing them.
Troubleshoot the HostOSConfigurationModules object¶
In .status.modules, verify whether all modules have been loaded and
verified successfully. Each module must have the available value in the
state field. If not, the error field contains the reason for the issue.
Example of different erroneous states in a hocm object:
status:
  modules:
  # error state: hashes mismatched
  - error: 'hashes are not the same: got ''d78352e51792bbe64e573b841d12f54af089923c73bc185bac2dc5d0e6be84cd''
      want ''c726ab9dfbfae1d1ed651bdedd0f8b99af589e35cb6c07167ce0ac6c970129ac'''
    name: sysctl
    sha256sum: d78352e51792bbe64e573b841d12f54af089923c73bc185bac2dc5d0e6be84cd
    state: error
    url: <url-to-package>
    version: 1.0.0
  # error state: an archive is not available because of misconfigured proxy
  - error: 'failed to perform request to fetch the module archive: Get "<url-to-package>": Forbidden'
    name: custom-module
    state: error
    url: <url-to-package>
    version: 0.0.1
  # successfully loaded and verified module
  - description: Module for package installation
    docURL: https://docs.mirantis.com
    name: package
    playbookName: main.yaml
    sha256sum: 2c7c91206ce7a81a90e0068cd4ce7ca05eab36c4da1893555824b5ab82c7cc0e
    state: available
    url: <url-to-package>
    valuesValidationSchema: <gzip+base64 encoded data>
    version: 1.0.0
If a module is in the error state, it might affect the corresponding
hoc object that contains the module configuration.
Example of erroneous status in a hoc object:
status:
  configs:
  - moduleName: sysctl
    moduleVersion: 1.0.0
    modulesReference: mcc-modules
    error: module is not found or not verified in any HostOSConfigurationModules object
To resolve an issue described in the error field:
Address the root cause. For example, ensure that a package has the correct
hash sum, or adjust the proxy configuration to fetch the package, and so on.
Recreate the hocm object with correct settings.
Setting syncPeriod for debug sessions
During test or debug sessions where errors are inevitable, you can set a
reasonable sync period for host-os-modules-controller to avoid manual
recreation of hocm objects.
To enable the option, set the syncPeriod parameter in the
spec:providerSpec:value:kaas:regional:helmReleases: section of the
management Cluster object:
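A sketch of such a configuration; the exact nesting under the regional section and the duration format of syncPeriod are assumptions, so verify them against your management Cluster object:

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: baremetal            # assumption: the bare metal provider entry
          helmReleases:
          - name: host-os-modules-controller
            values:
              syncPeriod: 10m            # illustrative value for a debug session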
Verify that the corresponding LCMCluster object has all related
StateItems.
Verify that all selected LCMMachines have the
.spec.stateItemsOverwrites field,
in which all StateItems from the previous step are present.
Verify that all StateItems from the previous step have been successfully
processed by lcm-agent. Otherwise, a manual intervention is required.
To address an issue with a specific StateItem for which the lcm-agent
is reporting an error, log in to the corresponding node and
inspect Ansible execution logs:
ssh -i <path-to-ssh-key> mcc-user@<ip-addr-of-the-node>
sudo -i
cd /var/log/lcm/runners/
# from 2 directories, select the one
# with subdirectories having 'host-os-' prefix
cd <selected-dir>/<name-of-the-erroneous-state-item>
less <logs-file>
After the inspection, either resolve the issue manually or escalate the issue
to Mirantis support.
The day-2 operations API allows enabling debug-level logging, which is
integrated into the baremetal-provider controller and
host-os-modules-controller. Both may be helpful during debug sessions.
To enable log debugging in host-os-modules-controller, add the following
snippet to the Cluster object:
Migrate container runtime from Docker to containerd¶
Available since 2.28.4 (Cluster releases 17.3.4 and 16.3.4)
Caution
Due to the known issue 49678,
the HostOSConfiguration object may not work as expected after migration
to containerd. For details, see the issue description.
Migration of container runtime from Docker to containerd is implemented for
existing management and managed clusters. The use of containerd allows for
better Kubernetes performance and component update without pod restart when
applying fixes for CVEs.
Note
On greenfield deployments, containerd is the default container
runtime since Container Cloud 2.29.0 and MOSK 25.1.
Before that, Docker remains the default option.
Before the container runtime migration, consider the following precautions:
The migration involves machine cordoning and draining.
Cluster update is not allowed during migration to prevent machines from
running different container runtimes. However, you can still scale clusters
and replace nodes as required.
The migration is mandatory during the scope of Container Cloud 2.29.x.
Otherwise, the management cluster update to Container Cloud 2.30.0 will be
blocked.
Note
If you have not upgraded the operating system distribution on your
machines to Jammy yet, Mirantis recommends migrating machines from Docker
to containerd on managed clusters together with distribution upgrade to
minimize the maintenance window.
In this case, ensure that all cluster machines are updated at once during
the same maintenance window to prevent machines from running different
container runtimes.
You can schedule more than one machine for migration at the same
time. In this case, the process is automatically orchestrated without
service interruption.
In the metadata.annotations section, add the following annotation to
trigger migration to containerd runtime:
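A sketch of the annotation, assuming it is added to the Machine object of the target node; the annotation key and values are taken from the rollback note below:

metadata:
  annotations:
    kaas.mirantis.com/preferred-container-runtime: containerd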
If an emergency related to containerd occurs on workloads before
migration is complete on all machines, you can temporarily roll back from
containerd to Docker. Use the procedure above, changing the
kaas.mirantis.com/preferred-container-runtime annotation from
containerd to docker.
Change a user name and password for a bare metal host¶
This section describes how to change a user name and password of a bare metal
host using an existing BareMetalHostCredential object.
To change a user name and password for a bare metal host:
Open the BareMetalHostCredential object of the required bare metal
host for editing.
In the spec section:
Update the username field
Replace password.name:<secretName> with
password.value:<hostPasswordInPlainText>
Adding a password value is mandatory for a user name change.
You can either create a new password value or copy the existing one
from the related Secret object.
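A minimal sketch of the resulting spec section, with placeholder values:

spec:
  username: <newUserName>
  password:
    value: <hostPasswordInPlainText>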
Caution
Changing a user name in the related Secret object does
not automatically update the BareMetalHostCredential object.
Therefore, Mirantis recommends updating credentials only using
the BareMetalHostCredential object.
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
You can use the Container Cloud API to restart an inspection of a bare metal
host in MOSK clusters. For example, this procedure is useful
when hardware was changed. This works for bare metal hosts that were not
provisioned yet or were successfully deprovisioned.
The workflow of the reinspection procedure described above is as follows:
Ensure that the BareMetalHostInventory object is not bound to any
Machine object and it is in the available state.
Edit the BareMetalHostInventory object to initiate an inspection of the
bare metal server that hosts the node.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
To restart the inspection of a bare metal host:
Using kubeconfig of the management cluster, access the Container Cloud
API and inspect the BareMetalHost object:
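For example, a command similar to the following (assuming the baremetalhosts resource in the cluster project namespace) returns output like the sample below:

kubectl -n <project-name> get baremetalhosts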
NAME                              STATE       CONSUMER                          ONLINE   ERROR   AGE
...
managed-worker-a-storage-worker   preparing   managed-worker-a-storage-worker   false            5d3h
managed-worker-b-storage-worker   preparing   managed-worker-b-storage-worker   false            5d3h
managed-worker-c-storage-worker   available                                     false            5d3h
In the system response above, the managed-worker-c-storage-worker bare
metal host is in the available state and has no consumer (not bound
to any Machine). Therefore, you can reinspect it.
Open the required bare metal host object for editing:
Since the management cluster update to 16.4.0 (MCC 2.29.0)
You can use the Container Cloud API to restart a bare metal host in
Mirantis OpenStack for Kubernetes clusters. The workflow of the host restart
is as follows:
Set the maintenance mode on the cluster that contains the target node.
Set the maintenance mode on the target node for OpenStack and Container
Cloud to drain it from workloads. No new workloads will be provisioned to a
host in the maintenance mode.
Use the bare metal host object to initiate a hard reboot of the bare metal
server that hosts the node.
To restart a bare metal host:
Using kubeconfig of the Container Cloud management cluster, access the
Container Cloud API and open the Cluster object for editing:
kubectl -n <project-name> edit cluster <cluster-name>
Add the following field to the spec section to set the maintenance mode
on the cluster:
spec:
  providerSpec:
    value:
      maintenance: true
Verify that the Cluster object status for Maintenance is
ready:true:
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
Before the management cluster update to 16.4.0 (MCC 2.29.0)
This section describes how to replace a failed manager node in your
MOSK deployment. The procedure applies to the manager
nodes that are, for example, permanently failed due to a hardware failure and
remain in the NotReady state.
Modify network configuration on an existing machine¶
TechPreview
Caution
Modification of L2 templates in use is only allowed with a
mandatory validation step from the infrastructure operator to prevent
accidental cluster failures due to unsafe changes. The list of risks posed
by modifying L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
Warning
Netplan does not handle arbitrary configuration changes. For
details, see Netplan documentation.
To modify network configuration of an existing machine, you need to
create a new L2 template and change the assignment of the template for that
particular machine.
Warning
When a new network configuration is applied on nodes, the corresponding
nodes are drained sequentially and LCM is re-run on them, in the same way
as during a cluster update.
The following fields of the ipamHost status are renamed since
MOSK 23.1 in the scope of the L2Template and IpamHost objects
refactoring:
netconfigV2 to netconfigCandidate
netconfigV2state to netconfigCandidateState
netconfigFilesState to netconfigFilesStates (per file)
No user actions are required after renaming.
The format of netconfigFilesState changed after renaming. The
netconfigFilesStates field contains a dictionary of statuses of network
configuration files stored in netconfigFiles. The dictionary contains
the keys that are file paths and values that have the same meaning for each
file that netconfigFilesState had:
For a successfully rendered configuration file:
OK:<timestamp><sha256-hash-of-rendered-file>, where a timestamp
is in the RFC 3339 format.
For a failed rendering: ERR:<error-message>.
If the configuration is valid:
The netconfigCandidate field contains the Netplan configuration
file candidate rendered using the modified objects
The netconfigCandidateState and netconfigFilesStates fields
have the OK status
The netconfigFilesStates field contains the old date and checksum
meaning that the effective Netplan configuration is still based on the
previous versions of the modified objects
The messages field may contain some warnings but no errors
If the L2 template rendering fails, the candidate for Netplan
configuration is empty and its netconfigCandidateState status contains
an error message. A broken candidate for Netplan configuration cannot be
approved and become the effective Netplan configuration.
Warning
Do not proceed to the next step until you make sure that the
netconfigCandidate field contains the valid configuration and this
configuration meets your expectations.
Approve the new network configuration for the related IpamHost objects:
Once applied, the new configuration is copied to the netconfigFiles
field of the effective Netplan configuration, then copied to the
corresponding LCMMachine objects.
Verify the statuses of the updated IpamHost objects:
The new configuration is copied to the effective Netplan configuration and
both configurations are valid when:
The netconfigCandidateState and netconfigFilesStates fields have
the OK status and the same checksum
In the output of the above command, hash sums contained in the
bm_ipam_netconfig_files values must match those in the
IpamHost.status.netconfigFilesStates output. If so, the new
configuration is copied to LCMMachine objects.
Monitor the update operations that start on nodes. For details, see
Verify machine status.
Create Subnet objects for the following networks: LCM, workload,
tenant, and Ceph (where applicable).
Create a new L2 template that nodes in a new rack will use.
In this template, configure the external network to be either stretched
between racks or connected to the first rack only.
Caution
API/LCM network is the first rack LCM network in our example,
since a single-rack MOSK cluster was deployed first.
Therefore, only the first rack can contain Kubernetes master nodes that
provide access to Kubernetes API.
In the L2 template for the first rack, add IP routes pointing to
the networks in the new rack.
The following examples contain:
The modified L2 template for the first rack. Routes added to the second rack
are highlighted.
The new L2 template for the second rack with external network that is
stretched between racks. The IP gateway in the external network is used
as the default route on the nodes of the second rack.
Example of a modified L2 template for the first rack with routes
to the second rack
l3Layout:
  - subnetName: kaas-mgmt
    scope: global
    labelSelector:
      kaas.mirantis.com/provider: baremetal
      kaas-mgmt-subnet: ""
  - subnetName: k8s-lcm
    scope: namespace
  - subnetName: k8s-ext-ipam
    scope: namespace
  - subnetName: tenant
    scope: namespace
  - subnetName: k8s-pods
    scope: namespace
  - subnetName: ceph-front
    scope: namespace
  - subnetName: ceph-back
    scope: namespace
  - subnetName: k8s-lcm-rack2
    scope: namespace
  - subnetName: tenant-rack2
    scope: namespace
  - subnetName: k8s-pods-rack2
    scope: namespace
  - subnetName: ceph-front-rack2
    scope: namespace
  - subnetName: ceph-back-rack2
    scope: namespace
npTemplate: |-
  version: 2
  ethernets:
    {{nic 0}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 0}}
      set-name: {{nic 0}}
      mtu: 1500
    {{nic 1}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 1}}
      set-name: {{nic 1}}
      mtu: 1500
    {{nic 2}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 2}}
      set-name: {{nic 2}}
      mtu: 9050
    {{nic 3}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 3}}
      set-name: {{nic 3}}
      mtu: 9050
  bonds:
    bond0:
      interfaces:
        - {{nic 0}}
        - {{nic 1}}
      parameters:
        mode: 802.3ad
        transmit-hash-policy: layer3+4
      mtu: 1500
    bond1:
      interfaces:
        - {{nic 2}}
        - {{nic 3}}
      parameters:
        mode: 802.3ad
        transmit-hash-policy: layer3+4
      mtu: 9050
  vlans:
    k8s-lcm-v:
      id: 738
      link: bond0
    k8s-pod-v:
      id: 731
      link: bond1
      mtu: 9000
    k8s-ext-v:
      id: 736
      link: bond1
      mtu: 9000
    tenant-vlan:
      id: 732
      link: bond1
      addresses:
        - {{ip "tenant-vlan:tenant"}}
      routes:
        # to 2nd rack of MOSK cluster
        - to: {{cidr_from_subnet "tenant-rack2"}}
          via: {{gateway_from_subnet "tenant"}}
      mtu: 9050
    ceph-front-v:
      id: 733
      link: bond1
      addresses:
        - {{ip "ceph-front-v:ceph-front"}}
      routes:
        # to 2nd rack of MOSK cluster
        - to: {{cidr_from_subnet "ceph-front-rack2"}}
          via: {{gateway_from_subnet "ceph-front"}}
      mtu: 9000
    ceph-back-v:
      id: 734
      link: bond1
      addresses:
        - {{ip "ceph-back-v:ceph-back"}}
      routes:
        # to 2nd rack of MOSK cluster
        - to: {{cidr_from_subnet "ceph-back-rack2"}}
          via: {{gateway_from_subnet "ceph-back"}}
      mtu: 9000
  bridges:
    k8s-lcm:
      interfaces: [k8s-lcm-v]
      addresses:
        - {{ip "k8s-lcm:k8s-lcm"}}
      nameservers:
        addresses: {{nameservers_from_subnet "k8s-lcm"}}
      routes:
        # to management network of Container Cloud cluster
        - to: {{cidr_from_subnet "kaas-mgmt"}}
          via: {{gateway_from_subnet "k8s-lcm"}}
          table: 101
        # fips network
        - to: 10.159.156.0/22
          via: {{gateway_from_subnet "k8s-lcm"}}
          table: 101
        # to 2nd rack of MOSK cluster
        - to: {{cidr_from_subnet "k8s-lcm-rack2"}}
          via: {{gateway_from_subnet "k8s-lcm"}}
          table: 101
      routing-policy:
        - from: {{cidr_from_subnet "k8s-lcm"}}
          table: 101
    k8s-pods:
      interfaces: [k8s-pod-v]
      addresses:
        - {{ip "k8s-pods:k8s-pods"}}
      routes:
        # to 2nd rack of MOSK cluster
        - to: {{cidr_from_subnet "k8s-pods-rack2"}}
          via: {{gateway_from_subnet "k8s-pods"}}
      mtu: 9000
    k8s-ext:
      interfaces: [k8s-ext-v]
      addresses:
        - {{ip "k8s-ext:k8s-ext-ipam"}}
      gateway4: {{gateway_from_subnet "k8s-ext-ipam"}}
      nameservers:
        addresses: {{nameservers_from_subnet "k8s-ext-ipam"}}
      mtu: 9000
    ## FIP Bridge
    br-fip:
      interfaces: [bond1]
      mtu: 9050
Example of a new L2 template for the second rack with external
network
l3Layout:
  - subnetName: kaas-mgmt
    scope: global
    labelSelector:
      kaas.mirantis.com/provider: baremetal
      kaas-mgmt-subnet: ""
  - subnetName: k8s-lcm
    scope: namespace
  - subnetName: k8s-ext-ipam
    scope: namespace
  - subnetName: tenant
    scope: namespace
  - subnetName: k8s-pods
    scope: namespace
  - subnetName: ceph-front
    scope: namespace
  - subnetName: ceph-back
    scope: namespace
  - subnetName: k8s-lcm-rack2
    scope: namespace
  - subnetName: tenant-rack2
    scope: namespace
  - subnetName: k8s-pods-rack2
    scope: namespace
  - subnetName: ceph-front-rack2
    scope: namespace
  - subnetName: ceph-back-rack2
    scope: namespace
npTemplate: |-
  version: 2
  ethernets:
    {{nic 0}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 0}}
      set-name: {{nic 0}}
      mtu: 1500
    {{nic 1}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 1}}
      set-name: {{nic 1}}
      mtu: 1500
    {{nic 2}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 2}}
      set-name: {{nic 2}}
      mtu: 9050
    {{nic 3}}:
      dhcp4: false
      dhcp6: false
      match:
        macaddress: {{mac 3}}
      set-name: {{nic 3}}
      mtu: 9050
  bonds:
    bond0:
      interfaces:
        - {{nic 0}}
        - {{nic 1}}
      parameters:
        mode: 802.3ad
        transmit-hash-policy: layer3+4
      mtu: 1500
    bond1:
      interfaces:
        - {{nic 2}}
        - {{nic 3}}
      parameters:
        mode: 802.3ad
        transmit-hash-policy: layer3+4
      mtu: 9050
  vlans:
    k8s-lcm-v:
      id: 738
      link: bond0
    k8s-pod-v:
      id: 731
      link: bond1
      mtu: 9000
    k8s-ext-v:
      id: 736
      link: bond1
      mtu: 9000
    tenant-vlan:
      id: 732
      link: bond1
      addresses:
        - {{ip "tenant-vlan:tenant-rack2"}}
      routes:
        # to 1st rack of MOSK cluster
        - to: {{cidr_from_subnet "tenant"}}
          via: {{gateway_from_subnet "tenant-rack2"}}
      mtu: 9050
    ceph-front-v:
      id: 733
      link: bond1
      addresses:
        - {{ip "ceph-front-v:ceph-front-rack2"}}
      routes:
        # to 1st rack of MOSK cluster
        - to: {{cidr_from_subnet "ceph-front"}}
          via: {{gateway_from_subnet "ceph-front-rack2"}}
      mtu: 9000
    ceph-back-v:
      id: 734
      link: bond1
      addresses:
        - {{ip "ceph-back-v:ceph-back-rack2"}}
      routes:
        # to 1st rack of MOSK cluster
        - to: {{cidr_from_subnet "ceph-back"}}
          via: {{gateway_from_subnet "ceph-back-rack2"}}
      mtu: 9000
  bridges:
    k8s-lcm:
      interfaces: [k8s-lcm-v]
      addresses:
        - {{ip "k8s-lcm:k8s-lcm-rack2"}}
      nameservers:
        addresses: {{nameservers_from_subnet "k8s-lcm-rack2"}}
      routes:
        # to management network of Container Cloud cluster
        - to: {{cidr_from_subnet "kaas-mgmt"}}
          via: {{gateway_from_subnet "k8s-lcm-rack2"}}
          table: 101
        # fips network
        - to: 10.159.156.0/22
          via: {{gateway_from_subnet "k8s-lcm-rack2"}}
          table: 101
        # to API/LCM network of MOSK cluster
        - to: {{cidr_from_subnet "k8s-lcm"}}
          via: {{gateway_from_subnet "k8s-lcm-rack2"}}
          table: 101
      routing-policy:
        - from: {{cidr_from_subnet "k8s-lcm-rack2"}}
          table: 101
    k8s-pods:
      interfaces: [k8s-pod-v]
      addresses:
        - {{ip "k8s-pods:k8s-pods-rack2"}}
      routes:
        # to 1st rack of MOSK cluster
        - to: {{cidr_from_subnet "k8s-pods"}}
          via: {{gateway_from_subnet "k8s-pods-rack2"}}
      mtu: 9000
    k8s-ext:
      interfaces: [k8s-ext-v]
      addresses:
        - {{ip "k8s-ext:k8s-ext-ipam"}}
      gateway4: {{gateway_from_subnet "k8s-ext-ipam"}}
      nameservers:
        addresses: {{nameservers_from_subnet "k8s-ext-ipam"}}
      mtu: 9000
    ## FIP Bridge
    br-fip:
      interfaces: [bond1]
      mtu: 9050
Expand IP addresses capacity in an existing cluster¶
If the subnet capacity on your existing cluster is not enough to add new
machines, use the l2TemplateSelector feature to expand the IP addresses
capacity:
Create new Subnet object(s) to define additional address ranges for new
machines.
Set up routing between the existing and new subnets.
Create new L2 template(s) with the new subnet(s) being used in l3Layout.
Set up l2TemplateSelector in the Machine objects for new machines.
To expand IP addresses capacity for an existing cluster:
Verify the capacity of the subnet(s) currently associated with
the L2 template(s) used for cluster deployment:
If labelSelector is not used for the given subnet, use the
namespace value of the L2 template and the subnetName value
from the l3Layout section:
kubectlgetsubnet-n<namespace><subnetName>
If labelSelector is used for the given subnet, use the namespace
value of the L2 template and comma-separated key-value pairs from the
labelSelector section for the given subnet in the l3Layout
section:
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
Create new objects:
Subnet with the user-defined/purpose:lcm-additional label.
L2Template with the alternative-template:"1" label.
The L2 template should reference the new Subnet object using the
user-defined/purpose:lcm-additional label in the labelSelector
field, as shown in the sketch after the note below.
Note
The label name user-defined/purpose is used for illustration
purposes. Use any custom label name that differs from system names.
Use of a unique prefix such as user-defined/ is recommended.
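A sketch of the two new objects under the assumptions above; the apiVersion, metadata names, and CIDR value are illustrative, while new-lcm-network matches the subnet name referenced later in this procedure:

apiVersion: ipam.mirantis.com/v1alpha1   # assumption: IPAM API group of Subnet and L2Template
kind: Subnet
metadata:
  name: new-lcm-network
  namespace: <projectName>
  labels:
    user-defined/purpose: lcm-additional
spec:
  cidr: 10.100.1.0/24                    # illustrative additional LCM range
---
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: alternative-template
  namespace: <projectName>
  labels:
    alternative-template: "1"
    cluster.sigs.k8s.io/cluster-name: <clusterName>
spec:
  l3Layout:
    - subnetName: new-lcm-network
      scope: namespace
      labelSelector:
        user-defined/purpose: lcm-additional
  npTemplate: |-
    # netplan template that uses the new subnet, similar to the examples above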
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Note
Before MOSK 23.3, an L2 template requires
clusterRef:<clusterName> in the spec section. Since MOSK 23.3,
this parameter is deprecated and automatically migrated to the
cluster.sigs.k8s.io/cluster-name:<clusterName> label.
You can also reference the new Subnet object by using its name
in the l3Layout section of the alternative-template L2 template.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
After creation, the new machine will use the alternative
L2 template that uses the new-lcm-network subnet linked by L3Layout.
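A sketch of how a new Machine object might select the alternative template; the l2TemplateSelector field structure shown here is an assumption, so verify it against the Machine object reference:

spec:
  providerSpec:
    value:
      l2TemplateSelector:
        label: alternative-template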
Optional. Configure an additional IP address pool for MetalLB:
Since MOSK 24.2
Configure the additional extension IP address pool for the
metallb load balancer service.
Open the MetalLBConfig object of the management cluster for
editing:
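For illustration, the edited section could look similar to the following sketch; the pool name and field layout are assumptions based on the parameters described below:

spec:
  ipAddressPools:
  - name: extension-pool            # hypothetical name of the additional pool
    spec:
      addresses:
      - <pool_start_ip>-<pool_end_ip>
      autoAssign: true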
In the snippet above, replace the following parameters:
<pool_start_ip> - first IP address in the required range
<pool_end_ip> - last IP address in the range
Add the extension IP address pool name to the L2Advertisements
definition. You can add it to the same L2 advertisement as the
default IP address pool, or create a new L2 advertisement
if required.
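A sketch of the corresponding L2Advertisements entry, reusing the hypothetical extension-pool name from the previous sketch; the advertisement name and structure are assumptions:

spec:
  l2Advertisements:
  - name: default
    spec:
      ipAddressPools:
      - default
      - extension-pool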
Define additional address ranges for MetalLB. For details, see the
optional step for the MetalLB service in Create subnets.
You can create one or several Subnet objects to extend the MetalLB
address pool with additional ranges. When the MetalLB traffic is routed
through the default gateway, you can add the MetalLB address ranges that
belong to different CIDR subnet addresses.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
MOSK provides cloud operators with a unified tool to perform
automatic self-diagnostic checks on both management and managed clusters. This
capability allows for easier troubleshooting and preventing potential issues.
For instance, self-diagnostic checks can notify you of deprecated features
that, if left unresolved, may block upgrades to subsequent versions.
Examples of self-diagnostic checks include:
An SSL/TLS certificate is not set explicitly as plain text
Deprecated OpenStackDeploymentSecret does not exist
Running self-diagnostics is essential to ensure the overall health and optimal
performance of your cluster. Mirantis recommends running self-diagnostics
before cluster update, node replacement, or any other significant changes in
the cluster to optimize maintenance window.
The Diagnostic Controller is a tool with a set of diagnostic checks to
automatically perform self-diagnostics of any cluster and help the operator to
easily understand, troubleshoot, and resolve potential issues against the
following major subsystems: core, bare metal, Ceph, StackLight,
Tungsten Fabric, and OpenStack. For illustration of diagnostic checks, refer
to the subsection describing the bare metal provider checks.
The Diagnostic Controller analyzes the configuration of the cluster subsystems
and reports results of checks that contain useful information about cluster
health. These reports may include references to documentation on known issues
related to results of checks, along with ticket numbers for tracking the
resolution progress of related issues.
The Diagnostic Controller watches for the Diagnostic objects and runs a set
of diagnostic checks depending on the cluster version and type, which are
identified by the cluster name defined in the spec.cluster section of the
Diagnostic object.
Trigger self-diagnostics for a management or managed cluster¶
Available since MCC 2.28.0 (17.3.0 and 16.3.0)
To run self-diagnostics for a cluster, the operator must create a
Diagnostic object. The creation of this object triggers
diagnostic-controller to start all available checks for the target cluster
defined in the spec.cluster section of the object.
After a successful completion of the required set of diagnostic checks,
diagnostics is never retriggered. To retrigger diagnostics for the same
cluster, the operator must create a new Diagnostic object.
The objects of the Diagnostic kind are not removed automatically so that
you can assess the result of each diagnostics later.
To trigger self-diagnostics for a cluster:
Log in to the host where kubeconfig of your management cluster is
located and where kubectl is installed.
Create the Diagnostic object in the namespace where the target cluster
is located. For example:
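A minimal example, mirroring the object structure shown later in this section; the names are illustrative:

apiVersion: diagnostic.mirantis.com/v1alpha1
kind: Diagnostic
metadata:
  name: test-diagnostic
  namespace: test-namespace
spec:
  cluster: test-cluster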
Verify the status section of the Diagnostic object:
If diagnostics is finished successfully, its result is displayed in the
result map containing key-value pairs describing results of the
corresponding diagnostic checks.
If diagnostics is finished unsuccessfully, or the Diagnostic Controller
version is outdated, diagnostic-controller saves the issue description
to the status.error field.
If the Diagnostic Controller version is outdated, ensure that
release-controller is running and a new DiagnosticRelease has been
created. Also, verify logs of the bare metal provider and
release-controller for issues.
If the status section is empty, diagnostic-controller has not run
any diagnostics yet.
The Diagnostic Controller is upgraded outside the Container Cloud release
cycle. Once the new version of the Diagnostic Controller is released, it is
automatically installed on the management cluster.
The Diagnostic Controller does not run any diagnostics until it is upgraded to
the latest version. If diagnostics is triggered before the Diagnostic
Controller is fully upgraded, the status field of the Diagnostic object
contains the corresponding error. For example:
apiVersion: diagnostic.mirantis.com/v1alpha1
kind: Diagnostic
metadata:
  name: test-diagnostic
  namespace: test-namespace
spec:
  cluster: test-cluster
status:
  error: The controller has outdated version v1.40.1 (the latest version is
    v1.40.2). Wait until the controller is updated to the latest version. Ensure
    that the release controller is running and the new DiagnosticRelease has
    been created. Check the release controller and the provider logs for issues.
  controllerVersion: v1.40.1
The bm_address_capacity check verifies that the available capacities
of IP addresses in the Subnet and MetalLBConfig objects are sufficient.
This check verifies the Subnet objects only with the following labels:
ipam/SVC-k8s-lcm
ipam/SVC-pxe-nics
For the MetalLBConfig objects, the check uses only the IP addresses
defined in .spec.ipAddressPools and verifies only the MetalLBConfig
objects with the following configuration for the IP address pool:
.spec.autoAssign is set to true
.spec.serviceAllocation.serviceSelectors is not set
The minimum thresholds for IP address capacity are as follows:
Subnet — 5
MetalLBConfig — 10
Capacity below these thresholds is reported as insufficient.
If thresholds are met, then the output status is INFO. Otherwise,
the status is WARNING.
The check reports the number of available IP addresses for each matching
Subnet object and for each matching IP address pool of a matching
MetalLBConfig object.
The bm_artifacts_overrides check verifies that no undesirable overrides are
present in the baremetal-operator release, including
but not limited to the values.init_bootstrap.provisioning_files.artifacts
path.
The bm_objects_statuses check verifies that no errors or undesired states
are present in the status of the following objects: IPAMHost,
MetalLBConfig, and LCMMachine.
The bm_objects_statuses check applies the following verifications per object:
IPAMHost
.status.state is set to OK
.status.netconfigCandidate equals the configuration set in
/etc/netplan/60-kaas-lcm-netplan.yaml on a corresponding machine
LCMMachine
.status.hostInfo.hardware is present and contains values
.status.stateItemStatuses has no errors for each StateItem
MetalLBConfig
.status.objects equals .spec
.status.updateResult.success and .status.propagateResult.success
are set to true
The Container Cloud web UI communicates with Keycloak to authenticate
users. Keycloak is exposed using HTTPS with self-signed TLS certificates
that are not trusted by web browsers.
User management for the Mirantis OpenStack for Kubernetes m:os roles is not
yet available through API or web UI. Therefore, continue managing these
roles using Keycloak.
You can use the following objects depending on the way you want the role
to be assigned to the user:
IAMGlobalRoleBinding for global role bindings
Any IAM role can be used in IAMGlobalRoleBinding and will be applied
globally, not limited to a specific project or cluster. For example,
the global-admin role.
IAMRoleBinding for project role bindings
Any role except the global-admin one applies. For example, using the
operator and user IAM roles in IAMRoleBinding of the example
project corresponds to assigning m:kaas:example@operator/user
in Keycloak. You can also use these IAM roles in IAMGlobalRoleBinding.
In this case, the roles corresponding to every project will be assigned
to a user in Keycloak.
IAMClusterRoleBinding for cluster role bindings
Only the cluster-admin and stacklight-admin roles apply to
IAMClusterRoleBinding. Creation of such objects corresponds to the
assignment of m:k8s:namespace:cluster@cluster-admin/stacklight-admin
in Keycloak. You can also bind these roles to either
IAMGlobalRoleBinding or IAMRoleBinding. In this case, the roles
corresponding to all clusters and in all projects or one particular project
will be assigned to a user.
This section describes available IAM roles with use cases and the Container
Cloud API IAM*RoleBinding mapping with Keycloak.
The following table illustrates possible role use cases for a better
understanding on which roles should be assigned to users who perform
particular operations in a MOSK cluster:
Infrastructure operator with the global-admin role who performs
the following operations:
Can manage all types of role bindings for all users
Performs CRUD operations on namespaces to effectively manage
Container Cloud projects (Kubernetes namespaces)
Creates a new project when onboarding a new team to MOSK
Assigns the operator role to users who are going to create
Kubernetes clusters in a project
Can assign the user or operator role for themselves to
monitor cluster state in a specific namespace or manage Container
Cloud API objects in that namespace respectively.
Available since Container Cloud 2.25.0 (Cluster releases 17.0.0 and
16.0.0).
Infrastructure operator with the management-admin role who has
full access to the management cluster, for example, to debug
MOSK issues.
Infrastructure operator with the operator role who performs
the following operations:
Can manage Container Cloud API and Ceph-related objects in a
particular namespace, create clusters and machines, have full access
to Kubernetes clusters and StackLight APIs deployed by anyone in
this namespace
Can manage role bindings in the current namespace for users who
require the bm-pool-operator, operator, or user role,
or who should manage a particular Kubernetes cluster in this namespace
Is responsible for upgrading Kubernetes clusters in the defined
project when an update is available
Infrastructure support operator with the user role who performs
the following operations:
Is responsible for the infrastructure of a particular project
Has access to live statuses of the project cluster machines to
identify unhealthy ones and perform maintenance on the infrastructure
level with the possibility to adjust operating system if required
Has access to IAM objects such as IAMUser, IAMRole
User with the stacklight-admin role who performs
the following operations:
Has the admin-level access to the StackLight components of a
particular Kubernetes cluster deployed in a particular namespace
to monitor the cluster health.
Mapping of Keycloak roles to IAM*RoleBinding objects¶
Starting from Container Cloud 2.14.0 (Cluster releases 7.4.0, 6.20.0, and
5.21.0), MOSK role naming has changed. The old role names
logic has been reworked and new role names are introduced.
Old-style role mappings are reflected in the Container Cloud API with the new
roles and the legacy:true and legacyRole:"<oldRoleName>" fields set.
If you remove the legacy flag, user-controller automatically performs
the following update in Keycloak:
Grants the new-style role
Removes the old-style role mapping
Note
You can assign the old-style roles using Keycloak only. These roles will be
synced into the Container Cloud API as the corresponding IAM*RoleBinding
objects with the external:true, legacy:true, and
legacyRole:"<oldRoleName>" fields set.
If you assign new-style roles using Keycloak, they will be synced into the
Container Cloud API with the external:true field set.
Mapping of new-style Keycloak roles to IAM*RoleBinding objects¶
The following table describes how the IAM*RoleBinding objects in the
Container Cloud API map to roles in Keycloak.
Mapping of old-style Keycloak roles to IAM*RoleBinding objects¶
The following table describes how the role names available before
Container Cloud 2.14.0 (Cluster releases 7.4.0, 6.20.0, and 5.21.0) map to
the current IAM*RoleBinding objects in the Container Cloud API:
Examples of mapping between Keycloak roles and IAM*RoleBinding objects¶
The following tables contain several examples of role assignment either
through Keycloak or the Container Cloud IAM objects with the corresponding
role mappings for each use case.
For example, if you have two namespaces (ns1, ns2) and
two clusters in each namespace, the following roles are created
in Keycloak:
m:k8s:ns1:cluster1@cluster-admin
m:k8s:ns1:cluster2@cluster-admin
m:k8s:ns2:cluster3@cluster-admin
m:k8s:ns2:cluster4@cluster-admin
If you create a new cluster5 in ns2, the user is automatically
assigned a new role in Keycloak: m:k8s:ns2:cluster5@cluster-admin.
The following table provides the new-style and old-style examples on how
a role assigned to a user through Keycloak will be translated into IAM objects.
Creation of this role through Keycloak triggers creation of two
IAMGlobalRoleBindings: global-admin and operator.
To migrate the old-style m:kaas@writer role to the new-style roles,
remove the legacy:true flag in two API objects.
For example, if you have two namespaces (ns1 and ns2) and remove
the legacy:true flag from both IAMGlobalRoleBindings mentioned
above, the old-style m:kaas@writer role will be
substituted by the following roles in Keycloak:
m:kaas@global-admin
m:kaas:ns1@operator
m:kaas:ns2@operator
If you create a new ns3, user1 is automatically assigned
a new role m:kaas:ns3@operator.
If you do not remove the legacy flag from IAMGlobalRoleBindings,
only one role remains in Keycloak - m:kaas@writer.
Manage user roles through the Container Cloud web UI¶
If you are assigned the global-admin role, you can manage the
IAM*RoleBinding objects through the Container Cloud web UI. The possibility
to manage project role bindings using the operator role will become
available in one of the following Container Cloud releases.
To add or remove a role binding using the Container Cloud web UI:
Log in to the Container Cloud web UI as global-admin.
In the left-side navigation panel, click Users to open the
active users list and view the number and types of bindings for each
user. Click on a user name to open the details page with the user
Role Bindings.
Select from the following options:
To add a new binding:
Click Create Role Binding.
In the window that opens, configure the following fields:
Parameter
Description
Role
global-admin
Manage all types of role bindings for all users
management-admin. Since MCC 2.25.0 (17.0.0 and 16.0.0)
Have full access to the management cluster
bm-pool-operator
Manage bare metal hosts of a particular namespace
operator
Manage Container Cloud API and Ceph-related objects in a
particular project, create clusters and machines,
have full access to Kubernetes clusters and StackLight APIs
deployed by anyone in this project
Manage role bindings in the current namespace for users
who require the bm-pool-operator, operator,
or user role
user
Manage infrastructure of a particular project with access
to live statuses of the project cluster machines to monitor
cluster health
cluster-admin
Have admin access to Kubernetes clusters and StackLight
components of a particular cluster and project
stacklight-admin
Have admin access to the StackLight components of a
particular Kubernetes cluster deployed in a particular project
to monitor the cluster health.
Binding type
Global
Bind a role globally, not limited to a specific project
or cluster. By default, global-admin has the global
binding type.
You can bind any role globally. For example,
you can change the default project binding of the operator
role to apply this role globally, to all existing and new
projects.
Project
Bind a role to a specific project. If selected, also define
the Project name that the binding is assigned to.
By default, the following IAM roles have the project
binding type: bm-pool-operator, operator, and user.
You can bind any role to a project except the global-admin one.
Cluster
Bind a role to a specific cluster. If selected, also define
the Project and Cluster name that
the binding is assigned to. You can bind only the
cluster-admin and stacklight-admin roles to a cluster.
To remove a binding, click the Delete action icon located
in the last column of the required role binding.
Bindings that have the external flag set to true will be synced
back from Keycloak during the next user-controller reconciliation.
Therefore, manage such bindings through Keycloak.
Mirantis Container Cloud creates the IAM roles in scopes.
For each application type, such as kaas, k8s, or sl,
Container Cloud creates a set of roles such as @admin, @cluster-admin,
@reader, @writer, @operator.
Depending on the role, you can perform specific operations in a cluster. For
example:
With the m:kaas@writer role, you can create a project
using the Container Cloud web UI. The corresponding project-specific roles
will be automatically created in Keycloak by iam-controller.
With the m:kaas* roles, you can download the kubeconfig of the
management cluster.
The semantic structure of role naming in MOSK is as follows:
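Based on the role examples used throughout this section, the general pattern can be summarized as follows; this is a reconstruction for illustration, not an exhaustive definition:

m:<application>[:<namespaceName>[:<clusterName>]]@<roleName>

where <application> is kaas, k8s, or sl, the optional namespace and cluster parts narrow the scope, and <roleName> is a role such as admin, cluster-admin, reader, writer, or operator.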
Since Container Cloud 2.14.0 (Cluster releases 7.4.0, 6.20.0, 5.21.0),
new-style roles were introduced. They can be assigned to users through Keycloak
directly as well as by using IAM API objects. Mirantis recommends using IAM API
for roles assignment.
Users with the m:kaas@global-admin role can create MOSK
projects, which are Kubernetes namespaces in a management cluster, and all
IAM API objects that manage users access to MOSK.
Users with the m:kaas@management-admin role have full access to the
management cluster. This role is available since Container Cloud 2.25.0
(Cluster releases 17.0.0 and 16.0.0).
After project creation, iam-controller creates the following roles in
Keycloak:
m:kaas:<namespaceName>@operator
Provides the same permissions as m:kaas:<namespaceName>@writer
m:kaas:<namespaceName>@bm-pool-operator
Provides the same permissions as m:kaas@operator but restricted to a
single namespace
m:kaas:<namespaceName>@user
Provides the same permissions as m:kaas:<namespaceName>@reader
m:kaas:<namespaceName>@member
Provides the same permissions as m:kaas:<namespaceName>@operator except
for IAM API access
The old-style m:k8s:<namespaceName>:<clusterName>@cluster-admin role is
unchanged in the new-style format and is recommended for usage.
When a managed cluster is created, a new role
m:sl:<namespaceName>:<clusterName>@stacklight-admin for the sl
application is created. This role provides the same access to the StackLight
resources in the managed cluster as
m:sl:<namespaceName>:<clusterName>@admin and is included into the
corresponding m:k8s:<namespaceName>:<clusterName>@cluster-admin role.
Users with the m:kaas@writer role are considered global
MOSK administrators. They can create MOSK
projects that are Kubernetes namespaces in the management cluster. After a
project is created, the m:kaas:<namespaceName>@writer and
m:kaas:<namespaceName>@reader roles are created in Keycloak by
iam-controller. These roles are automatically included into the
corresponding global roles, such as m:kaas@writer, so that users with the
global-scoped role also obtain the rights provided by the namespace-scoped
roles. The global role m:kaas@operator provides full access to bare metal
objects.
When a managed cluster is created, roles for the sl and k8s
applications are created:
m:k8s:<namespaceName>:<clusterName>@cluster-admin (also applies to
new-style roles, recommended)
m:sl:<namespaceName>:<clusterName>@admin
These roles provide access to the corresponding resources in a managed cluster
and are included into the corresponding m:kaas:<namespaceName>@writer role.
Available since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
Have full access to the management cluster.
m:kaas:<namespaceName>
reader
m:kaas:<namespaceName>@reader
List the API resources within the specified Container Cloud project.
writer
m:kaas:<namespaceName>@writer
Create, update, or delete the API resources within the specified
Container Cloud project.
user
m:kaas:<namespaceName>@user
List the API resources within the specified Container Cloud project.
operator
m:kaas:<namespaceName>@operator
Create, update, or delete the API resources within the specified
Container Cloud project.
bm-pool-operator
m:kaas:<namespaceName>@bm-pool-operator
Add or delete a bare metal host and, since Container Cloud 2.29.1 (Cluster
release 16.4.1), bare metal inventory within the specified Container Cloud project.
member
m:kaas:<namespaceName>@member
Create, update, or delete the API resources within the specified
Container Cloud project, except IAM API.
This section illustrates possible use cases for a better understanding on which
roles should be assigned to users who perform particular operations in a
MOSK cluster:
Role
Use case
m:kaas@operator
Member of a dedicated infrastructure team who only manages bare metal hosts
and, since Container Cloud 2.29.1 (Cluster release 16.4.1), bare metal inventories
in MOSK
m:kaas@writer
Infrastructure Operator who performs the following operations:
Performs CRUD operations on namespaces to effectively manage MOSK
projects (Kubernetes namespaces)
Creates a new project when a new team is being onboarded to MOSK
Manages API objects in all namespaces, creates clusters and machines
Using kubeconfig downloaded through the Container Cloud web UI, has full access
to the Kubernetes clusters and StackLight APIs deployed by anyone in
MOSK except the management cluster
Has the Container Cloud API access in the management cluster using
the management cluster kubeconfig downloaded through the Container Cloud web UI
Note
To have full access to the management cluster, a kubeconfig
generated during the management cluster bootstrap is required.
m:kaas@reader
Member of a dedicated infrastructure support team responsible for the
MOSK infrastructure who performs the following operations:
Monitors the cluster and machine live statuses to control the underlying
cluster infrastructure health status
Performs maintenance on the infrastructure level
Performs adjustments on the operating system level
m:kaas:<namespaceName>@writer
User who administers a particular project:
Has full access to Kubernetes clusters and StackLight APIs deployed
by anyone in this project
Has full access to Container Cloud API in this project
Upgrades Kubernetes clusters in the project when an update is available
m:kaas:<namespaceName>@reader
Member of a dedicated infrastructure support team in a particular project.
For use cases, see the m:kaas@reader role described above.
m:k8s:<namespaceName>:<clusterName>@cluster-admin
User who has admin access to a Kubernetes cluster deployed in a particular
project.
m:sl:<namespaceName>:<clusterName>@admin
User who has full access to the StackLight components of a particular
Kubernetes cluster deployed in a particular project
to monitor the cluster health status.
Log in to the Keycloak web UI using the following link form with the default
keycloak admin user and the Keycloak credentials obtained in the
previous steps:
Navigate to Users > User list that contains all users in the
IAM realm.
Click the required user name. The page with user settings opens.
Open Credentials tab.
Using the Reset password form, update the password as required.
Note
To change the password permanently, toggle the
Temporary switch to the OFF position. Otherwise,
the user will be prompted to change the password after the next login.
IAM DB credentials:
  MYSQL_DBADMIN_PASSWORD: foobar
  MYSQL_DBSST_PASSWORD: barbaz
Caution
Credentials provided in the system response allow operating
MariaDB with the root user inside a container. Therefore, use them with
caution.
Manage Keycloak truststore using the Container Cloud web UI¶
Available since MCC 2.26.0 (17.1.0 and 16.1.0)
While communicating with external services, Keycloak must validate the
certificate of the remote server to ensure secured connection.
By default, the standard Java Truststore configuration is used for validating
outgoing requests. In order to properly validate client self-signed
certificates, the truststore configuration must be added. The truststore is
used to ensure secured connection to identity brokers, LDAP identity
providers, and so on.
If a custom truststore is set, only certificates from that truststore are used.
If trusted public CA certificates are also required, they must be included
in the custom truststore.
To add a custom truststore for Keycloak using the Container Cloud web UI:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the default project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon
in the last column of the management cluster and select
Configure cluster.
In the window that opens, click Keycloak and select
Configure trusted certificates.
Note
The Configure trusted certificates check box
is available since Container Cloud 2.26.4 (Cluster releases 17.1.4 and
16.1.4).
In the Truststore section that opens, fill out and save the form
with the following parameters:
Parameter
Description
Data
Content of the truststore file. Click Upload to select
the required file.
Password
Password of the truststore. Mandatory.
Type
Supported truststore types: jks, pkcs12,
or bcfks.
Hostname verification policy
Optional verification of the host name of the server certificate:
The default WILDCARD value allows wildcards in
subdomain names.
The STRICT value requires the Common Name (CN)
to match the host name.
Click Update.
Once a custom truststore for Keycloak is applied, the following configuration
is added to the Cluster object:
Use the same web UI menu to customize an existing truststore or
reset it to default settings, which is available since Container Cloud
2.26.4 (Cluster releases 17.1.4 and 16.1.4).
The Container Cloud web UI communicates with Keycloak to authenticate
users. Keycloak is exposed using HTTPS with self-signed TLS certificates
that are not trusted by web browsers.
Regional clusters are unsupported since Container Cloud 2.25.0
(Cluster releases 17.0.0 and 16.0.0). Mirantis does not perform functional
integration testing of the feature and the related code is removed in
Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). If you still
require this feature, contact Mirantis support for further information.
This section covers the management aspects of the bare metal management
cluster that MOSK is based on.
The Mirantis Container Cloud web UI enables you to perform the following
operations with management clusters:
View the cluster details (such as cluster ID, creation date, node count,
and so on) as well as obtain a list of cluster endpoints including the
StackLight components, depending on your deployment configuration.
To view generic cluster details, in the Clusters tab, click the
More action icon in the last column of the required cluster and
select Cluster info.
Note
Adding more than 3 nodes to a management cluster is not supported.
Removing a management cluster using the Container Cloud web UI is not
supported. Use the dedicated cleanup script instead. For details, see
Remove a management cluster.
Verify the current release version of the cluster including the list of
installed components with their versions and the cluster release change log.
To view details of a Cluster release version, in the Clusters
tab, click the version in the Release column next to the name
of the required cluster.
This section outlines the operations that you can perform with a management
cluster.
The Container Cloud APIs are implemented using the Kubernetes
CustomResourceDefinitions (CRDs) that enable you to expand the Kubernetes
API. For details, see Container Cloud documentation: API Reference.
You can operate a cluster using the kubectl command-line tool that
is based on the Kubernetes API. For the kubectl reference, see the
official
Kubernetes documentation.
Workflow and configuration of management cluster upgrade
This section describes specifics of automatic upgrade workflow of a management
cluster as well as provides configuration procedures that you may apply before
and after automatic upgrade.
A management cluster upgrade to a newer version is performed automatically
once a new Container Cloud version is released. For more details about the
Container Cloud release upgrade mechanism, see: Container Cloud Reference
Architecture: Release Controller.
The Operator can delay the Container Cloud automatic upgrade procedure for a
limited amount of time or schedule upgrade to run at desired hours or weekdays.
For details, see Schedule Mirantis Container Cloud updates.
Container Cloud remains operational during the management cluster upgrade.
Managed clusters are not affected during this upgrade. For the list of
components that are updated during the Container Cloud upgrade, see the
Components versions section of the corresponding major Container Cloud
release in Container Cloud Release Notes: Container Cloud releases.
When Mirantis announces support of the newest versions of
Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine
(MKE), Container Cloud automatically upgrades these components as well.
For the maintenance window best practices before upgrade of these
components, see
MKE Documentation.
Since Container Cloud 2.23.2 (Cluster releases 12.7.1 and 11.7.1), the release
update train includes patch release updates being delivered between major
releases. Patch release updates also involve automatic upgrade of a management
cluster. For details on the currently available patch releases, see
Container Cloud Release Notes: Latest supported patch releases.
Update the bootstrap tarball after automatic cluster upgrade
Once the management cluster is upgraded to the latest version, update the
original bootstrap tarball for successful cluster management, such as
collecting logs and so on.
Select from the following options:
For clusters deployed using Container Cloud 2.11.0 (Cluster releases 7.1.0
and 6.18.0) or later:
For clusters deployed using the Container Cloud release earlier than 2.11.0
(7.0.0, 6.16.0, or earlier), or if you deleted the kaas-bootstrap folder,
download and run the Container Cloud bootstrap script:
By default, Container Cloud automatically updates to the latest version,
once available. An Operator can delay or reschedule Container Cloud automatic
update process using CLI or web UI. The scheduling feature allows:
Limiting hours and weekdays when Container Cloud update can run. For example,
if a release becomes available on Monday, you can delay it until Sunday by
setting Sunday as the only permitted day for auto-updates.
Available since Container Cloud 2.28.0 (Cluster release 16.3.0):
Blocking Container Cloud auto-update immediately on the release date.
The delay period is a minimum of 20 days for each newly discovered release.
The exact number of delay days is set in the release metadata and cannot be
changed by the user. It depends on the specifics of each release cycle and
on the optional configuration of weekdays and hours selected for update.
You can verify the exact date of a scheduled auto-update either in the
Status section of the Management Cluster Updates
page in the web UI or in the status section of the MCCUpgrade
object.
Deprecated since Container Cloud 2.28.0 (Cluster release 16.3.0) in the CLI
and removed in the web UI. Blocking Container Cloud update process for up to
7 days from the current date and up to 30 days from the latest
Container Cloud release
Caution
Since Container Cloud 2.23.2 (Cluster release 11.7.1), the release
update train includes patch release updates being delivered between major
releases. The new approach increases the frequency of the release updates.
Therefore, schedule a longer maintenance window for the Container Cloud
update as there can be more than one scheduled update in the queue.
Schedule update of a management cluster using the web UI
Since Container Cloud 2.28.0 (Cluster release 16.3.0)
Log in to the Container Cloud web UI as m:kaas@global-admin or
m:kaas@writer.
In the left-side navigation panel, click Admin > Updates.
On the Management Cluster Updates page, verify the status
of the next release in the Status section.
If the management cluster update is delayed, the section contains the
following information about the new release: version, publish date,
link to release notes, scheduled date and time of update.
If the management cluster contains managed clusters running unsupported
Cluster versions, a tooltip with a notification about blocked update
is displayed.
If the cluster is updated to the latest version, the corresponding
message is displayed.
On the left side of the page, click Settings.
On the Configure updates schedule page, select
Auto-delay cluster updates to delay every new consecutive
release for minimum 20 days from the release publish date.
Note
Changing the number of delay days is unsupported. The exact
number of delay days depends on the specifics of each release cycle and
on the optional configuration of weekdays and hours selected for update.
Optional. Select Apply updates only within specific hours:
From the Time Zone list, select the required time zone or
type in the required location.
In Allowed Time for update, set the time intervals and week
days allowed for update. To set additional update hours, use the
+ button on the right side of the window.
Note
You can use this option with or without the auto-delay option.
When both options are enabled, the next available update starts after
the 20-day interval, at the earliest hour and weekday allowed by the
defined time window.
Before Container Cloud 2.28.0 (Cluster release 16.2.0 or earlier)
Log in to the Container Cloud web UI as m:kaas@global-admin or
m:kaas@writer.
In the left-side navigation panel, click Upgrade Schedule in
the Admin section.
Click Configure Schedule.
Select the time zone from the Time Zone list. You can also
type the necessary location to find it in the list.
Optional. In Delay Upgrade, configure the update delay.
You can set no delay or select the exact day, hour, and
minute. You can delay the update up to 7 days, but not more than
30 days from the latest release date. For example, the current time is
10:00 March 28, and the latest release was on March 1. In this case, the
maximum delay you can set is 10:00 March 31. Regardless of your local time
zone, configure the time in accordance with the time zone selected in the
previous step.
Optional. In Allowed Time for Upgrade, set the time intervals
when to allow update. Select the update hours in the From and
To time input fields. Select days of the week in the
corresponding check boxes. Click + to set additional update
hours.
Schedule update of a management cluster using CLI
You can delay or reschedule Container Cloud automatic update by editing the
MCCUpgrade object named mcc-upgrade in Kubernetes API.
Caution
Only the management cluster admin and users with the operator
(or writer in old-style Keycloak roles) permissions can edit the
MCCUpgrade object. For object editing, use kubeconfig generated
during the management cluster bootstrap or kubeconfig generated with the
operator (or writer) permissions.
To edit the current configuration, run the following command in the command
line:
kubectl edit mccupgrade mcc-upgrade
In the system response, the editor displays the current state of the
MCCUpgrade object in the YAML format. The spec section contains
the current update schedule configuration, for example:
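The following is an illustrative sketch of a schedule configuration; treat
the field names as assumptions and rely on the object present in your cluster
and the Container Cloud API Reference:

spec:
  blockUntil: "2024-07-01T00:00:00Z"   # assumed field: postpone updates until this date
  timeZone: CET                        # time zone applied to the schedule below
  schedule:                            # allowed update windows
  - hours:
      from: 10
      to: 17
    weekdays:
      monday: true
      friday: true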
On every update step, the Release Controller verifies if the current time is
allowed by the schedule and does not start or proceed with the update if it
is not.
When your Mirantis Container Cloud license expires, contact your account
manager to request a new license by submitting a ticket through the
Mirantis CloudCare Portal.
If your trial license has expired, contact
Mirantis support for further
information.
Once you obtain a new mirantis.lic file, update Container Cloud along
with MKE clusters using the instructions below.
Important
Once your Container Cloud license expires, all API
operations with new and existing clusters are blocked until license
renewal. Existing workloads are not affected.
Additionally, since Container Cloud 2.25.0 (Cluster releases 17.0.0 and
16.0.0), you cannot perform the following operations on your cluster with
an expired license:
Create new clusters and machines
Automatically upgrade the management cluster
Update managed clusters
To update the Container Cloud and MKE licenses:
Log in to the Container Cloud web UI with the m:kaas@global-admin role.
Navigate to Admin > License.
Click Update License and upload your new license.
Click Update.
Caution
Machines are not cordoned and drained, user workloads are not
interrupted, and the MKE license is updated automatically for all clusters.
If you did not add the NTP server parameters during the management cluster
bootstrap, configure them on the existing management cluster as required.
These parameters are applied to all machines of managed clusters deployed
within the configured management cluster.
Caution
The procedure below applies only if ntpEnabled=true (default)
was set during a management cluster bootstrap. Enabling or disabling NTP
after bootstrap is not supported.
Warning
The procedure below triggers an upgrade of all clusters in a
specific management cluster, which may lead to workload disruption during
nodes cordoning and draining.
To configure an NTP server for managed clusters:
Download your management cluster kubeconfig:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
Expand the menu of the tab with your user name.
Click Download kubeconfig to download kubeconfig
of your management cluster.
Log in to any local machine with kubectl installed.
Copy the downloaded kubeconfig to this machine.
Use the downloaded kubeconfig to edit the management cluster:
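For example, assuming the NTP servers are listed under the ntp section of the
provider spec (verify the exact path for your release):

kubectl --kubeconfig <pathToMgmtClusterKubeconfig> edit cluster <mgmtClusterName>

# add or update the following section in the opened object
spec:
  providerSpec:
    value:
      ntp:
        servers:            # assumed location of the NTP server list
        - 0.pool.ntp.org
        - 1.pool.ntp.org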
Automatically propagate Salesforce configuration to all clusters
You can enable automatic propagation of the Salesforce configuration of your
management cluster to the related managed clusters using the
autoSyncSalesForceConfig=true flag added to the Cluster object of the
management cluster. This option allows for automatic update and sync of the
Salesforce settings on all your clusters after you update your management
cluster configuration.
You can also set custom settings for managed clusters that always override
automatically propagated Salesforce values.
Enable propagation of Salesforce configuration using web UI
Log in to the Container Cloud web UI as m:kaas@global-admin or
m:kaas@writer.
In the Clusters tab, click the More action icon
in the last column of the required management cluster and select
Configure.
In the Configure cluster window, navigate to
StackLight > Salesforce and select
Salesforce Configuration Propagation To Managed Clusters.
Click Update.
Once the automatic propagation applies, the Events section of
the corresponding managed cluster displays the following message:
Propagated Cluster Salesforce Config From Management
<clusterName> Cluster uses SalesForce configuration from management
cluster.
Enable propagation of Salesforce configuration using CLI
Download your management cluster kubeconfig:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
Expand the menu of the tab with your user name.
Click Download kubeconfig to download kubeconfig
of your management cluster.
Log in to any local machine with kubectl installed.
Copy the downloaded kubeconfig to this machine.
In the Cluster objects of the required managed cluster, remove all
Salesforce settings that you want to automatically sync with
the same settings of the management cluster:
Optional. Set custom Salesforce settings for your managed cluster to
override the related management cluster settings. Add the
required custom settings to the StackLight values section of the
corresponding Cluster object of your managed cluster:
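A minimal structural sketch of where such overrides live, assuming the
StackLight Helm release is named stacklight; the actual Salesforce parameter
names depend on the StackLight configuration reference for your release:

spec:
  providerSpec:
    value:
      helmReleases:
      - name: stacklight
        values:
          # add the required custom Salesforce parameters here, for example,
          # dedicated credentials or environment identifiers for this cluster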
Custom settings are not overridden if you update the
management cluster settings for Salesforce.
Update the Keycloak IP address on bare metal clusters
The following instruction describes how to update the IP address of the
Keycloak service on management clusters.
Note
The commands below contain the default kaas-mgmt name of the
management cluster. If you changed the default name, replace it accordingly.
To verify the cluster name, run kubectl get clusters.
To update the Keycloak IP address on a management cluster:
Log in to a node that contains kubeconfig of the required
management cluster.
Make sure that the configuration file is in your .kube directory.
Otherwise, set the KUBECONFIG environment variable with a full path to
the configuration file.
Configure the additional external IP address pool for the metallb
load balancer service.
The Keycloak service requires one IP address. Therefore, the external
IP address pool must contain at least one IP address.
Since Container Cloud 2.27.0 (Cluster release 16.2.0)
Open the MetalLBConfig object of the management cluster for editing:
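A hedged sketch of the resulting change, assuming an ipAddressPools and
l2Advertisements layout of the MetalLBConfig object; verify the exact schema
against the object in your cluster:

spec:
  ipAddressPools:
  - name: external                    # additional pool for the Keycloak service
    spec:
      addresses:
      - <pool_start_ip>-<pool_end_ip>
      autoAssign: false
  l2Advertisements:
  - name: default
    spec:
      ipAddressPools:
      - default
      - external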
In the snippet above, replace the following parameters:
<pool_start_ip> - first IP address in the required range
<pool_end_ip> - last IP address in the range
Add the external IP address pool name to the L2Advertisements
definition. You can add it to the same L2 advertisement as the
default IP address pool, or create a new L2 advertisement
if required.
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in 24.1.
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
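A hedged example of such a Subnet template, assuming that MetalLB address
pool parameters are passed through object labels; adjust the name and labels
to the conventions used in your deployment:

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: keycloak-external-pool                 # assumed name, use any suitable one
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    ipam/SVC-MetalLB: "1"                      # assumed label marking a MetalLB service subnet
    metallb/address-pool-name: external
    metallb/address-pool-protocol: layer2
    metallb/address-pool-auto-assign: "false"
spec:
  cidr: <pool_cidr>
  includeRanges:
  - <pool_start_ip>-<pool_end_ip>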
In the template above, replace the following parameters:
<pool_start_ip> - first IP address in the desired range.
<pool_end_ip> - last IP address in the range.
<pool_cidr> - corresponding CIDR address. The only requirement
for this CIDR address is that the address range mentioned above
must fit into it. The CIDR address is not used by MetalLB; it is
only formally required for Subnet objects.
Note
If required, use a different IP address pool name.
Apply the Subnet template created in the previous step:
kubectl create -f <subnetTemplateName>
Open the MetalLBConfigTemplate object of the management cluster
for editing:
kubectl edit <MetalLBConfigTemplateName>
Add the external IP address pool name to the L2Advertisements
definition. You can add it to the same L2 advertisement as the
default IP address pool, or create a new L2 advertisement
if required.
Before Container Cloud 2.24.0 (Cluster release 11.7.0 or earlier)
Open the Cluster object for editing:
kubectl edit cluster <clusterName>
Add the following highlighted lines by replacing <pool_start_ip>
with the first IP address in the desired range and <pool_end_ip>
with the last IP address in the range:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          configInline:
            address-pools:
            - name: default
              protocol: layer2
              addresses:
              - 10.0.0.100-10.0.0.120  # example values
            - name: external
              protocol: layer2
              auto-assign: false
              addresses:
              - <pool_start_ip>-<pool_end_ip>
Note
If required, use a different IP address pool name.
Save and exit the object to apply changes.
Obtain the current Keycloak IP address for reference:
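For example, list the Keycloak service in the kaas namespace and note its
external IP; the exact service name may differ between releases:

kubectl -n kaas get svc | grep -i keycloak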
Available since MCC 2.24.0 (Cluster release 14.0.1). TechPreview
You can enable custom host names for cluster machines so that any machine host
name in a particular management cluster and its managed clusters matches
the related Machine object name.
For example, instead of the default kaas-node-<UID>, a machine host name
will be master-0. The custom naming format is more convenient and easier
to operate with.
Note
After you enable custom host names on an existing management
cluster, names of all newly deployed machines in this cluster and its
managed clusters will match machine host names. Existing host names will
remain the same.
If you are going to clean up a management cluster with this feature enabled
after cluster deployment, make sure to manually delete machines with existing
non-custom host names before cluster cleanup to prevent cleanup failure.
For details, see Remove a management cluster.
You can enable custom host names during the initial cluster configuration
of the management cluster bootstrap. For details, see Deploy a management cluster.
To enable the feature on an existing cluster, see the procedure below.
To enable custom host names on an existing management cluster:
Open the Cluster object of the management cluster for editing:
kubectl edit cluster <mgmtClusterName>
In the spec.providerSpec.value.kaas.regional section of the required
region, find the required provider name under helmReleases and add
customHostnamesEnabled:true under values.config. For example:
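An illustrative sketch for the bare metal provider, assuming the provider
Helm release is named baremetal-provider:

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: baremetal
          helmReleases:
          - name: baremetal-provider        # assumed provider Helm release name
            values:
              config:
                customHostnamesEnabled: true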
Available since MCC 2.27.0 (Cluster release 16.2.0)
MOSK uses a MariaDB database to store data generated by
the Container Cloud components. Mirantis recommends backing up your databases
to ensure the integrity of your data. Also, you should create an instant backup
before upgrading your database to restore it if required.
The Kubernetes cron job responsible for the MariaDB backup is enabled
by default to create daily backups. You can modify the default configuration
before or after the management cluster deployment.
Warning
A local volume of only one node of a management cluster is
selected when the backup is created for the first time. This volume is used
for all subsequent backups.
If the node containing backup data must be redeployed, first move the MySQL
backup to another node and update the PVC binding along with the MariaDB
backup job to use another node as described in Change the storage node for MariaDB.
After the management cluster deployment, the cluster configuration includes
the MariaDB backup functionality. The Kubernetes cron job responsible for the
MariaDB backup is enabled by default. For the MariaDB backup workflow, see
Workflows of the OpenStack database backup and restoration.
Warning
A local volume of only one node of a management cluster is
selected when the backup is created for the first time. This volume is used
for all subsequent backups.
If the node containing backup data must be redeployed, first move the MySQL
backup to another node and update the PVC binding along with the MariaDB
backup job to use another node as described in Change the storage node for MariaDB.
If the object is missing, make sure that your management cluster is
successfully upgraded to the latest version.
Select from the following options:
If the management cluster is not bootstrapped yet, modify
cluster.yaml.template using the steps below.
If the management cluster is already deployed, modify the configuration
using kubectl edit cluster <mgmtClusterName> and the steps below.
By default, the management cluster name is kaas-mgmt.
If the newest full backup is older than the value of
the full_backup_cycle parameter, the system performs a full
backup. Otherwise, the system performs an incremental backup of
the newest full backup.
full_backup_cycle
Number of seconds that defines a period between 2 full backups.
During this period, incremental backups are performed. The parameter
is taken into account only if backup_type is set to
incremental. Otherwise, it is ignored.
For example, with full_backup_cycle set to 604800 seconds,
a full backup is performed weekly and, if cron is set to 0 0 * * *,
an incremental backup is performed daily.
MARIADB_BACKUP_REQUIRED_SPACE_RATIO
Multiplier for the database size to predict the space required to
create a backup, either full or incremental, and perform a
restoration keeping the uncompressed backup files on the same file
system as the compressed ones.
To estimate the value of MARIADB_BACKUP_REQUIRED_SPACE_RATIO, use
the following formula: size of (1 uncompressed full backup + all
related incremental uncompressed backups + 1 full compressed backup)
in KB <= (DB_SIZE * MARIADB_BACKUP_REQUIRED_SPACE_RATIO) in KB.
The DB_SIZE is the disk space allocated in the MySQL data
directory, which is /var/lib/mysql, for database data excluding
the galera.cache and ib_logfile* files. This parameter prevents
the backup PVC from being full in the middle of the restoration and
backup procedures. If the current available space is lower than
DB_SIZE * MARIADB_BACKUP_REQUIRED_SPACE_RATIO, the backup
script fails before the system starts the actual backup and the
overall status of the backup job is failed.
To perform full backups monthly and incremental backups daily at 02:30 AM and
keep the backups for the last six months, configure the database backup in your
Cluster object as follows:
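A hedged sketch of such settings; the surrounding nesting inside the Cluster
object is omitted, the retention parameter name is an assumption, and
backup_type, full_backup_cycle, and cron are the parameters described above:

backup_type: incremental
full_backup_cycle: 2592000     # ~30 days in seconds, so a full backup runs monthly
cron: "30 2 * * *"             # incremental backups daily at 02:30 AM
backups_to_keep: 6             # assumed retention parameter name, keeps about six months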
Create the check_pod.yaml file to create the helper pod required
to view the backup volume content.
Configuration example:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: check-backup-helper
  namespace: kaas
---
apiVersion: v1
kind: Pod
metadata:
  name: check-backup-helper
  namespace: kaas
  labels:
    application: check-backup-helper
spec:
  containers:
  - name: helper
    securityContext:
      allowPrivilegeEscalation: false
      runAsUser: 0
      readOnlyRootFilesystem: true
    command:
    - sleep
    - infinity
    # using image from mariadb sts
    image: <<insert_image_of_mariadb_container_here>>
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: pod-tmp
      mountPath: /tmp
    - mountPath: /var/backup
      name: mysql-backup
  restartPolicy: Never
  serviceAccount: check-backup-helper
  serviceAccountName: check-backup-helper
  volumes:
  - name: pod-tmp
    emptyDir: {}
  - name: mariadb-secrets
    secret:
      secretName: mariadb-secrets
      defaultMode: 0444
  - name: mariadb-bin
    configMap:
      name: mariadb-bin
      defaultMode: 0555
  - name: mysql-backup
    persistentVolumeClaim:
      claimName: mariadb-phy-backup-data
Apply the helper service account and pod resources:
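For example:

kubectl apply -f check_pod.yaml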
The base directory contains full backups. Each directory in the incr
folder contains incremental backups related to a certain full backup in
the base folder. All incremental backups always have the base backup
name as the parent folder.
During the restore procedure, the MariaDB service will be unavailable
because the MariaDB StatefulSet scales down to 0 replicas. Therefore, plan
the maintenance window according to the database size.
The restore speed depends on the following:
Network throughput
Storage performance where backups are kept
Local disks performance of nodes with MariaDB local volumes
If you want to restore the full backup, the name from the
example above is 2021-09-09_11-35-48. To restore a specific
incremental backup, the name from the example above is
2021-09-09_11-35-48/2021-09-12_01-01-54.
In the example above, the backups will be restored in the following strict
order:
2021-09-09_11-35-48 - full backup,
path /var/backup/base/2021-09-09_11-35-48
Name of a folder with backup in <baseBackup> or
<baseBackup>/<incrementalBackup>.
replica-restore-timeout
Integer
3600
Timeout in seconds for 1 replica data to be restored to the
mysql data directory. Also, includes time for spawning a
rescue runner pod in Kubernetes and extracting data from a
backup archive.
Wait until the mariadb-phy-restore job succeeds:
kubectl -n kaas get jobs mariadb-phy-restore -o jsonpath='{.status}'
The mariadb-phy-restore job is an immutable object. Therefore,
remove the job after each execution. To correctly remove the job,
clean up all settings from the Cluster object
that you have configured during step 7 of this procedure.
This will remove all related pods as well.
Note
If you create a new user after creating the MariaDB backup file,
this user will not exist in the database after restoring MariaDB.
However, Keycloak may still contain a cached entry for this user. Therefore,
when this user attempts to log in, the Container Cloud web UI may start the
authentication procedure that fails with the following error: Data loading
failed: Failed to log in: Failed to get token. Reason: “User not found”.
To clear the cache in Keycloak, refer to the official Keycloak documentation.
The default storage class cannot be used on a management cluster, so a
specially created one is used for this purpose. For storage, this class uses
local volumes, which are managed by local-volume-provisioner.
Each node of a management cluster contains a local volume, and the volume bound
with a PVC is selected when the backup gets created for the first time.
This volume is used for all subsequent backups. Therefore, to ensure reliable
backup storage, consider creating a regular backup copy of this volume in a
separate location.
If the node that contains backup data must be redeployed, first move the MySQL
backup data to another node and update the PVC binding along with the MariaDB
backup job to use another node as described below.
The order of nodes that contain Secondary local volume
is random.
Capture details of the node containing the primary local volume for further
usage. For example, you can use the InternalIP value to SSH to the
required node and copy the backup data located under Volume path to a
separate location.
Change the default storage node for MariaDB backups
Capture details of the local volume and node containing backups
as described in Identify a node where backup data is stored. Also, capture
details of the Secondary local volume that you select to move backup data to.
Using InternalIP of the Primary local volume, SSH to the corresponding
node and create a backup tarball:
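For example, assuming the backup data is located under the Volume path
captured earlier (<volumePath> is an illustrative placeholder):

tar -czf /tmp/mariadb-backup.tar.gz -C <volumePath> .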
Note
In the command below, replace <newVolumePath> with the value
of the Volume path field of the selected Secondary local volume.
Using InternalIP of the Secondary local volume, SSH to the
corresponding node and copy the created backup mariadb-backup.tar.gz
using a dedicated utility such as scp, rsync, or
other.
Restore mariadb-backup.tar.gz under the selected Volume path:
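For example, assuming the tarball was copied to the new node:

tar -xzf mariadb-backup.tar.gz -C <newVolumePath>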
This section describes how to remove a management cluster. You can also use the
following instruction to remove unsupported regional clusters, if any.
To remove a management or regional cluster:
Verify that you have successfully removed all managed clusters that
run on top of the management or regional cluster to be removed. For details,
see Delete a managed cluster.
If you enabled custom host names on an existing management or regional
cluster as described in Configure host names for cluster machines, and the cluster contains hosts
with non-custom names, manually delete such hosts to prevent cleanup
failure.
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl is installed.
Note
The management cluster kubeconfig is created
during the last stage of the management cluster bootstrap.
Note
To remove a regional cluster, you also need access to the regional
cluster kubeconfig that was created during the last stage of the
regional cluster bootstrap.
Verify that the bootstrap directory is updated.
Select from the following options:
For clusters deployed using Container Cloud 2.11.0 (Cluster releases 7.1.0
and 6.18.0) or later:
For clusters deployed using the Container Cloud release earlier than 2.11.0
(7.0.0, 6.16.0, or earlier), or if you deleted the kaas-bootstrap folder,
download and run the Container Cloud bootstrap script:
Available since MCC 2.24.0 (14.0.1 and 15.0.1). TechPreview
This section describes how to speed up deployment and update process of
managed clusters, which usually do not have access to the Internet and
consume artifacts from a management cluster using the mcc-cache service.
By default, after auto-upgrade of a management cluster,
before each managed cluster deployment or update, mcc-cache downloads the
required list of images, thus slowing down the process.
Using the CacheWarmupRequest resource, you can predownload (warm up)
a list of images included in a given set of Cluster releases into the
mcc-cache service only once per release for further usage on all managed
clusters.
After a successful cache warm-up, the object of the CacheWarmupRequest
resource is automatically deleted from the cluster and cache remains for
managed clusters deployment or update until next Container Cloud auto-upgrade
of the management cluster.
Caution
If the disk space for cache runs out, the cache for the oldest
object is evicted. To avoid running out of space in the cache, verify and
adjust its size before each cache warm-up.
Cache warm-up requires a lot of disk storage and may take up to 100% of disk
space. Therefore, make sure to have enough space for storing cached objects
on each node of the management cluster before creating the
CacheWarmupRequest resource. The following example contains minimal
required values for the cache size for the management cluster:
After you calculate the disk size for warming up cache depending on your
cluster settings and minimal cache warm-up requirements, configure the size of
cache in the Cluster object of your cluster.
In the spec:providerSpec:value:kaas:regionalHelmReleases: section of the
management Cluster object, add the following snippet to the mcc-cache
entry with the required size value in GiB:
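An illustrative sketch, assuming the cache size is exposed through the nginx
values of the mcc-cache release; verify the exact key against your release:

spec:
  providerSpec:
    value:
      kaas:
        regionalHelmReleases:
        - name: mcc-cache
          values:
            nginx:
              cacheSize: 100        # assumed key, size in GiB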
After you increase the size of cache on the cluster as described in
Increase cache size for mcc-cache, create the CacheWarmupRequest object in the
Kubernetes API.
Caution
Create CacheWarmupRequest objects only on the management
cluster.
To warm up cache using CLI:
Identify the latest available Cluster releases to use for deployment of
new clusters and update of existing clusters:
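For example, you can list the available Cluster releases with kubectl and
then create a CacheWarmupRequest similar to the hedged sketch below; the
apiVersion and field names are assumptions to verify against the Container
Cloud API Reference:

kubectl get clusterreleases

apiVersion: kaas.mirantis.com/v1alpha1
kind: CacheWarmupRequest
metadata:
  name: <mgmtClusterName>           # assumed: created on the management cluster
  namespace: default
spec:
  clusterReleases:                  # Cluster releases whose artifacts to predownload
  - <clusterReleaseName1>
  - <clusterReleaseName2>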
Once done, during deployment and update of managed clusters, Container
Cloud uses cached artifacts from the mcc-cache service to facilitate
and speed up the procedure.
When a new Container Cloud release becomes available and the management
cluster auto-upgrades to a new Container Cloud release, repeat the
above steps to predownload a new set of artifacts for managed clusters.
After deploying a managed cluster, you can configure a few cluster settings
using the Container Cloud web UI as described below.
To change a cluster configuration:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Select the required project.
On the Clusters page, click the More action icon in
the last column of the required cluster and select
Configure cluster.
In the Configure cluster window:
In the General Settings tab, you can:
Add or update proxy for a cluster by selecting the name of previously
created proxy settings from the Proxy drop-down menu. To
add or update proxy parameters:
In the Proxies tab, configure proxy:
Click Add Proxy.
In the Add New Proxy wizard, fill out the form with the
following parameters:
If your proxy requires a trusted CA certificate, select the
CA Certificate check box and paste a CA certificate for a MITM
proxy to the corresponding field or upload a certificate using
Upload Certificate.
Using the SSH Keys drop-down menu, select the required
previously created SSH key to add it to the running cluster.
If required, you can add several keys or remove unused ones, if any.
Note
To delete an SSH key, use the SSH Keys tab
of the main menu.
Applies since MCC 2.21.x (Cluster releases 12.5.0 and 11.5.0).
Using the Container Registry drop-down menu, select
the previously created Docker container registry name to add
it to the running cluster.
Applies since MCC 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
Using the following options, define the maximum number of worker
machines to be upgraded in parallel during cluster update:
Parallel Upgrade Of Worker Machines
The maximum number of the worker nodes to update simultaneously. It serves as
an upper limit on the number of machines that are drained at a given moment
of time. Defaults to 1.
Parallel Preparation For Upgrade Of Worker Machines
The maximum number of worker nodes being prepared at a given moment of time,
which includes downloading of new artifacts. It serves as a limit for the
network load that can occur when downloading the files to the nodes.
Defaults to 50.
In the Stacklight tab, select or deselect StackLight
and configure its parameters if enabled.
You can also update the default log level severity for all StackLight
components as well as set a custom log level severity for specific
StackLight components. For details about severity levels,
see Log verbosity.
Before performing node maintenance operations that are not managed by
MOSK, such as operating system configuration or node reboot,
enable maintenance mode on the cluster and required machines using the
Container Cloud web UI or CLI to prepare workloads for maintenance.
Enable maintenance mode on a cluster and machine using web UI
You can use the instructions below to enable maintenance mode on a cluster and
machine using the Container Cloud web UI. To enable maintenance mode using the
Container Cloud API, refer to Enable maintenance mode on a cluster and machine using CLI.
Caution
To enable maintenance mode on a machine, first enable maintenance mode
on the related cluster.
To disable maintenance mode on a cluster, first disable maintenance mode
on all machines of the cluster.
Warning
During cluster and machine maintenance:
Cluster upgrades and configuration changes (except for the SSH key
settings) are unavailable. Make sure you disable maintenance mode on the
cluster after maintenance is complete.
Data load balancing is disabled while Ceph is in maintenance mode.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Enable maintenance mode on the cluster:
In the Clusters tab, click the More action icon
in the last column of the cluster you want to put into maintenance
mode and select Enter maintenance. Confirm your selection.
Wait until the Status of the cluster switches to
Maintenance.
Now, you can switch cluster machines to maintenance mode.
In the Clusters tab, click the required cluster name to
open the list of machines running on it.
In the Maintenance column of the machine you want to put into
maintenance mode, enable the toggle switch.
Wait until the machine Status switches to
Maintenance.
Once done, the node of the selected machine is cordoned, drained, and
prepared for maintenance operations.
Important
Proceed with the node maintenance only after the machine
Status switches to Maintenance.
Disable maintenance mode on a cluster and machine
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
In the Clusters tab, click the required cluster name to open
its machines list.
In the Maintenance column of the machine you want to disable
maintenance mode, disable the toggle switch.
Wait until the machine Status does not display
Maintenance, Pending maintenance, or the progress
indicator.
Repeat the above steps for all machines that are in maintenance mode.
Disable maintenance mode on the related cluster:
In the Clusters tab, click the More action icon
in the last column of the cluster where you want to disable maintenance
mode and select Exit maintenance.
Wait until the cluster Status does not display
Maintenance, Pending maintenance, or the progress
indicator.
Enable maintenance mode on a cluster and machine using CLI
You can use the instructions below to enable maintenance mode on a cluster and
machine using the Container Cloud API. To enable maintenance mode using the
Container Cloud web UI, refer to Enable maintenance mode on a cluster and machine using web UI.
Caution
To enable maintenance mode on a machine, first enable maintenance mode
on the related cluster.
To disable maintenance mode on a cluster, first disable maintenance mode
on all machines of the cluster.
Warning
During cluster and machine maintenance:
Cluster upgrades and configuration changes (except for the SSH key
settings) are unavailable. Make sure you disable maintenance mode on the
cluster after maintenance is complete.
Data load balancing is disabled while Ceph is in maintenance mode.
Available since MCC 2.25.0 (17.0.0 and 16.0.0) for workers
on managed clusters. TechPreview
You can use the machine disabling API to seamlessly remove a worker machine
from the LCM control of a managed cluster. This action isolates the affected
node without impacting other machines in the cluster, effectively eliminating
it from the Kubernetes cluster. This functionality proves invaluable in
scenarios where a malfunctioning machine impedes cluster updates.
Note
The Technology Preview support of the machine disabling feature
also applies during cluster update to the Cluster release 17.1.0 or 16.1.0.
Before disabling a cluster machine, carefully read the following essential
information for a successful machine disablement:
MOSK supports machine disablement of worker machines only.
If an issue occurs on the control plane, which is updated before worker
machines, fix the issue or replace the affected
control machine as soon as possible to prevent issues with workloads.
For reference, see Troubleshooting Guide and Delete a cluster machine.
Disabling a machine can break high availability (HA) of components such as
StackLight. Therefore, Mirantis recommends adding a new machine as soon
as possible to provide sufficient node number for components HA.
Note
It is expected that the cluster status contains degraded replicas
of some components during or after cluster update with a disabled machine.
These replicas become available as soon as you replace the disabled
machine.
When a machine is disabled, some services may switch to the NodeReady
state and may require additional actions to unblock LCM tasks.
A disabled machine is removed from the overall cluster status and is labeled
as Disabled. The requested node number for the cluster remains
the same, but an additional disabled field is displayed with the number
of disabled nodes.
A disabled machine is not taken into account for any calculations,
for example, when the number of StackLight nodes is required for some
restriction check.
MOSK removes the node running the disabled machine from
the Kubernetes cluster.
Deletion of the disabled machine with the graceful deletion policy is
not allowed. Use the unsafe deletion policy instead.
For details, see Delete a cluster machine.
For a major cluster update, the Cluster release of a disabled machine must
match the Cluster release of other cluster machines.
If a machine is disabled during the major Cluster release update, then the
upgrade should be completed if all other requirements are met. However,
cluster update to the next available major Cluster release will be blocked
until you re-enable or replace the disabled machine.
Patch updates do not have such limitation on different patch Cluster
releases. You can update a cluster with a disabled machine to several patch
Cluster releases in the scope of one major Cluster release.
After enabling the machine, it will be updated to match the Cluster release
of the corresponding cluster, including all related components.
For Ceph machines, you need to perform additional disablement steps.
Disable a machine using the Container Cloud web UI
Carefully read the precautions for
machine disablement.
Power off the underlying host of a machine to be disabled.
Warning
If the underlying host of a machine is not powered off, the
cluster may still contain the disabled machine in the list of available
nodes with kubelet attempting to start the corresponding containers
on the disabled machine.
Therefore, Mirantis strongly recommends powering off the underlying host
to prevent manual removal of the related Kubernetes node from the Docker
Swarm cluster using the MKE web UI.
In the Clusters tab, click the required cluster name to
open the list of machines running on it.
Click the More action icon in the last column of the required
machine and click Disable.
Wait until the machine Status switches to Disabled.
If the disabled machine contains StackLight or Ceph, migrate these services
to a healthy machine:
Verify that the required disabled and healthy machines are not currently
added to GracefulRebootRequest:
Note
Machine configuration changes, such as reassigning Ceph and
StackLight labels from a disabled machine to a healthy one, which
are described in the following steps, are not allowed during graceful
reboot. For details, see Perform a graceful reboot of a cluster.
Verify that the More > Reboot machines option is not
disabled. If the option is active, skip the following sub-step and
proceed to the next step. If the option is disabled, proceed to the
following sub-step.
Using the Container Cloud CLI, verify that the new machine, which you
are going to use for StackLight or Ceph services migration, is not
included in the list of the GracefulRebootRequest resource.
Otherwise, remove GracefulRebootRequest before proceeding.
For details, see Disable a machine using the Container Cloud CLI.
Note
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0), reboot of the disabled machine is automatically skipped in
GracefulRebootRequest.
If StackLight is deployed on the machine, unblock LCM tasks by moving the
stacklight=enabled label to another healthy machine with a sufficient
amount of resources and manually remove StackLight Pods along with related
local persistent volumes from the disabled machine. For details, see
Deschedule StackLight Pods from a worker machine.
If Ceph is deployed on the machine:
Disable a Ceph machine
Select one of the following options to open the Ceph cluster spec:
Web UI
In the CephClusters tab, click the required Ceph cluster
name to open its spec.
For mgr, rgw, or mds, move the role to another node
located in the node section. Such node must meet the resource
requirements to run the corresponding daemon type and must not have the
respective role assigned yet.
For mon, refer to Move a Ceph Monitor daemon to another node for further instructions.
Mirantis recommends considering nodes with sufficient resources to run
the moved monitor daemon.
Carefully read the precautions for
machine disablement.
Power off the underlying host of a machine to be disabled.
Warning
If the underlying host of a machine is not powered off, the
cluster may still contain the disabled machine in the list of available
nodes with kubelet attempting to start the corresponding containers
on the disabled machine.
Therefore, Mirantis strongly recommends powering off the underlying host
to prevent manual removal of the related Kubernetes node from the Docker
Swarm cluster using the MKE web UI.
Open the required Machine object for editing.
In the providerSpec:value section, set disable to true:
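For example:

spec:
  providerSpec:
    value:
      disable: true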
If the disabled machine contains StackLight or Ceph, migrate these services
to a healthy machine:
Verify that the required disabled and healthy machines are not currently
added to GracefulRebootRequest:
Note
Machine configuration changes, such as reassigning Ceph and
StackLight labels from a disabled machine to a healthy one, which
are described in the following steps, are not allowed during graceful
reboot. For details, see Perform a graceful reboot of a cluster.
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0), reboot of the disabled machine is automatically skipped in
GracefulRebootRequest.
If StackLight is deployed on the machine, unblock LCM tasks by moving the
stacklight=enabled label to another healthy machine with a sufficient
amount of resources and manually remove StackLight Pods along with related
local persistent volumes from the disabled machine. For details, see
Deschedule StackLight Pods from a worker machine.
If Ceph is deployed on the machine:
Disable a Ceph machine
Select one of the following options to open the Ceph cluster spec:
Web UI
In the CephClusters tab, click the required Ceph cluster
name to open its spec.
For mgr, rgw, or mds, move the role to another node
located in the node section. Such node must meet the resource
requirements to run the corresponding daemon type and must not have the
respective role assigned yet.
For mon, refer to Move a Ceph Monitor daemon to another node for further instructions.
Mirantis recommends considering nodes with sufficient resources to run
the moved monitor daemon.
Available since MCC 2.23.x (Cluster releases 12.7.0 and 11.7.0)
You can perform a graceful reboot on a management or managed cluster. Use
the procedure below to cordon, drain, and reboot the required cluster machines
using a rolling reboot without interrupting workloads. The procedure is also
useful for a bulk reboot of machines, for example, on large clusters.
The reboot occurs in the order of cluster upgrade policy that you can change
for managed clusters as described in Change the upgrade order of a machine.
Caution
The cluster and machines must have the Ready status to
perform a graceful reboot.
Perform a rolling reboot of a cluster using web UI
Available since MCC 2.24.x (Cluster release 14.0.1 and 15.0.1)
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
On the Clusters page, verify that the status of the required
cluster is Ready. Otherwise, the Reboot machines
option is disabled.
Click the More action icon in the last column of the required
cluster and select Reboot machines. Confirm the selection.
Note
While a graceful reboot is in progress, the
Reboot machines option is disabled.
Machine configuration changes are forbidden during graceful
reboot. Therefore, either wait until reboot is completed or cancel it using
CLI, as described in the following section.
In spec:machines, add the machine list or leave it empty to reboot all
cluster machines.
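A hedged example of the object, assuming the kaas.mirantis.com API group and
that the object name matches the cluster name:

apiVersion: kaas.mirantis.com/v1alpha1
kind: GracefulRebootRequest
metadata:
  name: <clusterName>
  namespace: <projectName>
spec:
  machines:              # leave the list empty to reboot all cluster machines
  - <machineName1>
  - <machineName2>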
Wait until all specified machines are rebooted. You can monitor the reboot
status of the cluster and machines using the Conditions:GracefulReboot
fields of the corresponding Cluster and Machine objects.
The GracefulRebootRequest object is automatically deleted once the
reboot on all target machines completes.
Machine configuration changes are forbidden during graceful
reboot.
In emergency cases, for example, to migrate StackLight or Ceph services
from a disabled machine that fails during graceful reboot and blocks the
process, cancel the reboot by deleting the GracefulRebootRequest object:
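For example, assuming the object is named after the cluster:

kubectl -n <projectName> delete gracefulrebootrequest <clusterName>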
Once you migrate StackLight or Ceph services to another machine and disable it,
re-create the GracefulRebootRequest object for the remaining machines
that require reboot.
Before deleting a cluster machine, carefully read the following essential
information for a successful machine deletion:
Mirantis recommends deleting cluster machines using the Container Cloud web UI
or API instead of using the provider tools directly. Otherwise, the cluster
deletion or detachment may hang and additional manual steps will be required
to clean up machine resources.
An operational managed cluster must contain a minimum of 3 Kubernetes manager
machines to meet the etcd quorum and 2 Kubernetes worker machines.
The deployment of the cluster does not start until the minimum
number of machines is created.
A machine with the manager role is automatically deleted during the cluster
deletion. Manual deletion of manager machines is allowed only for the purpose
of node replacement or recovery.
Consider the following precautions before deleting manager machines:
Create a new manager machine to replace the deleted one as soon as
possible. This is necessary because after machine removal, the cluster
has limited capabilities to tolerate faults. Deletion of manager machines
is intended only for replacement or recovery of failed nodes.
You can delete a manager machine only if your cluster has at least
two manager machines in the Ready state.
Do not delete more than one manager machine at once to prevent cluster
failure and data loss.
Before replacing a failed manager machine, make sure that all
Deployments with replicas configured to 1 are ready.
Ensure that the machine to delete is not a Ceph Monitor. Otherwise, migrate
the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the
machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.
If StackLight in HA mode is enabled and you are going to delete a
machine with the StackLight label:
Make sure that at least 3 machines with the StackLight label
remain after the deletion. Otherwise, add an additional machine with
such label before the deletion. After the deletion, perform the additional
steps described in the deletion procedure, if required.
Do not delete more than 1 machine with the StackLight label.
Since StackLight in HA mode uses local volumes bound to machines, the data
from these volumes on the deleted machine will be purged but its replicas
remain on other machines. Removal of more than 1 machine can cause
data loss.
If you move the StackLight label to a new worker machine on an
existing cluster, manually deschedule all StackLight components from the old
worker machine, which you remove the StackLight label from. For
details, see Operations Guide: Deschedule StackLight Pods from a worker
machine.
If the machine being deleted has a prioritized upgrade index and you want to
preserve the same upgrade order, manually set the required index to the new
node that replaces the deleted one. Otherwise, the new node is automatically
assigned the greatest upgrade index, which is prioritized last. To set the
upgrade index, refer to Change the upgrade order of a machine.
Ensure that the machine being deleted is not a Ceph Monitor. If it is,
migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors
after the machine deletion. For details, see
Migrate a Ceph Monitor before machine replacement.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Click the More action icon in the last column of the machine
you want to delete and select Delete.
Select the machine deletion method:
Graceful
Recommended. The machine will be prepared for deletion with all
workloads safely evacuated. Using this option, you can cancel the
deletion before the corresponding node is removed from Docker Swarm.
Unsafe
Not recommended. The machine will be deleted without any preparation.
Forced
Not recommended. The machine will be deleted with no guarantee of
resources cleanup. Therefore, Mirantis recommends trying
Graceful or Unsafe option first.
If machine deletion fails, you can reduce the deletion policy restrictions
and try another method but in the following order only:
Graceful > Unsafe > Forced.
This section instructs you on how to scale down an existing management
or managed cluster through the Container Cloud API. To delete a
machine using the Container Cloud web UI, see Delete a cluster machine using web UI.
Using the Container Cloud API, you can delete a cluster machine using the
following methods:
Recommended. Enable the delete field in the providerSpec section of
the required Machine object. It allows aborting graceful machine
deletion before the node is removed from Docker Swarm.
Not recommended. Apply the delete request to the Machine object.
You can control machine deletion steps by following a specific machine
deletion policy.
The deletion policy of the Machine resource used in the Container Cloud
API defines specific steps occurring before a machine deletion.
The Container Cloud API contains the following types of deletion policies:
graceful, unsafe, forced. By default, the graceful deletion policy is used.
You can change the deletion policy before the machine deletion. If the
deletion process has already started, you can reduce the deletion policy
restrictions in the following order only: graceful > unsafe > forced.
During a forced machine deletion, the provider and LCM controllers perform
the following steps:
Send the delete request to the corresponding Machine resource.
Remove the provider resources such as the VM instance, network, volume,
and so on. Remove the related Kubernetes resources.
Remove the finalizer from the Machine resource. This step completes
the machine deletion from Kubernetes resources.
This policy type allows deleting a Machine resource even if the provider or
LCM controller gets stuck at some step. But this policy may require a manual
cleanup of machine resources in case of a controller failure. For details, see
Delete a machine from a cluster using CLI.
Caution
Consider the following precautions applied to the forced
machine deletion policy:
Use the forced machine deletion only if either graceful or unsafe machine
deletion fails.
If the forced machine deletion fails at any step, the LCM Controller
removes the finalizer anyway.
Before starting the forced machine deletion, back up the related
Machine resource:
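For example, where <machineName> is the machine to be deleted:

kubectl -n <projectName> get machine <machineName> -o yaml > <machineName>-backup.yaml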
Log in to the host where your management cluster kubeconfig is located
and where kubectl is installed.
For the bare metal provider, ensure that the machine being deleted is not
a Ceph Monitor. If it is, migrate the Ceph Monitor to keep the odd number
quorum of Ceph Monitors after the machine deletion. For details, see
Migrate a Ceph Monitor before machine replacement.
Select from the following options:
Recommended. In the providerSpec.value section of the Machine
object, set delete to true:
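For example:

spec:
  providerSpec:
    value:
      delete: true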
Since the management cluster update to 16.4.0 (MCC 2.29.0)
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
Due to a development limitation in baremetal-operator,
deletion of a managed cluster requires preliminary deletion
of the worker machines running on the cluster.
Warning
Mirantis recommends deleting cluster machines using the Container Cloud web UI
or API instead of using the provider tools directly. Otherwise, the cluster
deletion or detachment may hang and additional manual steps will be required
to clean up machine resources.
Using the Container Cloud web UI, first delete worker
machines one by one until you hit the minimum of 2 workers
for an operational cluster. After that, you can delete the cluster
with the remaining workers and managers.
To delete a baremetal-based managed cluster:
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name
to open the list of machines running on it.
Click the More action icon in the last column of the
worker machine you want to delete and select Delete.
Confirm the deletion.
Repeat the step above until you have 2 workers left.
In the Clusters tab, click the More action icon
in the last column of the required cluster and select Delete.
Verify the list of machines to be removed. Confirm the deletion.
If the cluster deletion hangs and the Deleting status message
does not disappear after a while, refer to Cluster deletion or detachment freezes
to fix the issue.
Optional. Available since MOSK 23.2. If you do not plan
to reuse the bare metal hosts of the deleted cluster, delete them:
In the BM Hosts tab, click the Delete action
icon next to the name of the host to be deleted.
Confirm the deletion.
Caution
Credentials associated with the deleted bare metal host,
if any, are deleted automatically.
Optional. Applies before MOSK 23.2. If you do not need
credentials associated with the bare metal hosts of the deleted cluster,
delete them manually:
Deleting a cluster automatically frees up the resources allocated
for this cluster, for example, instances, load balancers, networks,
floating IPs, and so on.
This section instructs you on how to verify MOSK cluster
status using the Container Cloud web UI during cluster deployment or day-2
operations such as cluster update, maintenance, and so on.
To monitor the cluster readiness, hover over the status icon of a specific
cluster in the Status column of the Clusters page.
Once the orange blinking status icon becomes green and Ready,
the cluster deployment or update is complete.
You can monitor live deployment status of the following cluster components:
Component
Description
Helm
Installation or upgrade status of all Helm releases
Kubelet
Readiness of the node in a Kubernetes cluster, as reported by kubelet
Kubernetes
Readiness of all requested Kubernetes objects
Nodes
Equality of the requested nodes number in the cluster to the number
of nodes having the Ready LCM status
OIDC
Readiness of the cluster OIDC configuration
StackLight
Health of all StackLight-related objects in a Kubernetes cluster
Swarm
Readiness of all nodes in a Docker Swarm cluster
LoadBalancer
Readiness of the Kubernetes API load balancer
ProviderInstance
Readiness of all machines in the underlying infrastructure
Graceful Reboot
Readiness of a cluster during a scheduled graceful reboot,
available since Container Cloud 2.24.0 (Cluster releases 15.0.1 and 14.0.0).
Infrastructure Status
Available since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
Readiness of the MetalLBConfig object along with MetalLB and DHCP
subnets.
LCM Operation
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0). Health of all LCM operations on the cluster and its machines.
LCM Agent
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Health of all LCM agents on cluster machines and the status of
LCM agents update to the version from the current Cluster release.
This section instructs you on how to verify machine status of a
MOSK cluster using the Container Cloud web UI during cluster
deployment or day-2 operations such as cluster update, maintenance, and so on.
The machine creation starts with the Provision status. During provisioning,
the machine is not expected to be accessible since its infrastructure (VM,
network, and so on) is being created.
Other machine statuses are the same as the LCMMachine object states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase, that is, the machine is becoming a Mirantis Kubernetes
Engine (MKE) node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
Once the status changes to Ready, the deployment of the cluster
components on this machine is complete.
To monitor the deploy or update live status of the machine, use one of the
following options:
Quick status
On the Clusters page, in the Managers or
Workers column. The green status icon indicates that the machine
is Ready, the orange status icon indicates that the machine is
Updating.
Detailed status
In the Machines section of a particular cluster page, in the
Status column. Hover over a particular machine status icon to
verify the deploy or update status of a specific machine component.
You can monitor the status of the following machine components:
Component
Description
Kubelet
Readiness of a node in a Kubernetes cluster.
Swarm
Health and readiness of a node in a Docker Swarm cluster.
LCM
LCM readiness status of a node.
ProviderInstance
Readiness of a node in the underlying bare metal infrastructure.
Graceful Reboot
Readiness of a machine during a scheduled graceful reboot of a cluster,
available since Container Cloud 2.24.x (Cluster releases 15.0.1 and
14.0.0).
Infrastructure Status
Available since Container Cloud 2.25.0 (Cluster releases 17.0.0 and
16.0.0). Readiness of the IPAMHost, L2Template,
BareMetalHost, and BareMetalHostProfile objects associated with
the machine.
LCM Operation
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0). Health of all LCM operations on the machine.
LCM Agent
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Health of the LCM Agent on the machine and the status of the
LCM Agent update to the version from the current Cluster release.
Available since MCC 2.24.0 (14.0.0, 14.0.1, 15.0.1). TechPreview
You can set the maximum transmission unit (MTU) size for Calico in the
Cluster object using the calico.mtu parameter. By default, the MTU
size for Calico is 1450 bytes. You can change it regardless of the host
operating system.
The following configuration example of the Cluster object covers a use
case where the interface MTU size of the workload network, which is the
smallest value across cluster nodes, is set to 9000 and the use of
WireGuard is not expected:
spec:
  ...
  providerSpec:
    value:
      ...
      calico:
        mtu: 8950
Note
Since Container Cloud 2.29.0 (Cluster releases 17.4.0 and 16.4.0),
WireGuard is deprecated. If you still require the feature, contact
Mirantis support for further information.
Caution
If you do not expect to use WireGuard encryption, ensure that
the MTU size for Calico is at least 50 bytes smaller than the interface MTU
size of the workload network. IPv4 VXLAN uses a 50-byte header.
Warning
Mirantis does not recommend changing this parameter on a running
cluster. Such a change leads to sequential draining of nodes and
re-installation of packages, as during a cluster upgrade.
Add or update a CA certificate for a MITM proxy using API¶
Note
For managed clusters, this feature is available since
MOSK 23.1.
When you enable a man-in-the-middle (MITM) proxy access to a managed cluster,
your proxy requires a trusted CA certificate. This section describes how to
manually add the caCertificate field to the spec section
of the Proxy object. You can also use this instruction to update an
expired certificate on an existing cluster.
You can also add a CA certificate for a MITM proxy using the Container Cloud
web UI through the Proxies tab. For details, refer to the cluster
creation procedure as described in Create a managed bare metal cluster.
Warning
Any modification to the Proxy object, for example, changing
the proxy URL, NO_PROXY values, or certificate, leads to cordon-drain
and Docker restart on the cluster machines.
To add or update a CA certificate for a MITM proxy using API:
Encode your proxy CA certificate. For example:
cat ~/.mitmproxy/mitmproxy-ca-cert.cer | base64 -w0
Replace ~/.mitmproxy/mitmproxy-ca-cert.cer with the path to your CA
certificate file.
Open the existing Proxy object for editing:
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
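As a hedged illustration of this step, you can open the object with kubectl edit and set the caCertificate field in its spec to the value encoded above. The kubectl resource name, object name, and project namespace below are placeholders, not values from this guide:
kubectl -n <projectName> edit proxy <proxyName>
spec:
  ...
  caCertificate: <base64-encoded CA certificate>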
Save the Proxy object and proceed with the managed cluster creation.
If you update an expired certificate on an existing managed cluster,
wait until the machines switch from the Reconfigure to Ready state
to apply changes.
Configure TLS certificates for cluster applications¶
TechPreview
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) with self-signed certificates
generated by the Container Cloud provider.
Caution
The Container Cloud endpoints are available only through HTTPS.
Supported applications for TLS certificates configuration¶
Application name
Cluster Type
Comment
Container Cloud web UI
Management
iam-proxy
Management and managed
Available since Container Cloud 2.22.0 (Cluster release 11.6.0).
Keycloak
Management
mcc-cache
Management
Caution
The organization administrator must ensure that the application
host name is resolvable within and outside the cluster.
Caution
Custom TLS certificates for Keycloak are supported for new and
existing clusters originally deployed using MOS 21.3 or later.
Workflow of custom MKE certificates configuration¶
Available since 2.24.0 (Cluster releases 14.0.0, 14.0.1)Applies to management clusters only
When you add custom MKE certificates on a management cluster, the following
workflow applies:
LCM agents are notified to connect to the management cluster using a
different certificate.
After all agents confirm that they are ready to support both current and
custom authentication, new MKE certificates apply.
LCM agents switch to the new configuration as soon as it becomes valid.
The next cluster reconciliation reconfigures helm-controller for each
managed cluster created within the configured management cluster.
If MKE certificates are applied to the management cluster, the Container Cloud
web UI is reconfigured.
Caution
The Container Cloud web UI requires up to 10 minutes to update the
MKE certificate configuration for communication with the management cluster.
During this time, requests to the management cluster fail with the following
example error:
This error is expected and disappears once new certificates apply.
Warning
During certificates application, LCM agents from every node must
confirm that they have a new configuration prepared. If managed clusters
contain a large number of nodes and some are stuck or orphaned, the whole
process gets stuck. Therefore, before applying new certificates, make sure
that all nodes are ready.
Warning
If you apply MKE certificates to the management cluster with
proxy enabled, all nodes and pods of this cluster and its managed clusters
are triggered for reconfiguration and restart, which may cause the API and
workload outage.
Obtain the DNS name to use for the application. For example,
container-cloud-auth.example.com.
Buy or generate a certificate from a certification authority (CA)
that contains the following items:
A full CA bundle including the root and all intermediate CA certificates.
Your server certificate issued for the
container-cloud-auth.example.com DNS name.
Your secret key that was used to sign the certificate signing request.
For example, cert.key.
Select the root CA certificate from your CA bundle and add it to
root_ca.crt.
Combine all certificates including the root CA, intermediate CA from the
CA bundle, and your server certificate into one file. For example,
full_chain_cert.crt.
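For illustration, a minimal shell sketch of the last step, assuming example file names server.crt, intermediate.crt, and root_ca.crt that are not taken from this guide:
# The server certificate must be first in the chain
cat server.crt intermediate.crt root_ca.crt > full_chain_cert.crt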
Configure TLS certificates using the Container Cloud web UI¶
Available since MCC 2.24.0 (14.0.0, 14.0.1)
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon
in the last column of the required cluster and select
Configure cluster.
In the Security > TLS Certificates section, click
Add certificate.
In the wizard that opens, fill out and save the form:
Parameter
Description
Server name
Host name of the application.
Applications
Drop-down list of available applications for TLS certificates configuration.
Server certificate
Certificate to authenticate the identity of the server to a client.
You can also add a valid certificate bundle. The server certificate
must be on the top of the chain.
Private key
Private key for the server that must correspond to the public key used
in the server certificate.
CA Certificate
CA certificate that issued the server certificate. Required when
configuring Keycloak or mcc-cache. Use the top-most
intermediate certificate if the CA certificate is unavailable.
The Security section displays the expiration date and the
readiness status for every application with user-defined certificates.
Optional. Edit the certificate using the Edit action icon
located to the right of the application status and edit the form filled out
in the previous step.
Note
To revoke a certificate, use the Delete action icon
located to the right of the application status.
Configure TLS certificates using the Container Cloud API¶
For clusters originally deployed using a MOS release earlier than 21.3,
download the latest version of the bootstrap script on the management
cluster:
The self-signed certificates generated and managed by the Container Cloud
provider are stored in *-tls-certs secrets in the kaas and
stacklight namespaces.
MOSK provides automatic renewal of certificates for internal
Container Cloud services. Custom certificates require manual renewal.
If you have permissions to view the default project in the Container Cloud
web UI, you may see the Certificate Is Expiring Soon warning for
custom certificates. The warning appears on top of the Container Cloud web UI.
It displays the certificate with the least number of days before expiration.
Click See Details and get more information about other expiring
certificates. You can also find the details about the expiring certificates in
the Status column’s Certificate Issues tooltip on the
Clusters page.
The Certificate Issues status may include the following messages:
Some certificates require manual renewal
A custom certificate is expiring in less than seven days. Renew the
certificate manually using the same container-cloud binary as
for the certificate configuration. For details, see
Configure TLS certificates using the Container Cloud API.
Some certificates were not renewed automatically
An automatic certificate renewal issue. Unexpected error, contact Mirantis
support.
Define a custom CA certificate for a private Docker registry¶
This section instructs you on how to define a custom CA certificate for Docker
registry connections on your management or managed cluster using
the Container Cloud web UI or CLI.
Caution
A Docker registry that is being used by a cluster cannot be
deleted.
Define a custom CA certificate for a Docker registry using CLI¶
In the providerSpec section of the Cluster object, set the
containerRegistries field with the names list of created
ContainerRegistry resource objects:
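A minimal sketch of this setting, assuming a previously created ContainerRegistry object with the example name demo-registry:
spec:
  providerSpec:
    value:
      containerRegistries:
      - demo-registry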
Define a custom CA certificate for a Docker registry using web UI¶
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
In the Container Registries tab, click
Add Container Registry.
In the Add new Container Registry window, define the following
parameters:
Container Registry Name
Name of the Docker registry to select during cluster creation or
post-deployment configuration.
Domain
Host name and optional port of the registry. For example,
demohost:5000.
CA Certificate
SSL CA certificate of the registry to upload or insert in plain text.
Click Create.
You can add the created Docker registry configuration to a new or existing
managed cluster as well as to an existing management cluster:
For a new managed cluster, in the Create new cluster wizard,
select the required registry name from the drop-down menu of the
Container Registry option. For details on a new cluster creation,
see Create a managed bare metal cluster.
For an existing cluster of any type, in the More menu of the
cluster, select the required registry name from the drop-down menu of the
Configure cluster > General Settings > Container Registry
option. For details on an existing managed cluster configuration,
see Change a cluster configuration.
When any MOSK component reaches its memory usage limit,
the affected pod may be killed by the OOM killer to prevent memory
leaks and further destabilization of resource distribution.
Periodic recreation of a pod killed by the OOM killer, about once a day or
week, is normal. However, if the alert frequency increases or pods cannot
start and move to the CrashLoopBackOff state, adjust the default memory
limits to fit your cluster needs and prevent interruption of critical
workloads.
When any MOSK component reaches its CPU usage limit,
StackLight raises the CPUThrottlingHigh alerts. CPU limits for
MOSK components (except the StackLight ones) were removed in
Container Cloud 2.24.0 (Cluster releases 14.0.0, 14.0.1, and 15.0.1). For
earlier versions, use the resources:limits:cpu parameter located in the
same section as the resources:limits:memory parameter of the corresponding
component.
The location of the limits key in the Cluster object differs depending on
the component. Different cluster types have different sets of components that
you can adjust limits for.
The following sections describe components that relate to a specific cluster
type with corresponding limits key location provided in configuration
examples. Limit values in the examples correspond to default values used since
Container Cloud 2.24.0 (Cluster releases 15.0.1, 14.0.1, and 14.0.0).
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          controller:
            resources:
              limits:
                memory: 200Mi  # no CPU limit and 200Mi of memory limit since MCC 2.24.0 (15.0.1, 14.0.0)
                               # 200m CPU and 200Mi of memory limit since MCC 2.23.0 (11.7.0)
          speaker:
            resources:
              limits:
                memory: 500Mi  # no CPU limit and 500Mi of memory limit since MCC 2.24.0 (15.0.1, 14.0.0)
                               # 500m CPU and 500Mi of memory limit since MCC 2.23.0 (11.7.0)
The memory limits for the following components can be increased on a
management cluster in the
spec:providerSpec:value:kaas:management:helmReleases: section:
admission-controller
credentials-controller (since MCC 2.28, Cluster releases 17.3.0 and 16.3.0)
Available since MCC 2.24.4 (Cluster releases 15.0.3 and 14.0.3)
You may need to increase the default etcd storage quota that is 2 GB
if etcd runs out of space and there is no other way to clean up the storage
on your management or managed cluster.
To increase storage quota for etcd:
In the spec:providerSpec:value: section of cluster.yaml, edit the
etcd:storageQuota value:
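A minimal sketch of this change, assuming an example quota of 4Gi in the Kubernetes quantity format:
spec:
  providerSpec:
    value:
      etcd:
        storageQuota: 4Gi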
The kaas.mirantis.com/region label is removed from all
Container Cloud and MOSK objects in MOSK 24.1.
Therefore, do not add the label starting with this release. On existing
clusters updated to this release, or if the label was added manually,
Container Cloud ignores it.
Applies only to the following Cluster releases:
15.0.3 (MOSK 23.2.2) or 14.0.3
15.0.4 (MOSK 23.2.3) or 14.0.4 if you scheduled a
delayed management cluster upgrade
Available since MCC 2.24.3 (Cluster releases 15.0.2 and 14.0.2)
This section instructs you on how to enable and configure Kubernetes auditing
and profiling options for MKE using the Cluster object of your
MOSK managed or management cluster. These options enable
auditing and profiling of MKE performance with specialized debugging endpoints.
Note
You can also enable audit_log_configuration using the MKE API
with no MOSK overrides. However, if you enable the option
using the Cluster object, use the same object to disable the option.
Otherwise, if you disable the option using the MKE API, it will be
overridden by MOSK and enabled again.
You can configure the following parameters that are also defined in
the MKE configuration file:
Note
The names of the corresponding MKE options are marked with
[] in the definitions below.
level
Defines the value of [audit_log_configuration]level. Valid values
are request and metadata.
Note
For management clusters, the metadata value is set by
default since Container Cloud 2.26.0 (Cluster release 16.1.0).
includeInSupportDump
Defines the value of
[audit_log_configuration]support_dump_include_audit_logs. Boolean.
apiServer:enabled
Defines the value of [cluster_config]kube_api_server_auditing.
Boolean. If set to true but with no level set, the
[audit_log_configuration]level MKE option is set to metadata.
Note
For management clusters, this option is enabled by default
since Container Cloud 2.26.0 (Cluster release 16.1.0).
maxAge
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Defines the value of kube_api_server_audit_log_maxage.
Integer. If not set, defaults to 30.
maxBackup
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Defines the value of kube_api_server_audit_log_maxbackup.
Integer. If not set, defaults to 10.
maxSize
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Defines the value of kube_api_server_audit_log_maxsize.
Integer. If not set, defaults to 10.
Since Container Cloud 2.26.4 (Cluster releases 17.1.4 and 16.1.4), manually
enable audit log rotation in the MKE configuration file:
Note
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0),
the below parameters are automatically enabled with default values along
with the auditing feature. Therefore, skip this step.
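As a hedged sketch of this step, the rotation options could look as follows in the MKE configuration file, assuming the TOML layout of the [cluster_config] section named above; the values shown are the defaults listed earlier:
[cluster_config]
  kube_api_server_audit_log_maxage = 30
  kube_api_server_audit_log_maxbackup = 10
  kube_api_server_audit_log_maxsize = 10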
The section covers the limitations of Mirantis OpenStack for Kubernetes
(MOSK).
[3544] Due to a community issue,
Kubernetes pods may occasionally not be rescheduled on the nodes that are in
the NotReady state. As a workaround, manually reschedule the pods from
the node in the NotReady state using the
kubectl drain --ignore-daemonsets --force <node-uuid> command.
While operating your management or managed cluster, you may require
collecting and inspecting the cluster logs to analyze cluster events or
troubleshoot issues. For bootstrap logs, see Collect the bootstrap logs.
To collect cluster logs:
Verify that the bootstrap directory is updated.
Select from the following options:
For clusters deployed using Container Cloud 2.11.0 or later:
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Obtain kubeconfig of the required cluster. The management cluster
kubeconfig file is created during the last stage of the management
cluster bootstrap. To obtain a managed cluster kubeconfig, see
Connect to a MOSK cluster.
Obtain the private SSH key of the required cluster:
For a managed cluster, this is an SSH key added in the Container Cloud
web UI before the managed cluster creation.
For a management cluster, ssh_key is created in the same directory
as the bootstrap script during cluster bootstrap.
Note
If the initial version of your management cluster was earlier than
2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.
Depending on the cluster type that you require logs from, run the
corresponding command:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Optional flags:
--output-dir
Directory path to save logs. The default value is logs/.
For example, logs/<clusterName>/events.log.
--extended
Output the extended version of logs that contains system and MKE logs,
logs from LCM Ansible and LCM Agent along with cluster events and
Kubernetes resources description and logs.
Without the --extended flag, the basic version of logs is collected, which
is sufficient for most use cases. The basic version of logs contains all
events, Kubernetes custom resources, and logs from all cluster components.
This version does not require passing --key-file.
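For illustration only, a hedged sketch of such an invocation; the collect logs subcommand form and the kubeconfig flag are assumptions, while --key-file, --output-dir, and --extended are the flags described above:
./container-cloud collect logs \
  --kubeconfig <pathToClusterKubeconfig> \
  --key-file <pathToPrivateSSHKey> \
  --output-dir logs/ \
  --extended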
If you require logs of a cluster update, inspect the following folders
on the control plane nodes:
/objects/namespaced/<namespaceName>/core/pods/lcm-lcm-controller-<controllerID>/ for the lcm-controller logs.
/objects/namespaced/<namespaceName>/core/pods/<providerName-ID>/
for logs of the provider controller. For example,
baremetal-provider-5b96fb4fd6-bhl7g.
/system/mke/<controllerMachineName>/ (or
/system/<controllerMachineName>/mke/) for the MKE support dump.
The dsinfo/dsinfo.txt file contains Docker and system information
of the MKE configuration set before and after update.
events.log for cluster events logs.
Technology Preview. Assess the Ironic pod logs:
Extract the content of the 'message' fields from every log message:
The syslog container collects logs generated by Ansible during the node
deployment and cleanup and outputs them in the JSON format.
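A hedged sketch of such extraction, assuming the Ironic pod runs in the kaas namespace and the logs are read from its syslog container; adjust the names to your deployment:
kubectl -n kaas logs <ironicPodName> -c syslog | jq -r '.message'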
Compress the collected log files and send the archive to the Mirantis
support team.
Inspect the history of a cluster and machine deployment or update¶
Available since MCC 2.22.0 (11.6.0)
Using the ClusterDeploymentStatus, ClusterUpgradeStatus,
MachineDeploymentStatus, and MachineUpgradeStatus objects, you
can inspect historical data of cluster and machine deployment or update
stages, their time stamps, statuses, and failure messages, if any.
Caution
The order of cluster and machine update stages may not always
be sorted by a time stamp but have an approximate logical order due to
several components running simultaneously.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or
m:kaas:namespace@writer permissions.
Switch to the required project using the Switch Project
action icon on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon
in the last column of the required cluster area and select
History to display details of the ClusterDeploymentStatus
or ClusterUpgradeStatus object, if any.
In the window that opens, click the required object to display the
object stages, their time stamps, and statuses.
Object names match the initial and/or target Cluster release versions
and MKE versions of the cluster at a specific date and time. For example,
11.6.0+3.5.5 (initial version) or
11.5.0+3.5.5 -> 11.6.0+3.5.5.
If any stage fails, hover over the Failure status field to
display the failure message.
Optional. Inspect the deployment and update status of the cluster machines:
In the Clusters tab, click the required cluster name. The
cluster page with Machines list opens.
Click More action icon in the last column of the required
machine area and select History.
kind: ClusterDeploymentStatus
metadata:
  annotations:
    lcm.mirantis.com/new-history: "true"
  creationTimestamp: "2022-12-13T15:25:49Z"
  name: test-managed
  namespace: default
  ownerReferences:
  - apiVersion: cluster.k8s.io/v1alpha1
    kind: Cluster
    name: test-managed
release: 11.6.0+3.5.5
stages:
- message: ""
  name: Network prepared
  status: Success
  success: true
  timestamp: "2022-12-13T15:27:19Z"
- message: ""
  name: Load balancers created
  status: Success
  success: true
  timestamp: "2022-12-13T15:27:56Z"
- message: ""
  name: IAM objects created
  status: Success
  success: true
  timestamp: "2022-12-13T15:27:21Z"
- message: ""
  name: Kubernetes API started
  status: Success
  success: true
  timestamp: "2022-12-13T15:57:05Z"
- message: ""
  name: Helm-controller deployed
  status: Success
  success: true
  timestamp: "2022-12-13T15:57:13Z"
- message: ""
  name: HelmBundle created
  status: Success
  success: true
  timestamp: "2022-12-13T15:57:15Z"
- message: ""
  name: Certificates configured
  status: Success
  success: true
  timestamp: "2022-12-13T15:58:29Z"
- message: ""
  name: All machines of the cluster are ready
  status: Success
  success: true
  timestamp: "2022-12-13T16:04:49Z"
- message: ""
  name: OIDC configured
  status: Success
  success: true
  timestamp: "2022-12-13T16:04:07Z"
- message: ""
  name: Cluster is ready
  status: Success
  success: true
  timestamp: "2022-12-13T16:14:04Z"
Object names match the initial and/or target Cluster release versions
and MKE versions of the cluster. For example,
11.5.0+3.5.5 (initial version) or
11.5.0+3.5.5 -> 11.6.0+3.5.5. Each object displays
the update stages, their time stamps, and statuses. If any stage fails,
the success field contains a failure message.
If you delete managed cluster nodes not using the Container Cloud web UI
or API, the cluster deletion or detachment may hang with the Deleting
message remaining in the cluster status.
To apply the issue resolution:
Expand the menu of the tab with your username.
Click Download kubeconfig to download kubeconfig
of your management cluster.
Log in to any local machine with kubectl installed.
The ‘database space exceeded’ error on large clusters¶
Occasionally, cluster update may get stuck on large clusters running 500+
nodes along with 15k+ pods due to the etcd database overflow. The following
error occurs every time when accessing the Kubernetes API server:
etcdserver: mvcc: database space exceeded
Normally, kube-apiserver actively compacts the etcd database. In
rare cases, it is required to manually compact the etcd database as described
below, for example, during rapid creation of numerous Kubernetes objects.
Once done, Mirantis recommends that you identify the root cause of the issue
and clean up unnecessary resources to prevent manual etcd compacting and
defragmentation in future.
To apply the issue resolution:
Since Container Cloud 2.24.0 (Cluster release 14.0.0)
Open an SSH connection to any controller node.
Execute the following script to compact and defragment the etcd database:
sudo -i
compact_etcd.sh
defrag_etcd.sh
Before Container Cloud 2.24.0 (Cluster release 14.0.0)
If auditd contains a lot of rules, it may generate a lot of events and overrun
the buffer. Therefore, verify and update your preset and custom rules. Preset
rules are defined as presetRules, custom rules are defined as follows:
customRules
customRulesX32
customRulesX64
To verify and update the rules:
In the Cluster object of the affected cluster, verify that the
presetRules string does not start with the ! symbol.
Verify all audit rules:
Log in through SSH or directly using the console to the node having the
buffer overrun symptoms.
Run the following command:
auditctl -l
In the system response, identify the rules to exclude.
In /etc/audit/rules.d, find the files containing the rules to
exclude.
If the file is named 60-custom.rules, remove the rules from
any of the following parameters located in the Cluster object:
customRules
customRulesX32
customRulesX64
If the file is named 50-<NAME>.rules, and you want to exclude
all rules from that file, exclude the preset named <NAME>
from the list of allowed presets defined under presetRules in the
Cluster object.
If the file is named 50-<NAME>.rules, and you want to exclude only
several rules from that file:
Copy the rules you want to keep to one of the following parameters
located in the Cluster object:
customRules
customRulesX32
customRulesX64
Exclude the preset named <NAME> from the list of allowed
presets.
By default, the backlog buffer size is set to 8192, which is enough for most
use cases. To prevent buffer overrun, you can adjust the default value to fit
your needs. But keep in mind that increasing this value leads to higher memory
requirements because the buffer uses RAM.
To estimate RAM requirements for the buffer, you can use the following
calculation:
A buffer of 8192 audit records uses ~70 MiB of RAM
A buffer of 15000 audit records uses ~128 MiB of RAM
To change the backlog buffer size, adjust the backlogLimit value in the
Cluster object of the affected cluster.
You may also want to change the size directly on the system and verify
the result at once. But to change the size permanently, use the Cluster
object.
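As a hedged sketch only, the setting could look similar to the following in the Cluster object; the exact nesting of the auditd parameters under providerSpec.value is an assumption, so verify it against your cluster specification:
spec:
  providerSpec:
    value:
      audit:
        auditd:
          backlogLimit: 15000  # example value; the default is 8192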
To adjust the size of the backlog buffer on a node:
Log in to the affected node through SSH or directly through the console.
If enabledAtBoot is enabled, adjust the audit_backlog_limit value
in kernel options:
List grub configuration files where GRUB_CMDLINE_LINUX is defined:
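For example, a simple sketch of such a search; the GRUB configuration paths below are typical Ubuntu locations and may differ on your nodes:
grep -r GRUB_CMDLINE_LINUX /etc/default/grub /etc/default/grub.d/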
In each file obtained in the previous step, edit the
GRUB_CMDLINE_LINUX string by changing the integer value after
audit_backlog_limit= to the desired value.
In /etc/audit/rules.d/audit.rules, adjust the buffer size by editing
the integer value after -b.
Select from the following options:
If the auditd configuration is not immutable, restart the auditd
service:
systemctl restart auditd.service
If the auditd configuration is immutable, reboot the node. The auditd
configuration is immutable if any of the following conditions are met:
In the auditctl -s output, the enabled parameter is set
to 2
The -e2 flag is defined explicitly in parameters of any custom
rule
The immutable preset is defined explicitly
The virtual preset all is enabled and the immutable preset
is not excluded explicitly
Caution
Arrange the time to reboot the node according to your
maintenance schedule. For the exact reboot procedure, use your
maintenance policies.
If the backlog limit exceeded message disappears, adjust the size
permanently using the backlogLimit value in the Cluster object.
The issue may occur because the default Docker network address
172.17.0.0/16 and/or the kind Docker network used by kind
overlap with your cloud address or other addresses of the network
configuration.
To apply the issue resolution:
Log in to your local machine.
Verify routing to the IP addresses of the target cloud endpoints:
Obtain the IP address of your target cloud. For example:
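A hedged sketch of this check, using an example endpoint host name that is not taken from this guide:
# Resolve the target cloud endpoint
nslookup cloud.example.com
# Verify which route and source interface are used to reach the resolved IP
ip route get <cloudEndpointIP>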
If the routing is incorrect, change the IP address of the default Docker
bridge:
Create or edit /etc/docker/daemon.json by adding the "bip"
option:
{"bip":"192.168.91.1/24"}
Restart the Docker daemon:
sudo systemctl restart docker
If required, customize addresses for your kind Docker network or any
other additional Docker networks:
Remove the kind network:
docker network rm 'kind'
Select one of the following options:
Configure /etc/docker/daemon.json:
Note
The following steps are applied to customize addresses
for the kind Docker network. Use these steps as an example for
any other additional Docker networks.
Add the following section to /etc/docker/daemon.json:
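As a hedged illustration only, such a section could define custom address pools for user-defined networks; the default-address-pools option and the address range below are assumptions rather than values from this guide:
{
  "default-address-pools": [
    { "base": "192.168.92.0/24", "size": 24 }
  ]
}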
Docker pruning removes user-defined networks,
including 'kind'. Therefore, after running the Docker
pruning commands, re-create the 'kind' network using the
command above.
If the BootstrapRegion object is in the Error state, find the error
type in the Status field of the object for the following components to
resolve the issue:
Field name
Troubleshooting steps
Helm
If the bootstrap HelmBundle is not ready for a long time, for example,
during 15 minutes in case of an average network bandwidth, verify
statuses of non-ready releases and resolve the issue depending
on the error message of a particular release:
The deployment statuses of a Machine object are the same as the
LCMMachine object states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase, that is, the machine is becoming a Mirantis Kubernetes
Engine (MKE) node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
If the system response is empty, approve the BootstrapRegion object using
the Container Cloud CLI:
./container-cloud bootstrap approve all
If the system response is not empty and the status remains the same for a
while, the issue may relate to machine misconfiguration. Therefore, verify
and adjust the parameters of the affected Machine object.
If the cluster deployment is stuck on the same stage for a long time, it may
be related to configuration issues in the Machine or other deployment
objects.
To troubleshoot cluster deployment:
Identify the current deployment stage that got stuck:
The syslog container collects logs generated by Ansible during the node
deployment and cleanup and outputs them in the JSON format.
Note
Add COLLECT_EXTENDED_LOGS=true before the
collect_logs command to output the extended version of logs
that contains system and MKE logs, logs from LCM Ansible and LCM Agent
along with cluster events and Kubernetes resources description and logs.
Without the --extended flag, the basic version of logs is collected, which
is sufficient for most use cases. The basic version of logs contains all
events, Kubernetes custom resources, and logs from all cluster components.
This version does not require passing --key-file.
The logs are collected in the directory where the bootstrap script
is located.
The Container Cloud logs structure in <output_dir>/<cluster_name>/
is as follows:
/events.log
Human-readable table that contains information about the cluster events.
/system
System logs.
/system/mke (or /system/MachineName/mke)
Mirantis Kubernetes Engine (MKE) logs.
/objects/cluster
Logs of the non-namespaced Kubernetes objects.
/objects/namespaced
Logs of the namespaced Kubernetes objects.
/objects/namespaced/<namespaceName>/core/pods
Logs of the pods from a specific Kubernetes namespace. For example, logs
of the pods from the kaas namespace contain logs of Container Cloud
controllers, including bootstrap-cluster-controller
since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
Logs collected by the syslog container during the bootstrap phase
are not transferred to the management cluster during pivoting. These logs
are located in /volume/log/ironic/ansible_conductor.log inside the
Ironic pod.
Each log entry of the management cluster logs contains a request ID that
identifies chronology of actions performed on a cluster or machine.
The format of the log entry is as follows:
<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>
For example, bm.machine.req:28 contains information about the task 28
applied to a bare metal machine.
Since Container Cloud 2.22.0 (Cluster release 11.6.0), the logging format has
the following extended structure for the admission-controller,
storage-discovery, and all baremetal-provider services of a management
cluster:
level
Informational level. Possible values: debug, info, warn,
error, panic.
ts
Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example:
2022-11-14T21:37:23Z.
logger
Details on the process ID being logged:
<processID>
Primary process identifier. The list of possible values includes bm,
iam, license, and bootstrap.
Note
The iam and license values are available since
Container Cloud 2.23.0 (Cluster release 11.7.0). The bootstrap
value is available since Container Cloud 2.25.0 (Cluster release
16.0.0).
<subProcessID(s)>
One or more secondary process identifiers. The list of possible values
includes cluster, machine, controller, and cluster-ctrl.
Note
The controller value is available since Container Cloud
2.23.0 (Cluster release 11.7.0).
The cluster-ctrl value is available since Container Cloud
2.25.0 (Cluster release 16.0.0) for the bootstrap process
identifier.
req
Request ID number that increases when a service performs the following
actions:
Receives a request from Kubernetes about creating, updating, or
deleting an object
Receives an HTTP request
Runs a background process
The request ID allows combining all operations performed with an object
within one request. For example, the result of a Machine object
creation, update of its statuses, and so on has the same request ID.
caller
Code line used to apply the corresponding action to an object.
msg
Description of a deployment or update phase. If empty, it contains the
"error" key with a message followed by the "stacktrace" key with
stack trace details. For example:
"msg"="" "error"="Cluster nodes are not yet ready" "stacktrace": "<stack-trace-info>"
The log format of the following Container Cloud components does
not contain the "stacktrace" key for easier log handling:
baremetal-provider, bootstrap-provider, and
host-os-modules-controller.
Note
Logs may also include a number of informational key-value pairs
containing additional cluster details. For example,
"name":"object-name","foobar":"baz".
Depending on the type of issue found in logs, apply the corresponding fixes.
For example, if you detect the LoadBalancer ERROR state errors
during the bootstrap of an OpenStack-based management cluster,
contact your system administrator to fix the issue.
This section provides solutions to the issues that may occur while managing
your bare metal infrastructure.
Log in to the IPA virtual console for hardware troubleshooting¶
Container Cloud uses kernel and initramfs files with the
pre-installed Ironic Python Agent
(IPA) for inspection of server hardware. The IPA image initramfs is based
on Ubuntu Server.
If you need to troubleshoot hardware during inspection, you can use the IPA
virtual console to assess hardware logs and image configuration.
To log in to the IPA virtual console of a bare metal host:
Create the bare metal host object for the required bare metal host as
described in Add a bare metal host using CLI and wait for inspection to complete.
Caution
Meantime, do not create the Machine object for the
bare metal host being inspected to prevent automatic provisioning.
Using the pwgen utility, recover the dynamically calculated
password of the IPA image:
Remotely log in to the IPA console of the bare metal host using the
devuser user name and the password obtained in the previous step.
For example, use IPMItool, Integrated Lights-Out, or the iDRAC web UI.
Note
To assess the IPA logs, use the
journalctl -u ironic-python-agent.service command.
Bare metal hosts in ‘provisioned registration error’ state after update¶
After update of a management or managed cluster created using the Container
Cloud release earlier than 2.6.0, a bare metal host state is
Provisioned in the Container Cloud web UI while having the error state
in logs with the following message:
The issue is caused by the image URL pointing to an unavailable resource
due to the URI IP change during update. To apply the issue resolution, update
URLs for the bare metal host status and spec with the correct values
that use a stable DNS record as a host.
To apply the issue resolution:
Note
In the commands below, we update master-2 as an example.
Replace it with the corresponding value to fit your deployment.
Exit Lens.
In a new terminal, configure access to the affected cluster.
Close the terminal to quit kube-proxy and resume Lens.
Inspection error on bare metal hosts after dnsmasq restart¶
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
the inspection error state in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If changing or adding of DHCP subnets is required to bootstrap
new nodes, wait after changing or adding DHCP subnets until the
dnsmasq pod becomes ready, then create BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
To apply the issue resolution:
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
Troubleshoot an operating system upgrade with host restart¶
Mandatory host restart for the operating system (OS) upgrade is designed to be
safe and takes certain precautions to protect the user data and the cluster
integrity. However, sometimes it may result in a host-level failure and block
the cluster update. Use this section to troubleshoot such issues.
Warning
The OS upgrade cannot be rolled back on a host or cluster level.
If the OS upgrade fails, recover or remove the faulty host before you can
complete the cluster upgrade.
Caution
Depending on the cluster configuration, applying security
updates and host restart can increase the update time for each node to up to
1 hour.
Cluster nodes are updated one by one. Therefore, for large clusters,
the update may take several days to complete.
If the cluster upgrade does not start, verify whether the
ceph-clusterworkloadlock object is present in the Container Cloud
Management API:
kubectl get clusterworkloadlocks
Example of system response:
NAME                       AGE
ceph-clusterworkloadlock   7h37m
This object indicates that LCM operations that require hosts restart cannot
start on the cluster. The Ceph Controller verifies that Ceph services are
prepared for restart. Once the Ceph Controller completes verification, it
removes the ceph-clusterworkloadlock object and the cluster upgrade starts.
If this object is still present after the upgrade is initiated, assess the
logs of the ceph-controller pod to identify and fix errors:
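A hedged example of assessing these logs, assuming the ceph-controller pods run in the ceph-lcm-mirantis namespace with an app=ceph-controller label; verify the actual namespace and label on your cluster:
kubectl -n ceph-lcm-mirantis logs -l app=ceph-controller --tail=100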
If the host cannot boot after upgrade, verify the following possible issues:
Invalid boot order configuration in the host BIOS settings
Inspect the host settings using the IPMI console. If you see a message
about an invalid boot device, verify and correct the boot order in the host
BIOS settings. Set the first boot device to a network card and the second
device to a local disk (legacy or UEFI).
The host is stuck in the GRUB rescue mode
If you see the following message, you are likely affected by the
Ubuntu known issue
in the Ubuntu grub-installer:
Entering rescue mode...
grub rescue>
In this case, redeploy the host with a correctly defined
BareMetalHostProfile. You will have to delete the corresponding
Machine resource and create a new Machine with the corresponding
BareMetalHostProfile. For details, see Create MOSK host profiles.
Container Cloud relies on iPXE to remotely bootstrap bare metal machines
before provisioning them to Kubernetes clusters. The remote bootstrap with
iPXE depends on the state of the underlay network. Incorrect or suboptimal
configuration of the underlay network can cause the process to fail.
The following error may mean that network configuration is incorrect:
The reason for this error may be a network switch that does not forward
packets for a prolonged period after the server brings up a link to a switch
port. This may happen because the switch waits for the Spanning Tree Protocol
(STP) configuration on the port.
To avoid this issue, configure the ports connecting the servers in
STP portfast mode. See details in the vendor documentation for your
particular network switch, for example:
Provisioning failure due to device naming issues in a bare metal host profile¶
During a bare metal host provisioning, transition to each stage implies the
host reboot. This may cause device name issues if a device is configured
using the by_name device identifier.
In Linux, assignment of device names, for example, /dev/sda,
to physical disks can change, especially in systems with multiple
disks or when hardware configuration changes. For example:
If you add or remove a hard drive or change the boot order, the device names
can shift.
If the system uses hardware with additional disk array controllers, such as
RaidControllers in the JBOD mode, device names can shift during
reboot. This can lead to unintended consequences and potential data loss if
your file systems are not mounted correctly.
The /dev/sda partition on the first boot may become /dev/sdb on the
second boot. Consequently, your file system may not be provisioned as
expected, leading to errors during disk formatting and assembling.
Common Linux practice recommends using unique identifiers (UUIDs) or labels
for device identification in /etc/fstab. These identifiers are more stable and
ensure that the defined devices are mounted regardless of naming changes.
Therefore, to prevent device naming issues during a bare metal host
provisioning, instead of the by_name identifier, Mirantis recommends using
the workBy parameter along with device labels or filters such as
minSize and maxSize. These device settings ensure a successful
bare metal host provisioning with /dev/disk/by-uuid/<UUID> or
/dev/disk/by-label/<label> in /etc/fstab. For details on workBy,
see BareMetalHostProfile spec.
Overview of the device naming logic in a bare metal host profile¶
To manage physical devices, the bare metal provider uses the following
entities:
The status.hardware.storage fields of the BareMetalHost object
Initial description of physical disks that is discovered only once during
a bare metal host inspection.
The status.hostInfo.storage fields of the LCMMachine object
Current state of physical disks during life cycle of Machine and
LCMMachine objects.
The default device naming workflow during management of BareMetalHost and
BareMetalHostProfile objects is as follows:
An operator creates the BareMetalHostInventory and
BareMetalHostCredential objects.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see Container Cloud API Reference:
BareMetalHost resource.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
The baremetal-operator service inspects the objects.
The operator creates or reviews an existing BareMetalHostProfile object
using the status.hardware.storage fields of the BareMetalHost
object associated with the created BareMetalHostInventory object. For
details, see Create a custom bare metal host profile.
The baremetal-provider service starts processing
BareMetalHostProfile and searching for suitable hardware disks to build
the internal AnsibleExtra object configuration. During the building
process:
The first suitable disk for an item in the bmhp.spec.devices list
is selected.
The cleanup and provisioning stage of BareMetalHost starts:
During provisioning, the selection order described in bmhp.workBy
applies. For details, see Create MOSK host profiles.
This logic ensures that an exact by_id name is taken from the
discovery stage, as opposed to by_name, which can change during the
transition from the inspection to the provisioning stage.
After provisioning finishes, the target system /etc/fstab is
generated using UUIDs.
This section describes how to recover a failed or accidentally removed Ceph
cluster in the following cases:
If Ceph Controller underlying a running Rook Ceph cluster has failed and
you want to install a new Ceph Controller Helm release and recover the failed
Ceph cluster onto the new Ceph Controller.
To migrate the data of an existing Ceph cluster to a new deployment in case
downtime can be tolerated.
Consider the common state of a failed or removed Ceph cluster:
The rook-ceph namespace does not contain pods or they are in the
Terminating state.
The rook-ceph or/and ceph-lcm-mirantis namespaces are in the
Terminating state.
The ceph-operator is in the FAILED state:
Management cluster: the state of the ceph-operator Helm release in the
management HelmBundle, such as default/kaas-mgmt, has switched from
DEPLOYED to FAILED.
Managed cluster: the state of the osh-system/ceph-operator
HelmBundle, or a related namespace, has switched from DEPLOYED to
FAILED.
The Rook CephCluster, CephBlockPool, CephObjectStore CRs in the
rook-ceph namespace cannot be found or have the deletionTimestamp
parameter in the metadata section.
Note
Prior to recovering the Ceph cluster, verify that your deployment
meets the following prerequisites:
The Ceph cluster fsid exists.
The Ceph cluster Monitor keyrings exist.
The Ceph cluster devices exist and include the data previously handled by
Ceph OSDs.
Back up the remaining resources. Skip the commands for the resources that
have already been removed:
kubectl -n rook-ceph get cephcluster <clusterName> -o yaml > backup/cephcluster.yaml
# perform this for each cephblockpool
kubectl -n rook-ceph get cephblockpool <cephBlockPool-i> -o yaml > backup/<cephBlockPool-i>.yaml
# perform this for each client
kubectl -n rook-ceph get cephclient <cephclient-i> -o yaml > backup/<cephclient-i>.yaml
kubectl -n rook-ceph get cephobjectstore <cephObjectStoreName> -o yaml > backup/<cephObjectStoreName>.yaml
# perform this for each secret
kubectl -n rook-ceph get secret <secret-i> -o yaml > backup/<secret-i>.yaml
# perform this for each configMap
kubectl -n rook-ceph get cm <cm-i> -o yaml > backup/<cm-i>.yaml
SSH to each node where the Ceph Monitors or Ceph OSDs were placed before the
failure and back up the valuable data:
Before MOSK 25.1, use MiraCephLog
instead of MiraCephHealth as the object name and in the command
above.
Edit the CephCluster, CephBlockPool, CephClient, and
CephObjectStore CRs of the rook-ceph namespace and remove the
finalizer parameter from the metadata section:
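For example, a hedged sketch of this step for the CephCluster resource; repeat similarly for the other CRs:
kubectl -n rook-ceph edit cephcluster <clusterName>
# In the editor, delete the metadata.finalizers list, then save and exit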
Once you clean up every single resource related to the Ceph release,
open the Cluster CR for editing:
kubectl -n <projectName> edit cluster <clusterName>
Substitute <projectName> with default for the management cluster
or with a related project name for the managed cluster.
Remove the ceph-controller Helm release item from the
spec.providerSpec.value.helmReleases array and save the Cluster
CR:
- name: ceph-controller
  values: {}
Verify that ceph-controller has disappeared from the corresponding
HelmBundle:
kubectl -n <projectName> get helmbundle -o yaml
Open the KaaSCephCluster CR of the related management or managed cluster
for editing:
kubectl -n <projectName> edit kaascephcluster
Substitute <projectName> with default for the management cluster or
with a related project name for the managed cluster.
Edit the roles of nodes. The entire nodes spec must contain only one
mon role. Save KaaSCephCluster after editing.
Open the Cluster CR for editing:
kubectl -n <projectName> edit cluster <clusterName>
Substitute <projectName> with default for the management cluster or
with a related project name for the managed cluster.
Add ceph-controller to spec.providerSpec.value.helmReleases to
restore the ceph-controller Helm release. Save Cluster after
editing.
- name: ceph-controller
  values: {}
Verify that the ceph-controller Helm release is deployed:
Inspect the Rook Operator logs and wait until the orchestration has
settled:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify that in the rook-ceph namespace, the rook-ceph-mon-a,
rook-ceph-mgr-a, and all auxiliary pods are up and running, and that
no rook-ceph-osd-ID-xxxxxx pods are running:
kubectl -n rook-ceph get pod
Verify the Ceph state. The output must indicate that one mon and one
mgr are running, all Ceph OSDs are down, and all PGs are in the
Unknown state.
Rook should not start any Ceph OSD daemon because all devices
belong to the old cluster that has a different fsid. To verify the
Ceph OSD daemons, inspect the osd-prepare pods logs:
kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare
Connect to the terminal of the rook-ceph-mon-a pod:
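A hedged example of this step; the deployment-based target below is an assumption, and the exact pod name on your cluster differs:
kubectl -n rook-ceph exec -it deploy/rook-ceph-mon-a -- bash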
Inside the container, create /etc/ceph/ceph.conf for a stable
operation of ceph-mon:
touch /etc/ceph/ceph.conf
Change the directory to /var/lib/rook and edit monmap by
replacing the existing mon hosts with the new mon-a endpoints:
cd /var/lib/rook
rm /var/lib/rook/mon-a/data/store.db/LOCK  # make sure the quorum lock file does not exist
ceph-mon --extract-monmap monmap --mon-data ./mon-a/data  # Extract monmap from old ceph-mon db and save as monmap
monmaptool --print monmap  # Print the monmap content, which reflects the old cluster ceph-mon configuration.
monmaptool --rm a monmap  # Delete `a` from monmap.
monmaptool --rm b monmap  # Repeat, and delete `b` from monmap.
monmaptool --rm c monmap  # Repeat this pattern until all the old ceph-mons are removed and monmap won't be empty
monmaptool --addv a [v2:<nodeIP>:3300,v1:<nodeIP>:6789] monmap  # Replace it with the rook-ceph-mon-a address you got from previous command.
ceph-mon --inject-monmap monmap --mon-data ./mon-a/data  # Replace monmap in ceph-mon db with our modified version.
rm monmap
exit
Substitute <nodeIP> with the IP address of the current <nodeName>
node.
Close the SSH connection.
Change fsid to the original one to run Rook as an old cluster:
kubectl -n rook-ceph edit secret/rook-ceph-mon
Note
The fsid is base64 encoded and must not contain a trailing
carriage return. For example:
echo -n a811f99a-d865-46b7-8f2c-f94c064e4356 | base64  # Replace with the fsid from the old cluster.
Scale the ceph-lcm-mirantis/ceph-controller deployment replicas to
0:
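A minimal sketch of this step, assuming the deployment is named ceph-controller in the ceph-lcm-mirantis namespace referenced above:
kubectl -n ceph-lcm-mirantis scale deployment ceph-controller --replicas 0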
Inspect the Rook Operator logs and wait until the orchestration has settled:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify that in the rook-ceph namespace, the rook-ceph-mon-a,
rook-ceph-mgr-a, and all auxiliary pods are up and running, and that
the number of running rook-ceph-osd-ID-xxxxxx pods is greater than
zero:
kubectl -n rook-ceph get pod
Verify the Ceph state. The output must indicate that one mon, one
mgr, and all Ceph OSDs are up and running and all PGs are either in the
Active or Degraded state:
Inspect the Rook Operator logs and wait until the orchestration has settled:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify that in the rook-ceph namespace, the rook-ceph-mon-a,
rook-ceph-mgr-a, and all auxiliary pods are up and running, and that
the number of running rook-ceph-osd-ID-xxxxxx pods is greater than
zero:
kubectl -n rook-ceph get pod
Verify the Ceph state. The output must indicate that one mon, one
mgr, and all Ceph OSDs are up and running and the overall stored data
size equals the old cluster data size.
Edit the MiraCeph CR and add two more mon and mgr roles to the
corresponding nodes:
kubectl -n ceph-lcm-mirantis edit miraceph
Inspect the Rook namespace and wait until all Ceph Monitors are in the
Running state:
kubectl -n rook-ceph get pod -l app=rook-ceph-mon
Verify the Ceph state. The output must indicate that three mon (three in
quorum), one mgr, and all Ceph OSDs are up and running, and the overall
stored data size equals the old cluster data size.
Identify the Ceph Monitor pods in the Error or CrashLoopBackOff
state:
kubectl -n rook-ceph get pod -l 'app in (rook-ceph-mon,rook-ceph-mon-canary)'
Verify that the affected pods contain the failure logs described above:
kubectl -n rook-ceph logs <failedMonPodName>
Substitute <failedMonPodName> with the Ceph Monitor pod name. For
example, rook-ceph-mon-g-845d44b9c6-fjc5d.
Save the identifying letters of failed Ceph Monitors for further usage.
For example, f, e, and so on.
Delete all corresponding deployments of these pods:
Identify the affected Ceph Monitor pod deployments:
kubectl -n rook-ceph get deploy -l 'app in (rook-ceph-mon,rook-ceph-mon-canary)'
Delete the affected Ceph Monitor pod deployments. For example, if the
Ceph cluster has the rook-ceph-mon-c-845d44b9c6-fjc5d pod in
the CrashLoopBackOff state, remove the corresponding
rook-ceph-mon-c:
kubectl -n rook-ceph delete deploy rook-ceph-mon-c
Canary mon deployments have the suffix -canary.
Remove all corresponding entries of Ceph Monitors from the MON map:
Inspect the current MON map and save the IP addresses of the failed Ceph
monitors for further usage:
ceph mon dump
Remove all entries of failed Ceph Monitors using the previously saved
letters:
ceph mon rm <monLetter>
Substitute <monLetter> with the corresponding letter of a failed Ceph
Monitor.
Exit the ceph-tools pod.
Remove all failed Ceph Monitors entries from the Rook mon endpoints
ConfigMap:
Open the rook-ceph/rook-ceph-mon-endpoints ConfigMap for editing:
kubectl -n rook-ceph edit cm rook-ceph-mon-endpoints
Remove all entries of failed Ceph Monitors from the ConfigMap data and
update the maxMonId value with the current number of Running
Ceph Monitors. For example, rook-ceph-mon-endpoints has the
following data:
KaaSCephOperationRequest failure with a timeout during rebalance¶
Ceph OSD removal procedure includes the Ceph OSD out action that starts
the Ceph PGs rebalancing process. The total time for rebalancing depends on a
cluster hardware configuration: network bandwidth, Ceph PGs placement, number
of Ceph OSDs, and so on. The default rebalance timeout is limited to 30
minutes, which suits standard cluster configurations.
If the rebalance takes more than 30 minutes, the KaaSCephOperationRequest
resources created for removing Ceph OSDs or nodes fail with the following
example message:
status:
  removeStatus:
    osdRemoveStatus:
      errorReason: Timeout (30m0s) reached for waiting pg rebalance for osd 2
      status: Failed
To apply the issue resolution, increase the timeout for all future
KaaSCephOperationRequest resources:
On the management cluster, open the Cluster resource of the affected
managed cluster for editing:
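For example, a minimal sketch of the edit command, assuming <project> and <clusterName> are the namespace and name of the affected managed cluster:
kubectl -n <project> edit cluster <clusterName>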
The MON_DISK_LOW Ceph Cluster health message indicates that the
store.db size of the Ceph Monitor is rapidly growing and the compaction
procedure is not working. In most cases, store.db starts storing a
number of logm keys that are buffered due to Ceph OSD shadow errors.
Once prompted Continue?, first verify that rebalancing has finished
for the Ceph cluster, the Ceph OSD is up and in, and all PGs have
returned to their original state:
After some of the affected Ceph OSDs restart, Ceph Monitors will start
decreasing the store.db size to the original 100-300 MB. However,
complete the restart of all Ceph OSDs.
Replaced Ceph OSD fails to start on authorization¶
In rare cases, when the replaced Ceph OSD has the same ID as the previous Ceph
OSD and starts on a device with the same name as the previous Ceph OSD, Rook
fails to update the keyring value, which is stored on a node in the
corresponding host path. Thereby, Ceph OSD cannot start and fails with the
following example log output:
The cluster is affected if keyrings of the failed Ceph OSD of the host
path and Ceph cluster differ. If so, proceed to fixing them and unblock
the failed Ceph OSD.
To fix different keyrings and unblock the Ceph OSD authorization:
Obtain the keyring value of the host path for this Ceph OSD:
SSH on a node hosting the required Ceph OSD.
In /var/lib/rook/rook-ceph, search for a directory containing
the keyring and whoami files that have the number of the
required Ceph OSD. For example:
Export the current Ceph OSD keyring stored in the Ceph cluster:
ceph auth get osd.<ID> -o /tmp/key
Replace the exported key with the value from keyring.
For example:
vi /tmp/key
# Replace the key with the one from the keyring file:
[osd.3]
        key = AQD2k/BlcE+YJxAA/QsD/fIAL1qPrh3hjQ7AKQ==
        caps mgr = "allow profile osd"
        caps mon = "allow profile osd"
        caps osd = "allow *"
Import the replaced Ceph OSD keyring to the Ceph cluster:
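For example, a minimal sketch using the exported file from the previous step:
ceph auth import -i /tmp/key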
The ceph-exporter pods are present in the Ceph crash list¶
After a managed cluster update, the ceph-exporter pods may be present in
the ceph crash ls or ceph health detail list while
rook-ceph-exporter attempts to obtain the port that is still in use.
For example:
PostgreSQL replication in a Patroni cluster is based on the Write-Ahead Log
(WAL) syncing between the cluster leader and replica. Occasionally, this
mechanism may lag due to networking issues, missing WAL segments (on rotation
or recycle), increased Patroni Pods CPU usage, or due to a hardware failure.
In StackLight, the PostgresqlReplicationSlowWalDownload alert indicates
that the Patroni cluster Replica is out of sync. This alert has the
Warning severity because under such conditions the Patroni cluster is still
operational and the issue may disappear without intervention. However, a
persisting replication lag may impact the cluster availability if another Pod
in the cluster fails, leaving the leader alone to serve requests. In this case,
the Patroni leader will become read-only and unable to serve write requests,
which can cause outage of Alerta backed by Patroni. Grafana, which also uses
Patroni, will still be operational but any dashboard changes will not be saved.
Therefore, if PostgresqlReplicationSlowWalDownload fires, observe the
cluster and fix it if the issue persists or if the lag grows.
In the Alertmanager or Alerta web UI, verify that no new alerts are firing
for Patroni. The PostgresqlInsufficientWorkingMemory alert may become
pending during the operation but should not fire.
Alertmanager does not send resolve notifications for custom alerts¶
Due to the Alertmanager issue, Alertmanager loses
the in-memory alerts during restart. As a result, StackLight does not send
notifications for custom alerts in the following case:
Adding a custom alert.
Then removing the custom alert and at the same time changing the
Alertmanager configuration such as adding or removing a receiver.
For a removed custom alert, Alertmanager does not send a resolve notification
to any of the configured receivers. Therefore, until after the time set in
repeat_interval (3 hours by default), the alert will be visible in all
receivers but not in the Prometheus and Alertmanager web UIs.
When the alert is re-added, Alertmanager does not send a firing notification
for it until after the time set in repeat_interval, but the alert will be
visible in the Prometheus and Alertmanager web UIs.
The alert is based on the metric container_cpu_cfs_throttled_periods_total
over container_cpu_cfs_periods_total and means the percentage of
CPU periods where the container ran but was throttled (stopped from
running the whole CPU period).
Investigation
The alert usually fires when a Pod starts, often during brief
intervals. It may solve automatically once the Pod CPU usage stabilizes.
If the issue persists:
Obtain the created_by_name label from the alert.
List the affected Pods using the created_by_name label:
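For example, a hypothetical sketch that matches Pods by their owner name in a given namespace; <created_by_name> and <namespace> come from the alert labels, and jq is assumed to be available:
kubectl -n <namespace> get pods -o json \
  | jq -r '.items[] | select(.metadata.ownerReferences[]?.name == "<created_by_name>") | .metadata.name'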
Helm Controller release status differs from deployed. Broken
HelmBundle configurations or missing Helm chart artifacts may cause this
when applying the HelmBundle update.
Investigation
Inspect logs of every Helm Controller Pod for error or warning
messages:
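For example, a sketch assuming the Helm Controller Pods carry the app=helm-controller label; adjust the namespace and label selector to your cluster:
kubectl -n <namespace> logs -l app=helm-controller --all-containers --tail=100 | grep -iE 'error|warn'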
In case of an error to fetch the chart, review the chartURL
fields of the HelmBundle object to verify that the chart URL does not
have typographical errors:
Verify that the chart artifact is accessible from your cluster.
Mitigation
If the chart artifact is not accessible from your cluster, investigate
the network-related alerts, if any, and verify that the file is
available in the repository.
Prometheus fails in at least 10% of Helm Controller metrics scrapes. The
following two components can cause the alert to fire:
Helm Controller Pod(s):
If the Pod is down.
If the Pod target endpoint is at least partially unresponsive. For
example, in case of CPU throttling, application error preventing a
restart, or container flapping.
Prometheus server if it cannot reach the helm-controller endpoint(s).
For each object, identify the deprecated modules being used
and the list of modules that replace the deprecated ones:
kubectl get hoc <hoc_name> -n <hoc_namespace> \
  -o go-template="{{range .status.configs}}\
{{if .moduleDeprecatedBy}}\
deprecated: {{.moduleName}}-{{.moduleVersion}};\
update to: {{range .moduleDeprecatedBy}}{{.name}}-{{.version}}; {{end}}\
{{\"\n\"}}\
{{end}}\
{{end}}"
Read through the documentation or README of the new module,
and manually update all affected HostOSConfiguration objects
migrating from the deprecated version.
Mitigation
Use up-to-date versions of modules during configuration.
Available since MCC 2.27.0 (Cluster releases 17.2.0 and 16.2.0)
Root cause
Kernel logs generated IO error logs, potentially indicating disk issues.
IO errors may occur due to various reasons and are often unpredictable.
Investigation
Inspect kernel logs on affected nodes for IO errors to pinpoint the issue,
identify the affected disk, if any, and assess its condition. Most major
Linux distributions store kernel logs in /var/log/dmesg and occasionally
in /var/log/kern.log.
If the issue is not related to a faulty disk, further inspect errors in logs
to identify the root cause.
Mitigation
Mitigation steps depend on the identified issue. If the issue is caused by
a faulty disk, replace the affected disk. Additionally, consider the following
measures to prevent such issues in the future:
Implement proactive monitoring of disk health to detect early signs of failure
and initiate replacements preemptively.
Utilize tools such as smartctl or nvme for routine
collection of disk metrics, enabling prediction of failures and early
identification of underperforming disks to prevent major disruptions.
Related inhibited alert: KubePodsRegularLongTermRestarts.
Root cause
Termination of containers in Pods having .spec.restartPolicy set to
Always causes Kubernetes to bring them back. If the container exits
again, kubelet exponentially increases the back-off delay between next
restarts until it reaches 5 minutes. The Pods stuck in a restart
loop get the CrashLoopBackOff status. Because of the underlying
metric inertia, StackLight measures restarts in an extended 20-minute
time window.
Investigation
Note
Verify if there are more alerts firing in the MOSK
cluster to obtain more information on the cluster state and simplify
issue investigation and mitigation.
Also examine the relation of the affected application with other
applications (dependencies) and Kubernetes resources it relies on.
During investigation, the affected Pod will likely be in the
CrashLoopBackOff or Error state.
List the unhealthy Pods of a particular application. Use the label
selector, if possible.
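For example, a generic sketch that filters out healthy Pods; the namespace and label selector are placeholders:
kubectl -n <namespace> get pods -l <label_selector> --field-selector=status.phase!=Running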
Collect logs from one of the unhealthy Pods and inspect them for
errors and stack traces:
kubectl logs -n <pod_namespace> <pod_name>
Inspect Kubernetes events or the termination reason and exit code of
the Pod:
kubectl describe pods -n <pod_namespace> <pod_name>
Alternatively, inspect K8S Events in the OpenSearch
Dashboards web UI.
In the Kubernetes Pods Grafana dashboard, monitor the Pod
resources usage.
Important
Performing the following step requires understanding
of Kubernetes workloads.
In some scenarios, observing Pods failing in real time may provide
more insight on the issue. To investigate the application this way,
restart (never with the --force flag) one of the failing Pods and
inspect the following in the Kubernetes Pods Grafana
dashboard, events and logs:
Define whether the issue reproduces.
Verify when the issue reproduces during the Pod uptime: during the
initialization or after some time.
Verify that the application requirements for Kubernetes resources
and external dependencies are satisfied.
Define whether there is an issue with passing the readiness or
liveness tests.
Define how the Pod container terminates and whether it is
OOMKilled.
Note
While investigating, monitor the application health and
verify the resource limits. Most issues can be solved by fixing
the dependent application or tuning, such as providing additional
flags, changing resource limits, and so on.
Mitigation
Fixes typically fall into one of the following categories:
Fix the dependent service. For example, fixing opensearch-master
makes fluentd-logs Pods start successfully.
Fix the configuration if it causes container failure.
Tune the application by providing flags changing its behavior.
Tune the CPU or MEM limits if the system terminates a container upon
hitting the memory limits (OOMKilled) or stops responding because
of CPU throttling.
The Pod could not start successfully for the last 15 minutes, meaning
that its status phase is one of the following:
Pending - at least one Pod container was not created. The Pod
waits for the Kubernetes cluster to satisfy its requirement. For
example, in case of failure to pull the Docker image or create a
persistent volume.
Failed - the Pod terminated in the Error state and was not
restarted. At least one container exited with a non-zero status code
or was terminated by the system, for example, OOMKilled.
Unknown - kubelet communication issues.
Investigation
Note
Verify if there are more alerts firing in the MOSK
cluster to obtain more information on the cluster state and simplify
issue investigation and mitigation.
Also examine the relation of the affected application with other
applications (dependencies) and Kubernetes resources it relies on.
List the unhealthy Pods of the affected application. Use the label
selector, if possible.
For one of the unhealthy Pods, verify Kubernetes events, termination
reason, and exit code (for Failed only) of the Pod:
kubectl describe pods -n <pod_namespace> <pod_name>
Alternatively, inspect K8S Events in the OpenSearch
Dashboards web UI.
For Failed Pods, collect logs and inspect them for errors and
stack traces:
kubectl logs -n <pod_namespace> <pod_name>
In the Kubernetes Pods Grafana dashboard, monitor the Pod
resources usage.
Mitigation
For Pending, investigate and fix the root cause of the missing Pod
requirements. For example, dependent application failure, unavailable
Docker registry, unresponsive storage provided, and so on.
It is a long-term version of the KubePodsCrashLooping alert, aiming
to catch Pod container restarts in wider time windows. The alert raises
when the Pod container restarts once a day in a 2-day time frame. It
may indicate that a pattern in the application lifecycle needs
investigation, such as deadlocks, memory leaks, and so on.
Investigation
While investigating, the affected Pod will likely be in the Running
state.
List the Pods of the application, which containers were restarted at
least twice. Use the label selector, if possible.
Collect logs for one of the affected Pods and inspect them for errors
and stack traces:
kubectl logs -n <pod_namespace> <pod_name>
In the OpenSearch Dashboards web UI, inspect the
K8S events dashboard. Filter the Pod using the
kubernetes.event.involved_object.name key.
In the Kubernetes Pods Grafana dashboard, monitor the Pod
resources usage. Filter the affected Pod and find the point in time
when the container was restarted. Observations may take several days.
Mitigation
Refer to the KubePodsCrashLoopingMitigation section.
Fixing this issue may require more effort than simple application
tuning. You may need to upgrade the application, upgrade its dependency
libraries, or apply a fix in the application code.
Deployment generation, or version, occupies 2 fields in the object:
.metadata.generation - the desired Deployment generation, updated upon
kubectl apply execution (a change triggers a new ReplicaSet rollout).
.status.observedGeneration - the generation observed by the Deployment
controller.
When the Deployment controller fails to observe a new Deployment
version, these 2 fields differ. The mismatch lasting for more than 15
minutes triggers the alert.
Investigation and mitigation
The alert indicates failure of a Kubernetes built-in Deployment
controller and requires debugging on the control plane level. See
Troubleshooting Guide for details on collecting cluster state and
mitigating known issues.
The number of available Deployment replicas did not match the desired
state set in the .spec.replicas field for the last 30 minutes,
meaning that at least one Deployment Pod is down.
The number of running StatefulSet replicas did not match the desired
state set in the .spec.replicas field for the last 30 minutes,
meaning that at least one StatefulSet Pod is down.
StatefulSet generation, or version, occupies 2 fields in the
object:
.metadata.generation - the desired StatefulSet generation, updated upon
kubectl apply execution (a change triggers a new rollout).
.status.observedGeneration - the generation observed by the StatefulSet
controller.
When the StatefulSet controller fails to observe a new
StatefulSet version, these 2 fields differ. The mismatch lasting for
more than 15 minutes triggers the alert.
Investigation and mitigation
The alert indicates failure of a Kubernetes built-in StatefulSet
controller and requires debugging on the control plane level. See
Troubleshooting Guide for details on collecting cluster state and mitigating
known issues.
Related inhibited alerts: KubeStatefulSetReplicasMismatch and
KubeStatefulSetUpdateNotRolledOut.
Root cause
StatefulSet workloads are typically distributed across Kubernetes
nodes. Therefore, losing more than one replica indicates either a
serious application failure or issues on the Kubernetes cluster level.
The application likely experiences severe performance degradation and
availability issues.
Investigation
Verify the StatefulSet status:
kubectl get sts -n <sts_namespace> <sts_name>
Inspect the related Kubernetes events for error messages and probe
failures:
kubectl describe sts -n <sts_namespace> <sts_name>
If events are unavailable, inspect K8S Events in the
OpenSearch Dashboards web UI.
List the StatefulSet Pods and verify them one by one. Use the
label selectors, if possible.
Refer to KubePodsCrashLooping. If after fixing the root cause
on the Pod level the affected Pods are still non-Running, contact
Mirantis support. StatefulSets must be treated with special caution
as they store data and their internal state.
The StatefulSet update did not finish in 30 minutes, which was
detected in the mismatch of the .spec.replicas and
.status.updatedReplicas fields. Such issue may occur during a
rolling update if a newly created Pod fails to pass the readiness test
and blocks the update.
The output includes the number of updated Pods. In Container Cloud,
StatefulSets use the RollingUpdate strategy for upgrades and the
Pod management policy does not affect updates. Therefore,
investigation requires verifying the failing Pods only.
List the non-Running Pods of the StatefulSet and inspect them one
by one for error messages and probe failures. Use the label
selectors, if possible.
See KubePodsCrashLooping. Pay special attention to the
information about the application cluster issues, as clusters in
Container Cloud are deployed as StatefulSets.
If none of these alerts apply and the new Pod is stuck failing to
pass postStartHook (Pod is in the PodInitializing state) or
the readiness probe (Pod in the Running state, but not fully
ready, for example, 0/1) it may be caused by Pod inability to
join the application cluster. Investigating such issue requires
understanding how the application cluster initializes and how nodes
join the cluster. The PodInitializing state may be especially
problematic as the kubectl logs command does not collect
logs from such Pod.
Warning
Perform the following step with caution and remember to
perform a rollback afterward.
In some StatefulSets, disabling postStartHook unlocks the Pod
to the Running state and allows for logs collection.
If after fixing the root cause on the Pod level the affected Pods are
still non-Running, contact Mirantis support. Treat StatefulSets with
special caution as they store data and their internal state. Improper
handling may result in a broken application cluster state and data loss.
For the last 30 minutes, DaemonSet has at least one Pod (not necessarily
the same one), which is not ready after being correctly scheduled. It
may be caused by missing Pod requirements on the node or unexpected Pod
termination.
Can relate to: KubeCPUOvercommitPods, KubeMemOvercommitPods.
Root cause
At least one Pod of the DaemonSet was not scheduled to a target node.
This may happen if resource requests for the Pod cannot be satisfied
by the node or if the node lacks other resources that the Pod requires,
such as PV of a specific storage class.
Investigation
Identify the number of available and desired Pods of the DaemonSet:
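For example, a sketch that prints the desired and available Pod counts; substitute the namespace and DaemonSet name:
kubectl -n <ds_namespace> get ds <ds_name> \
  -o custom-columns='NAME:.metadata.name,DESIRED:.status.desiredNumberScheduled,AVAILABLE:.status.numberAvailable'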
At least one node where the DaemonSet Pods were deployed got a
NoSchedule taint added afterward. Taints are respected during the
scheduling stage only, and the Pod is currently considered unschedulable
to such nodes.
Related inhibiting alert: KubeDaemonSetRolloutStuck.
Root cause
Although the DaemonSet was not scaled down to zero, there are zero
healthy Pods. As each DaemonSet Pod is deployed to a separate Kubernetes
node, such situation is rare and typically caused by a broken
configuration (ConfigMaps or Secrets) or wrongly tuned resource limits.
If the output is true, Kubernetes will not allow new Pods to run
until the current one terminates. In this case, investigate and fix
the issue on the application level.
In case of events similar to Cannot determine if job needs to be
started. Too many missed start time (> 100). Set or decrease
.spec.startingDeadlineSeconds or check clock skew.:
Verify if the ClockSkewDetected alert is firing for the
affected cluster.
Investigate and fix the root cause of missing Pod requirements,
such as failing dependency application, Docker registry
unavailability, unresponsive storage provided, and so on.
The sum of Kubernetes Pods CPU requests is higher than the average
capacity of the cluster without one node or 80% of total nodes CPU
capacity, depending on what is higher. It is a common issue of a cluster
with too many resources deployed.
Investigation
Select one of the following options to verify nodes CPU requests:
Inspect the Allocated resources section in the output of the
following command:
kubectl describe nodes
Inspect the Cluster CPU Capacity panel of the
Kubernetes Cluster Grafana dashboard.
Mitigation
Increase the node(s) CPU capacity or add a worker node(s).
The sum of Kubernetes Pods RAM requests is higher than the average
capacity of the cluster without one node or 80% of total nodes RAM
capacity, depending on what is higher. It is a common issue of a cluster
with too many resources deployed.
Investigation
Select one of the following options to verify nodes RAM requests:
Inspect the Allocated resources section in the output of the
following command:
kubectl describe nodes
Inspect the Cluster Mem Capacity panel of the
Kubernetes Cluster Grafana dashboard.
Mitigation
Increase the node(s) RAM capacity or add a worker node(s).
Verify the configured application retention period.
Optional. Review the data stored on the PV, including the
application data, logs, and so on, to verify the space consumption
and eliminate potential overuse:
kubectl get pv -o json | jq -r '.items[] | select(.status.phase=="Pending" or .status.phase=="Failed") | .metadata.name'
For the PVs in the Failed or Pending state:
kubectl describe pv <pv_name>
Inspect Kubernetes events, if available. Otherwise:
In the Discover section of the OpenSearch Dashboards
web UI, change the index pattern to
kubernetes_events-*.
Expand the time range and filter the results by
kubernetes.event.involved_object.name, which equals to
the <pv_name> from the previous step. In the matched results,
find the kubernetes.event.message field.
If the PV is in the Pending state, it waits to be provisioned.
Verify the PV storage class name:
kubectl get pv <pv_name> -o=json | jq -r '.spec.storageClassName'
Verify the provisioner name specified for the storage class:
kubectl get sc <sc_name> -o=json | jq -r '.provisioner'
If the provisioner is deployed as a workload in the affected
Kubernetes cluster, verify if it experiences availability or health
issues. Further investigation and mitigation depends on the
provisioner. The Failed state can be caused by a custom recycler
error when a deprecated Recycle reclaim policy is used.
Mitigation
Fix the PV in Pending state according to the investigation
outcome.
Warning
Deleting a PV causes data loss. Removing PVCs causes
deletion of a PV with the Delete reclaim policy.
Fix the PV in the Failed state:
Investigate the recycler Pod by verifying the
kube-controller-manager configuration. Search for the PV in the
Pod logs.
Delete the Pod and mounted PVC if it is still in the
Terminating state.
If the distribution is extremely odd, investigate custom taints in
underloaded nodes. If some of the custom taints are blocking Pods
from being scheduled - consider adding tolerations or scaling the
MOSK cluster out by adding worker nodes.
If no custom taints exist, add worker nodes.
Delete Pods that can be moved (preferably, multi-node Deployments).
Prometheus scraping of the kube-state-metrics service is unreliable,
resulting in a success rate below 90%. It indicates either a failure
of the kube-state-metrics Pod or (in rare scenarios) network issues
causing scrape requests to time out.
Related alert: KubeDeploymentOutage{deployment=prometheus-kube-state-metrics}
(inhibiting).
Investigation
In the Prometheus web UI, search for firing alerts that relate to
networking issues in the Container Cloud cluster and fix them.
If the cluster network is healthy, refer to the
Investigation section of the
KubePodsCrashLooping alert troubleshooting description
to collect information about the kube-state-metrics Pods.
Mitigation
Based on the investigation results, select from the following options:
Fix the networking issues
Apply solutions from Mitigation
section of the KubePodsCrashLooping alert troubleshooting
description
If the issue still persists, collect the investigation output and
contact Mirantis support for further information.
The Prometheus Blackbox Exporter target that probes the /healthz endpoints
of the Kubernetes API server nodes is not reliably available and
Prometheus metric scrapes fail. It indicates either a
prometheus-blackbox-exporter Pod failure or (in rare cases)
network issues causing scrape requests to time out.
Related alert: KubeDeploymentOutage{deployment=prometheus-kube-blackbox-exporter}
(inhibiting).
Investigation
In the Prometheus web UI, search for firing alerts that relate to
networking issues in the Container Cloud cluster and fix them.
If the cluster network is healthy, refer to the
Investigation section of the
KubePodsCrashLooping alert troubleshooting description
to collect information about prometheus-blackbox-exporter Pods.
Mitigation
Based on the investigation results, select from the following options:
Fix the networking issues
Apply solutions from Mitigation
section of the KubePodsCrashLooping alert troubleshooting
description
If the issue still persists, collect the investigation output and
contact Mirantis support for further information.
The OpenSearch volume has reached the default flood_stage
disk allocation watermark of 95% disk usage. At this stage, all shards
are in read-only mode.
The OpenSearch volume has reached the default value for the high disk
allocation watermark of 90% disk usage. At this point, OpenSearch
attempts to reassign shards to other nodes if these nodes are still
under 90% of used disk space.
Investigation and mitigation
Verify that the user does not create indices that are not managed
by StackLight, which may also cause unexpected storage usage.
StackLight deletes old data only for its managed indices.
If an OpenSearch volume uses shared storage, such as LVP, disk usage
may still exceed expected limits even if rotation works as expected.
In this case, consider the following solutions:
Increase disk space
Delete old indices
Lower retention thresholds for components that use shared storage.
To reduce OpenSearch space usage, consider adjusting the
elasticsearch.persistentVolumeUsableStorageSizeGB parameter.
By default, elasticsearch-curator deletes old logs when
disk usage exceeds 80%. If it fails to delete old logs, inspect
the known issues described in the product Release Notes.
There are no Release Controller replicas scheduled in the cluster.
By default, 3 replicas should be scheduled. The controller was either
deleted or downscaled to 0.
Investigation
Verify the status of the release-controller-release-controller
deployment:
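For example, a sketch assuming the deployment runs in the kaas namespace of the management cluster:
kubectl -n kaas get deployment release-controller-release-controller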
If the Release Controller deployment has been downscaled to 0, set the
replicas back to 3 in the release-controller Helm release in the
.spec.replicas section of the Deployment object on the
management cluster:
The Telemeter client fails to federate data from Prometheus or to send
data to the Telemeter server due to a very long incoming data sample.
The limit-bytes parameter in the StackLight Helm release is too low.
Investigation
Verify whether the logs of telemeter-client contain alerts
similar to msg="unable to forward results" err="the incoming sample data is too long":
kubectl -n stacklight logs telemeter-client-<podID>
Verify the current length limit established by Helm release:
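For example, a sketch that greps the current flag value from the Deployment; the deployment name telemeter-client is an assumption based on the Pod name above:
kubectl -n stacklight get deployment telemeter-client -o yaml | grep -- --limit-bytes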
Add the following parameter to the StackLight Helm release values of
the corresponding Cluster object:
telemetry:
  telemeterClient:
    limitBytes: 4194304
Wait for the telemeter-client-<podID> Pod to be recreated
and the byte limit to be changed from --limit-bytes=1048576 to
--limit-bytes=4194304.
OpenSearchPVCMismatch alert raises due to the OpenSearch PVC size mismatch¶
Caution
The below issue resolution applies since Container Cloud 2.22.0
(Cluster release 11.6.0) to existing clusters with insufficient resources.
Before Container Cloud 2.22.0 (Cluster release 11.6.0), use the workaround
described in the StackLight known issue 27732-1.
New clusters deployed on top of Container Cloud 2.22.0 are not affected.
The OpenSearch elasticsearch.persistentVolumeClaimSize custom setting can
be overwritten by logging.persistentVolumeClaimSize during deployment of a
Container Cloud cluster of any type and is set to the default 30Gi. This
issue raises the OpenSearchPVCMismatch alert. Since
elasticsearch.persistentVolumeClaim is immutable, you cannot update
the value by editing the Cluster object.
Note
This issue does not affect cluster operability if the current volume
capacity is enough for the cluster needs.
To apply the issue resolution, select from the following use cases:
StackLight with an expandable StorageClass for OpenSearch PVCs
Verify that the StorageClass provisioner has enough space to satisfy
the new size:
StackLight with a non-expandable StorageClass for OpenSearch PVCs
If StackLight is operating in HA mode, the local volume provisioner
(LVP) has a non-expandable StorageClass used for OpenSearch PVCs
provisioning. Thus, the affected PV nodes have insufficient disk space.
If StackLight is operating in non-HA mode, the default non-expandable
storage provisioner is used.
Warning
After applying this issue resolution, the existing OpenSearch
data will be lost. If data loss is acceptable, proceed with the steps
below.
Move the existing log data to a new PV if required.
Verify that the provisioner has enough space to satisfy the new size:
OpenSearch cluster deadlock due to the corrupted index¶
Due to instability issues in a cluster, for example, after disaster recovery,
networking issues, or low resources, some OpenSearch master pods may remain
in the PostStartHookError state due to the corrupted .opendistro-ism-config
index.
To verify that the cluster is affected:
The cluster is affected only when both conditions are met:
One or two opensearch-master pods are stuck in the
PostStartHookError state.
This internal index will be recreated on the next PostStartHook
execution of any affected replica.
Wait up to 30 minutes, assuming that during this time at least one
attempt of PostStartHook execution occurs, and verify that the
internal index was recreated:
Wait up to 30 minutes and verify whether the affected pods started
normally.
Before 2.27.0 (Cluster releases 17.2.0 and 16.2.0), verify that the
cluster is not affected by the issue 40020.
If it is affected, proceed to the corresponding workaround.
Failure of shard relocation in the OpenSearch cluster¶
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peak load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished, and only then consider
changing the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
To apply the issue resolution:
Verify the currently configured CPU requests and limits for the
opensearch containers:
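For example, a sketch that prints the configured resources of the opensearch container; the container name and the example output are illustrative assumptions:
kubectl -n stacklight get sts opensearch-master \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="opensearch")].resources}'
# Illustrative output:
# {"limits":{"cpu":"600m","memory":"8Gi"},"requests":{"cpu":"500m","memory":"4Gi"}}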
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize:large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
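For example, a sketch using jsonpath; the app=opensearch-master label selector is an assumption and may differ in your cluster:
kubectl -n stacklight get pods -l app=opensearch-master \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[?(@.name=="opensearch")].resources.limits.cpu}{"\n"}{end}'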
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
StackLight pods get stuck with the ‘NodeAffinity failed’ error¶
On a managed cluster, the StackLight Pods may get stuck with the
Pod predicate NodeAffinity failed error in the Pod status. The issue may
occur if the StackLight node label was added to one machine and
then removed from another one.
The issue does not affect the StackLight services, all required StackLight
Pods migrate successfully except extra Pods that are created and stuck during
Pod migration.
To apply the issue resolution, remove the stuck Pods:
After enabling log forwarding to Splunk as described in
Enable log forwarding to external destinations, you may see no specific errors but logs are not
being sent to Splunk. In this case, debug the issue using the procedure below.
To debug the issue:
Temporarily set the debug logging level for the syslog output plugin:
In the following example output, the error indicates that the specified
Splunk host name cannot be resolved. Therefore, verify and update the
host name accordingly.
Example output
2023-07-25 09:57:29 +0000 [info]: adding match in @splunk_syslog_output-external pattern="**" type="remote_syslog"
<label @splunk_syslog_output-external>
  @id splunk_syslog_output-external
  path "/var/log/fluentd-buffers/splunk_syslog_output-external.system.buffer"
</label>
2023-07-25 09:57:30 +0000 [debug]: [splunk_syslog_output-external] restoring buffer file: path=/var/log/fluentd-buffers/splunk_syslog_output-external.system.buffer/buffer.q6014c3643b68e68c03c6217052e1af55.log
2023-07-25 09:57:30 +0000 [debug]: [splunk_syslog_output-external] restoring buffer file: path=/var/log/fluentd-buffers/splunk_syslog_output-external.system.buffer/buffer.q6014c36877047570ab3b892f6bd5afe8.log
2023-07-25 09:57:30 +0000 [debug]: [splunk_syslog_output-external] restoring buffer file: path=/var/log/fluentd-buffers/splunk_syslog_output-external.system.buffer/buffer.b6014c36d40fcc16ea630fa86c9315638.log
2023-07-25 09:57:30 +0000 [debug]: [splunk_syslog_output-external] buffer started instance=61140 stage_size=17628134 queue_size=5026605
2023-07-25 09:57:30 +0000 [debug]: [splunk_syslog_output-external] flush_thread actually running
2023-07-25 09:57:30 +0000 [debug]: [splunk_syslog_output-external] enqueue_thread actually running
2023-07-25 09:57:33 +0000 [debug]: [splunk_syslog_output-external] taking back chunk for errors. chunk="6014c3643b68e68c03c6217052e1af55"
2023-07-25 09:57:33 +0000 [warn]: [splunk_syslog_output-external] failed to flush the buffer. retry_times=0 next_retry_time=2023-07-25 09:57:35 +0000 chunk="6014c3643b68e68c03c6217052e1af55" error_class=SocketError error="getaddrinfo: Name or service not known"
Mirantis OpenStack for Kubernetes (MOSK) provides a cloud operator
with a declarative interface to describe the desired configuration of
the cloud.
The program modules responsible for life cycle management of
MOSK components
extend the API of the underlying Kubernetes cluster with Custom Resources
(CRs).
These data structures define the services the cloud will provide and
specifics of its behavior.
CRs used in MOSK contain dozens of tunable parameters,
and the number is constantly growing with every new release as new capabilities
get added to the product. Also, each parameter value must be in a specific
format and within a range of valid values.
The purpose of the reference documents below is to provide cloud operators
with an up-to-date and comprehensive definition of the language they need
to use to communicate with MOSK:
This guide provides the usage instructions for a Mirantis OpenStack for Kubernetes
(MOSK) environment and is intended for the
cloud end user to successfully perform the OpenStack lifecycle management.
The section provides instructions on how to verify whether the Masakari service
has been correctly configured by the cloud operator and will recover an
instance from the process and compute node failures.
Depending on the Masakari service configuration,
you may need to mark instances with the HA_Enabled tag.
For more information about service configuration, refer to
Configure high availability with Masakari.
This section explains how to set up the introspective instance monitor to
enhance the reliability of virtual machines in your OpenStack environment.
To configure the introspective instance monitoring:
Verify that the introspective instance monitor is enabled in the
OpenStackDeployment custom resource as described in
Enabling introspective instance monitor.
Create a high availability (HA) segment to group relevant compute hosts that
will participate in monitoring and recovery as described in
Group compute nodes into segments.
Ensure the virtual machine image supports the QEMU Guest Agent for
monitoring by updating the image with the hw_qemu_guest_agent=yes
property:
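For example, a minimal sketch; substitute <image> with the name or ID of your image:
openstack image set --property hw_qemu_guest_agent=yes <image>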
Install the QEMU Guest Agent to enable communication between the host and
guest operating systems and ensure precise monitoring of the instance
health.
Linux-based virtual machines
Install the QEMU Guest Agent using the system package manager.
For example:
sudo apt install qemu-guest-agent
Verify that the agent is up and running:
systemctl status qemu-guest-agent
Windows-based virtual machines
Note
This procedure uses a generic approach to adding drivers
to Windows. The exact steps may vary depending on the Windows
version. Refer to the installation documentation specific to your
Windows version for detailed instructions.
Download the VirtIO driver ISO file (virtio-win.iso).
Install the VirtIO Serial Driver:
Attach virtio-win.iso to your Windows virtual machine.
Log in to the virtual machine.
Open the Windows Device Manager.
Locate PCI Simple Communications Controller
in the device list.
Right-click on it and select Update Driver.
Browse to the mounted ISO directory DRIVE:\vioserial\<OSVERSION>\,
where <OSVERSION> corresponds to the Windows version of your
Windows virtual machine.
Click Next to install the driver.
Reboot the virtual machine to complete the driver installation.
Install the QEMU Guest Agent:
Log in to the virtual machine.
Use the File Explorer to navigate to
the guest-agent directory in the virtio-win CD drive.
Run the installer: qemu-ga-x86_64.msi (64-bit) or
qemu-ga-i386.msi (32-bit).
Verify that the qemu-guest-agent is up and running.
For example, in the PowerShell:
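A minimal sketch, assuming the agent registers under the default service name QEMU-GA used by the upstream installer:
Get-Service QEMU-GA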
Application credentials is a mechanism in the MOSK
Identity service (Keystone) that enables application automation tools, such as
shell scripts, Terraform modules, Python programs, and others, to securely
perform various actions in the cloud API in order to deploy and manage
application components.
Historically, dedicated technical user accounts were created to be used by
application automation tools. The application credentials mechanism has
significant advantages over the legacy approach in terms of the following:
Self-service: Cloud users manage application credential objects
completely on their own, without having to reach out to cloud operators.
Note
Application credentials are owned by the cloud user who created
them, not the project, domain, or system that they have access to.
Non-admin users only have access to application credentials that they
have created themselves.
Security: Cloud users creating application credentials have control
over the actions that automation tools will be allowed to perform on their
behalf by the following:
Specifying the cloud API endpoints the tool may access.
Delegating to the tool just a subset of the owner’s roles.
Restricting the tool from creating new application credential objects or
trusts.
Defining the validity period for a credential.
Simplicity: In case a credential is compromised, the automation tools
using it can be easily switched to a new object.
For security reasons, a cloud user who logs in to the cloud through the
Mirantis Container Cloud IAM or an external identity provider cannot
use the application credentials mechanism by default. To enable the
functionality, contact your cloud operator.
MOSK Object Storage service does not support application
credentials authentication to access S3 API. To authenticate in S3 API, use
the EC2 credentials mechanism.
MOSK Object Storage service has limited support for
application credentials when accessing Swift API. The service does not
accept application credentials with restrictions to allowed API endpoints.
You can create an application credential using OpenStack CLI or Horizon.
To create an application credential using CLI, use the
openstack application credential create command.
If you do not provide the application credential secret, one will be generated
automatically.
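For example, a minimal sketch of the command; the credential name and options are illustrative:
openstack application credential create \
  --description "CI automation" \
  --expiration 2026-01-01T00:00:00 \
  my-cli-credential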
Warning
The application credential secret displays only once upon creation.
It cannot be recovered from the Identity service. Therefore, capture the
secret string from the command output and keep it in a safe place for future
usage.
When creating application credentials, you can limit their capabilities
depending on the security requirements of your deployment:
Define expiration time.
Limit the roles of an application credential to only a subset of roles that
the user creating the credential has.
Pass a list of allowed API paths and actions, also known as access rules, that the
application credential will have access to. For the comprehensive list of
possible options when creating credentials, consult the upstream OpenStack
documentation.
Restrict an application credential from creating another application
credential or a trust.
Note
This is the default behavior, but depending on what the
application credential is used for, you may need to loosen this restriction.
An application credential will be created with access to the scope of your
current session. For example, if your current credential is scoped to a
specific project, domain, or system, the created application credential will
have access to the same scope.
To authenticate in a MOSK cloud using an application
credential, you need to know the ID and secret of the application credential.
When using the human-readable name of an application credential instead of its
ID, you also have to supply the user ID or the user name with the user domain
ID or name. These details are required for the Identity service (Keystone) to
resolve your application credential, since different users may have
application credentials with the same name.
The following example illustrates a snippet from an RC file with required
environment variables using the application credential name:
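A sketch of such a snippet, assuming the standard keystoneauth environment variables; substitute the placeholders with your values:
export OS_AUTH_URL=<keystone_auth_url>
export OS_AUTH_TYPE=v3applicationcredential
export OS_APPLICATION_CREDENTIAL_NAME=<app_cred_name>
export OS_APPLICATION_CREDENTIAL_SECRET=<app_cred_secret>
export OS_USERNAME=<user_name>
export OS_USER_DOMAIN_NAME=<user_domain_name>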
The token is located in the x-subject-token header of the response, and
the response body contains information about the user, scope, effective roles,
and the service catalog.
In case an application credential becomes invalid due to expiry or the
owner leaving the team, or becomes compromised because its secret gets exposed,
Mirantis recommends rotating the credential immediately as follows:
Create a new application credential with the same permissions.
Adjust the automation tooling configuration to use the new object.
Delete the old object. This can be performed by the owner-user or cloud
operator.
Authenticate in OpenStack API as a federated OIDC user¶
This section offers an example workflow of federated user authentication in
OpenStack using an external identity provider and OpenID Connect (OIDC)
protocol. The example illustrates a typical HTTP-based interchange of
authentication data happening underneath between the client software,
identity provider, and MOSK Identity service (OpenStack
Keystone) when a cloud user logs in to a cloud using their corporate or
social ID, depending on cloud configuration.
The instructions below can be handy for cloud operators who want to delve
into how federated authentication operates in OpenStack and troubleshoot any
related issues, as well as advanced cloud users keen on crafting their own
basic automation tools for cloud interactions.
The instructions are provided for educational purposes. Mirantis encourages
the majority of cloud users to rely on existing mature tools and libraries,
such as openstacksdk, keystoneauth, python-openstackclient,
or gophercloud to communicate with OpenStack APIs in a programmable
manner.
Warning
Mirantis advises cloud users not to rely on federated
authentication when managing their cloud resources using command line and
cloud automation tool. Instead, consider using OpenStack built-in
application credentials mechanism to ensure secure and reliable access
to OpenStack APIs of your MOSK cloud.
See Manage application credentials for details.
Verify cloud configuration with the cloud administrator¶
For cloud users to be able to log in to an OpenStack cloud using their
federated identity, the cloud administrator should configure the cloud
to integrate with an external OIDC-compatible identity provider, such as
Mirantis Container Cloud IAM (Keycloak) with all the necessary resources
pre-created in the OpenStack Keystone API.
To authenticate in OpenStack using your federated identity, an
OIDC-compatible identity provider protocol, and the v3oidcpassword
authentication method, get yourself acquainted with the authentication
parameters listed below. You can extract the required information from the
OpenStack RC file that is available for download from OpenStack Dashboard
(Horizon) once you log in to it with your federated identity, or the
clouds.yaml file.
OS_DISCOVERY_ENDPOINT
The URL pointing to the OIDC discovery document served by the identity
provider, usually ending with .well-known/openid-configuration.
OS_CLIENT_ID
The identifier of the OIDC client to use.
OS_CLIENT_SECRET
The secret for the OIDC client. Many OIDC providers require this
parameter. Some providers, including Mirantis Container Cloud IAM
(Keycloak), allow so-called public clients, where secrets are not used
and can be any string.
OS_OPENID_SCOPE
The scope requested when using OIDC authentication. This is at least
openid, but your identity provider may allow returning additional
scopes.
OS_IDENTITY_PROVIDER
The name of the corresponding identity provider object as created in
the Keystone API.
OS_PROTOCOL
The name of the protocol object as created in the Keystone API.
OS_USERNAME
Your user name.
OS_PASSWORD
Your password in the identity provider, such as Mirantis Container Cloud
IAM (Keycloak).
Note
Additionally, to obtain a scoped token, you need information about
the target scope, such as, OS_PROJECT_DOMAIN_NAME and
OS_PROJECT_NAME.
Below is an example of an RC file that sets the environment variables used in
the further code examples for the project scope authentication:
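A sketch of such an RC file; the identity provider, protocol, and endpoint values are illustrative placeholders and must match your cloud configuration:
export OS_AUTH_TYPE=v3oidcpassword
export OS_AUTH_URL=https://<keystone_endpoint>/v3
export OS_DISCOVERY_ENDPOINT=https://<idp_endpoint>/.well-known/openid-configuration
export OS_CLIENT_ID=<oidc_client_id>
export OS_CLIENT_SECRET=<oidc_client_secret>
export OS_OPENID_SCOPE=openid
export OS_IDENTITY_PROVIDER=<identity_provider_name>
export OS_PROTOCOL=<protocol_name>
export OS_USERNAME=<your_user_name>
export OS_PASSWORD=<your_password>
export OS_PROJECT_DOMAIN_NAME=<project_domain_name>
export OS_PROJECT_NAME=<project_name>
export OS_IDENTITY_API_VERSION=3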
Obtain the access token from the identity provider by sending the POST
request to the token endpoint that will return the access token in exchange
to the login credentials:
As per OpenID Connect RFC,
the request to the endpoint must use the Form serialization.
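For example, a sketch of such a request using the resource owner password grant; the token endpoint URL comes from the OIDC discovery document, and jq is assumed to be available:
ACCESS_TOKEN=$(curl -s -X POST "<token_endpoint>" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  --data-urlencode "grant_type=password" \
  --data-urlencode "client_id=${OS_CLIENT_ID}" \
  --data-urlencode "client_secret=${OS_CLIENT_SECRET}" \
  --data-urlencode "username=${OS_USERNAME}" \
  --data-urlencode "password=${OS_PASSWORD}" \
  --data-urlencode "scope=${OS_OPENID_SCOPE}" \
  | jq -r .access_token)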
Now, you can exchange the OIDC access token for an unscoped token from
OpenStack Keystone.
Obtain the unscoped token from OpenStack Keystone¶
The Keystone token is included in the response header. However, the response
body in the JSON format often contains additional data that may prove useful
for certain applications. The following example excludes the body by using
the -I flag:
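A sketch of such a request; the endpoint path follows the standard Keystone OS-FEDERATION API, and the identity provider and protocol names come from the RC file above:
OS_UNSCOPED_TOKEN=$(curl -sI -X POST \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  "${OS_AUTH_URL}/OS-FEDERATION/identity_providers/${OS_IDENTITY_PROVIDER}/protocols/${OS_PROTOCOL}/auth" \
  | grep -i x-subject-token | awk '{print $2}' | tr -d '\r')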
The tr -d '\r' part trims the carriage return characters from the
grep and awk outputs so that the extracted data
can be properly inserted into the JSON authentication request later.
Now, you can generate a scoped token from OpenStack Keystone using the unscoped
token and specifying the project scope.
In case you do not know beforehand the OpenStack authorization scope you want
to log in to, use the unscoped token to get a list of the available scopes:
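For example, a sketch that lists the projects available to the token owner through the standard Keystone API:
curl -s -H "X-Auth-Token: ${OS_UNSCOPED_TOKEN}" \
  "${OS_AUTH_URL}/auth/projects" | jq .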
Replace https://glance.it.just.works with the endpoint of the
MOSK Image service (OpenStack Glance) that you can obtain
from the Access & Security dashboard of OpenStack Horizon.
This section provides the necessary steps to prepare your environment for
working with the MOSK Shared Filesystems service
(OpenStack Manila), create a share network, configure share file systems,
and grant access to clients using various supported share drivers.
Set up Shared Filesystems service with generic driver¶
This section provides the necessary steps to prepare your environment for
working with the MOSK Shared Filesystems service
(OpenStack Manila), create a share network, configure share file systems,
and grant access to clients. By following these instructions, you will
ensure that your system is properly set up to manage and mount shared
resources using Manila with the generic driver.
Set up Shared Filesystems service with CephFS driver¶
This section provides the necessary steps to prepare your environment for
working with the MOSK Shared Filesystems service
(OpenStack Manila), create a share network, configure share file systems,
and grant access to clients. By following these instructions, you will
ensure that your system is properly set up to manage and mount shared
resources using Manila with the CephFS driver.
Create a share file system and grant access to it¶
Deploy your first cloud application using automation¶
This section aims to help you build your first cloud application and onboard
it to a MOSK cloud. It will guide you through the process of
deploying and managing a sample application using automation, and showcase the
powerful capabilities of OpenStack.
The sample application offered by Mirantis is a typical web-based application
consisting of a front end that provides a RESTful API situated behind
the cloud load balancer (OpenStack Octavia) and a backend database that
stores data in the cloud block storage (OpenStack Cinder volumes).
Mirantis RefApp
You can extend the sample application to make use of advanced features offered
by MOSK, for example:
An HTTPS-terminating load balancer that stores its certificate in the Key
Manager service (OpenStack Barbican)
A public endpoint accessible by the domain name with the help of the DNS
service (OpenStack Designate)
The sample application intends to showcase how deployment automation
can enable the DevOps engineers to streamline the process of installing,
updating, and managing their workloads in the cloud providing an efficient
and scalable approach to building and running cloud-based applications.
The sample application offers example templates for the most common tools
that include:
OpenStack Heat, an OpenStack service used to orchestrate composite cloud
applications with a declarative template format through the
OpenStack-native REST API
Terraform, an Infrastructure-as-code tool from HashiCorp, designed to build,
change, and version cloud and on-prem resources using a declarative
configuration language
You can easily customize and extend the templates for similar workloads.
Note
The sample source code and automation templates reside in the
OpenStack RefApp GitHub
repository.
Environment
The sample cloud application deployment has been verified in the following
environment:
OpenStack command-line client v5.8.1
Terraform v1.3.x
OpenStack Yoga
Ubuntu 18.04 LTS (Bionic Beaver) as guest operating system
Navigate to the project where you want to deploy the application.
Use the top-right user menu to download the OpenStack RC File
to your local machine.
Note
As an example, you will be using your own user credential
to deploy the sample application. However, in the future, Mirantis
strongly recommends creating dedicated application credentials
for your workloads.
Verify the default deployment configuration in the top.yaml template.
Modify parameters as required. For example, you may want to change the
image, flavor, or network parameters.
Create the stack using the provided template with the public key generated
above:
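For example, a sketch of the Heat stack creation command; the stack name and the key parameter name are illustrative and may differ in the actual template:
openstack stack create -t top.yaml \
  --parameter cluster_public_key="$(cat ~/.ssh/id_rsa.pub)" \
  refapp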
Run the curl tool against the URL of the application public endpoint
to make sure that all components of the application have been
deployed correctly and it is responding to user requests:
$ curl http://<APP_URL>/
{"host": "host name of API instance that replied", "app": "openstack-refapp"}
The sample application provides a RESTful API, which you can use for advanced
database queries.
Deploy your first cloud application using cloud web UI¶
This section aims to help you build your first cloud application and onboard
it to a MOSK cloud. It will guide you through the process of
deploying a simple application using the cloud web UI (OpenStack Horizon).
The section will also introduce you to the fundamental OpenStack primitives
that are commonly used to create virtual infrastructures for cloud
applications.
The sample application in the context of this tutorial is a typical web-based
application consisting of Wordpress, a popular
web content management system, and a database that stores Wordpress data in the
cloud block storage (OpenStack Cinder volume).
You can extend the sample application to make use of advanced features offered
by MOSK, for example:
Add an HTTPS-terminating load balancer that stores its certificate in the Key
Manager service (OpenStack Barbican)
Make the public endpoint accessible by the domain name with the help of the
DNS service (OpenStack Designate)
You can run the sample application on any OpenStack cloud. It can be a
private MOSK cluster of your company, a public OpenStack
cloud, or even your own tiny TryMOSK instance spun up in an AWS tenant
as described in the Try Mirantis OpenStack for Kubernetes on AWS
article.
To deploy the sample application, you need:
Access to your cloud web UI with the credentials obtained from your cloud
administrator:
The URL of the cloud web UI (OpenStack Horizon)
The login, password, and, optionally, authentication method that
you need to use to log in to the MOSK cloud
The name of the OpenStack project with enough resources available
Connectivity from the MOSK cluster to the Internet
to be able to download the components of the sample application.
If needed, consult with your cloud administrator.
A local machine with the SSH client installed and connectivity to the
cloud public address (floating IP) space.
Environment
The sample cloud application deployment has been verified in the following
environment:
OpenStack Yoga
Ubuntu 18.04 LTS (Bionic Beaver) as guest operating system
Open your favorite web browser and navigate to the URL of the cloud
web UI.
Use the access credentials to log in.
Select the appropriate project from the drop-down menu at the top left.
Create a dedicated private network for your application:
Note
Virtual networks in OpenStack are used for isolated communication
between instances (virtual machines) within the cloud. Instances get
plugged into networks to communicate with any virtual networking
entities, such as virtual routers and load balancers as well as
the outside world.
Navigate to Network > Networks.
Click Create Network. The Create Network dialog
box opens.
In the Network tab, specify a name for the new network.
Select the Enable Admin State and Create Subnet
check boxes.
In the Subnet tab, specify a name for the
subnet and network address, for example, 192.168.1.0/24.
In the Subnet Details tab, keep the preset
configuration.
Click Create.
Create and connect a network router.
Note
A virtual router, just like its physical counterpart, is used
to pass the layer 3 network traffic between two or more networks.
Also, a router performs the network address translation (NAT) for
instances to be able to communicate with the outside world through
the external networks.
To create the network router:
Navigate to Network > Routers.
Click Create Router. The Create Router dialog
box opens.
Specify a meaningful name for the new router and select the external
network it needs to be plugged into. If you do not know which external
network to select, consult your cloud administrator.
Now that the router is up and running, you need to plug it into the
application private network, so it can forward the packets between your
local machine and the instance, which you will create later.
To connect the network router:
Navigate to Network > Routers.
Find the router you have just created and click on its name.
Open the Interfaces tab and click Add Interface.
The Add Interface dialog box opens.
Select the subnetwork that you provided in the first step.
Click Add Interface.
Create an instance:
Note
A virtual machine, or an instance in the OpenStack terminology,
is the machine where your application processes will be effectively
running.
Navigate to Compute > Instances.
Click Launch Instance. The Launch Instance dialog
box opens.
In the Details tab, specify a meaningful name for the new
instance, so that you can easily identify it among others later.
In the Source tab, select the Image boot source.
MOSK comes with a few prebuilt images; for the sample
application, we will use the Ubuntu Bionic Server image.
In the Flavor tab, pick the m1.small size for
the instance, which provides just enough resources for the application
to run.
In the Networks tab, select the previously created private
network to plug the instance into.
In the Security Groups tab, verify that the default security
group is selected and it allows the ingress HTTP, HTTPS, and SSH traffic.
In the Key Pair tab, create or import a new key pair
to be able to log in to the instance securely through the SSH protocol.
Make sure to have a copy of the private key on your local machine to pass
to the SSH client.
Now that all the required settings are in place, click
Launch Instance.
Wait for the new instance to appear in the Active state in
the Instances dashboard.
Attach a volume to the instance.
Note
Volumes in OpenStack provide persistent storage for applications,
allowing the data placed on them to persist independently of
the instances, as opposed to the data written to ephemeral storage,
which gets deleted together with its instance.
To create the volume:
Navigate to Volumes > Volumes.
Click Create Volume.
The Create Volume dialog box opens.
Specify a meaningful name for the new volume.
Leave all fields with the default values. A 1 GiB volume
will be enough for the sample application.
Once the volume is allocated, it shows up in the same Volumes
dashboard. Now, you can attach the new volume to your running instance:
In the Volumes dashboard, select the volume to add to
the instance.
Click Manage Attachments.
The Manage Volume Attachments dialog box opens.
Select the required instance.
Click Attach Volume.
Now, the Attached To column in the Volumes
dashboard will display your volume device name as the volume
attached to your instance. Also, you can view the status of a volume
that can be either Available or In-Use.
Expose the instance outside:
Note
A floating IP address in OpenStack is an abstraction over
a publicly routable IP address that allows an instance to be accessed
from outside the cloud. Floating IPv4 addresses are typically scarce
and expensive resources, so they need to be explicitly allocated
and assigned to selected instances.
Navigate to Compute > Instances.
From the Actions drop-down list next to your instance,
select Associate Floating IP.
The Manage Floating IP Associations dialog box opens.
Allocate a new floating IP address using the + button.
The Port to be associated should already be filled in
with the instance private port.
Click Associate.
The Compute > Instances dashboard will display the instance
floating IP address along with the private one. Write down the floating IP
address as you will need it in the next step.
Access the instance through SSH.
On your local machine, use the SSH client to log in to the instance by its
floating IP address. Ensure that the private key file has the 0600 permissions.
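For example, a minimal sketch, assuming the private key was saved as refapp-key.pem and the guest uses the default ubuntu user of Ubuntu cloud images (both names are assumptions, adjust to your setup):
chmod 600 ~/refapp-key.pem
ssh -i ~/refapp-key.pem ubuntu@<instance-floating-IP-address>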
We will run the application components as Docker containers to simplify
their provisioning and configuration and isolate them from each other.
The common Ubuntu cloud image does not have Docker engine preinstalled, so
you need to install it manually:
sudo apt-get update && sudo apt-get install -y docker.io
Our sample application consists of two Docker containers:
MySQL database server
WordPress instance
First, we create a new Docker network so that both containers can
communicate with each other:
docker network create samplenet
Now, let’s spin up the MySQL database. We will place all its data in
a separate directory on the mounted volume. You can find more information
about the parameters of the MySQL image on its Docker Hub page.
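The commands below are a minimal sketch of running both containers. They assume that the attached volume has been formatted and mounted at /mnt/data and use placeholder passwords; adjust the paths, credentials, and image tags to your needs:
# MySQL database container, storing its data on the mounted volume
sudo docker run -d --name db --network samplenet \
  -v /mnt/data/mysql:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=<DB_ROOT_PASSWORD> \
  -e MYSQL_DATABASE=wordpress \
  -e MYSQL_USER=wordpress \
  -e MYSQL_PASSWORD=<DB_PASSWORD> \
  mysql:5.7
# WordPress container, published on port 80 and pointed at the database container
sudo docker run -d --name wordpress --network samplenet -p 80:80 \
  -e WORDPRESS_DB_HOST=db \
  -e WORDPRESS_DB_NAME=wordpress \
  -e WORDPRESS_DB_USER=wordpress \
  -e WORDPRESS_DB_PASSWORD=<DB_PASSWORD> \
  wordpress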
Use the web browser on your local machine to navigate to the application
endpoint http://<instance-floating-IP-address>. If you have followed
all the steps accurately, your browser should now display the WordPress
Getting Started dialog.
Feel free to provide the necessary parameters and proceed with the
initialization. Once it finishes, you can proceed with building your own
cloud-hosted website and start serving users. Congratulations!
Use Heat to create and manage Tungsten Fabric objects¶
Utilizing OpenStack Heat templates is a common practice to orchestrate Tungsten
Fabric resources. Heat allows for the definition of templates, which can depict
the relationships between resources such as networks, and enforce policies
accordingly. Through these templates, OpenStack REST APIs are
invoked to create the necessary infrastructure in the correct order required to
launch applications.
Managing Tungsten Fabric resources through OpenStack Heat represents
a structured and automated approach compared to using the Tungsten Fabric UI
or API directly. The Heat templates provide a declarative mechanism to define
and manage infrastructure, ensuring repeatability and consistency across
deployments. This contrasts with the manual and potentially error-prone process
of managing resources through the Tungsten Fabric UI and API.
To orchestrate Tungsten Fabric objects through a Heat template:
Define the template with the Tungsten Fabric objects as required.
Note
You can view the full list of Heat resources available in
your environment from either OpenStack Horizon dashboard, the
Project > Orchestration > Resource Types page,
or the OpenStack CLI:
openstack orchestration resource type list
Also, you can obtain the specification of the Tungsten Fabric
configuration API by accessing
http://TF_API_ADDRESS:8082/documentation/contrail_openapi.html
on your environment.
Below is an example template showcasing a Heat topology that illustrates
the creation sequence of the following Tungsten Fabric resources: instance,
port, network, router, and external network.
Example Heat topology with Tungsten Fabric resources
heat_template_version: 2015-04-30
description: HOT template to create a Instance connected to external network
parameters:
  stack_prefix:
    type: string
    description: Prefix name for stack resources.
    default: "net-logical-router"
  project:
    type: string
    description: project for the Server
  public_network_id:
    type: string
  floating_ip_pool:
    type: string
  subnet_ip_prefix:
    type: string
    default: '192.168.96.0'
  subnet_ip_prefix_len:
    type: string
    default: '24'
  server_image:
    type: string
    description: Name of image to use for server.
    default: 'Cirros-6.0'
  availability_zone:
    type: string
    default: 'nova'
resources:
  ipam:
    type: OS::ContrailV2::NetworkIpam
    properties:
      name: { list_join: ['_', [get_param: stack_prefix, "ipam"]] }
      project: { get_param: project }
  private_network:
    type: OS::ContrailV2::VirtualNetwork
    properties:
      name: { list_join: ['_', [get_param: stack_prefix, "network"]] }
      project: { get_param: project }
      network_ipam_refs: [{ get_resource: ipam }]
      network_ipam_refs_data:
        [{ network_ipam_refs_data_ipam_subnets:
             [{ network_ipam_refs_data_ipam_subnets_subnet_name: { list_join: ['_', [get_param: stack_prefix, "subnet"]] },
                network_ipam_refs_data_ipam_subnets_subnet: {
                  network_ipam_refs_data_ipam_subnets_subnet_ip_prefix: '192.168.96.0',
                  network_ipam_refs_data_ipam_subnets_subnet_ip_prefix_len: '24' },
                network_ipam_refs_data_ipam_subnets_allocation_pools: [{
                  network_ipam_refs_data_ipam_subnets_allocation_pools_start: '192.168.96.10',
                  network_ipam_refs_data_ipam_subnets_allocation_pools_end: '192.168.96.100' }],
                network_ipam_refs_data_ipam_subnets_default_gateway: '192.168.96.1',
                network_ipam_refs_data_ipam_subnets_enable_dhcp: 'true' }] }]
  private_network_interface:
    type: OS::ContrailV2::VirtualMachineInterface
    properties:
      name: { list_join: ['_', [get_param: stack_prefix, "interface"]] }
      project: { get_param: project }
      virtual_machine_interface_device_owner: 'network:router_interface'
      virtual_machine_interface_bindings:
        { virtual_machine_interface_bindings_key_value_pair: [{
            virtual_machine_interface_bindings_key_value_pair_key: 'vnic_type',
            virtual_machine_interface_bindings_key_value_pair_value: 'normal' }] }
      virtual_network_refs: [{ get_resource: private_network }]
  instance_ip:
    type: OS::ContrailV2::InstanceIp
    properties:
      name: { list_join: ['_', [get_param: stack_prefix, "instance_ip"]] }
      fq_name: { list_join: ['_', ["fq_name", get_param: stack_prefix]] }
      virtual_network_refs: [{ get_resource: private_network }]
      virtual_machine_interface_refs: [{ get_resource: private_network_interface }]
  router:
    type: OS::ContrailV2::LogicalRouter
    properties:
      name: { list_join: ['_', [get_param: stack_prefix, "router"]] }
      project: { get_param: project }
      virtual_machine_interface_refs: [{ get_resource: private_network_interface }]
      virtual_network_refs: [{ get_param: public_network_id }]
      virtual_network_refs_data:
        [{ virtual_network_refs_data_logical_router_virtual_network_type: 'ExternalGateway' }]
  security_group:
    type: OS::ContrailV2::SecurityGroup
    properties:
      # description: SG with allowed ssh/icmp traffic
      name: { list_join: ['_', [get_param: stack_prefix, "sg"]] }
      project: { get_param: project }
      security_group_entries:
        { security_group_entries_policy_rule: [
          { security_group_entries_policy_rule_direction: '>',
            security_group_entries_policy_rule_protocol: 'any',
            security_group_entries_policy_rule_ethertype: 'IPv4',
            security_group_entries_policy_rule_src_addresses: [{
              security_group_entries_policy_rule_src_addresses_security_group: 'local' }],
            security_group_entries_policy_rule_dst_addresses: [{
              security_group_entries_policy_rule_dst_addresses_subnet: {
                security_group_entries_policy_rule_dst_addresses_subnet_ip_prefix: '0.0.0.0',
                security_group_entries_policy_rule_dst_addresses_subnet_ip_prefix_len: '0' } }] },
          { security_group_entries_policy_rule_direction: '>',
            security_group_entries_policy_rule_protocol: 'any',
            security_group_entries_policy_rule_ethertype: 'IPv6',
            security_group_entries_policy_rule_src_addresses: [{
              security_group_entries_policy_rule_src_addresses_security_group: 'local' }],
            security_group_entries_policy_rule_dst_addresses: [{
              security_group_entries_policy_rule_dst_addresses_subnet: {
                security_group_entries_policy_rule_dst_addresses_subnet_ip_prefix: '::',
                security_group_entries_policy_rule_dst_addresses_subnet_ip_prefix_len: '0' } }] },
          { security_group_entries_policy_rule_direction: '>',
            security_group_entries_policy_rule_protocol: 'icmp',
            security_group_entries_policy_rule_ethertype: 'IPv4',
            security_group_entries_policy_rule_src_addresses: [{
              security_group_entries_policy_rule_src_addresses_subnet: {
                security_group_entries_policy_rule_src_addresses_subnet_ip_prefix: '0.0.0.0',
                security_group_entries_policy_rule_src_addresses_subnet_ip_prefix_len: '0' } }],
            security_group_entries_policy_rule_dst_addresses: [{
              security_group_entries_policy_rule_dst_addresses_security_group: 'local' }] },
          { security_group_entries_policy_rule_direction: '>',
            security_group_entries_policy_rule_protocol: 'tcp',
            security_group_entries_policy_rule_ethertype: 'IPv4',
            security_group_entries_policy_rule_src_addresses: [{
              security_group_entries_policy_rule_src_addresses_subnet: {
                security_group_entries_policy_rule_src_addresses_subnet_ip_prefix: '0.0.0.0',
                security_group_entries_policy_rule_src_addresses_subnet_ip_prefix_len: '0' } }],
            security_group_entries_policy_rule_dst_addresses: [{
              security_group_entries_policy_rule_dst_addresses_security_group: 'local' }],
            security_group_entries_policy_rule_dst_ports: [{
              security_group_entries_policy_rule_dst_ports_start_port: '22',
              security_group_entries_policy_rule_dst_ports_end_port: '22' }] } ] }
  flavor:
    type: OS::Nova::Flavor
    properties:
      disk: 3
      name: { list_join: ['_', [get_param: stack_prefix, "flavor"]] }
      ram: 1024
      vcpus: 2
  server_port:
    type: OS::Neutron::Port
    properties:
      network_id: { get_resource: private_network }
      binding:vnic_type: 'normal'
      security_groups: [{ get_resource: security_group }]
  server:
    type: OS::Nova::Server
    properties:
      name: { list_join: ['_', [get_param: stack_prefix, "server"]] }
      image: { get_param: server_image }
      flavor: { get_resource: flavor }
      availability_zone: { get_param: availability_zone }
      networks:
        - port: { get_resource: server_port }
  server_fip:
    type: OS::ContrailV2::FloatingIp
    properties:
      floating_ip_pool: { get_param: floating_ip_pool }
      virtual_machine_interface_refs: [{ get_resource: server_port }]
outputs:
  server_fip:
    description: Floating IP address of server in public network
    value: { get_attr: [server_fip, floating_ip_address] }
Create an environment file to define values for the parameters
in the template file:
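For example, a minimal environment file for the template above may look as follows; all values are placeholders that must be replaced with identifiers from your cloud:
parameters:
  project: <PROJECT_ID>
  public_network_id: <EXTERNAL_NETWORK_ID>
  floating_ip_pool: <FLOATING_IP_POOL_ID>
You can then create the stack, for example, with openstack stack create -t tf-demo.yaml -e tf-demo-env.yaml tf-demo, where the file and stack names are illustrative.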
Before you start using the S3 API, ensure you have the necessary prerequisites
in place. This includes having access to an OpenStack deployment with the
Object Storage service enabled and authenticated credentials.
Verify the presence of the object-store service within the OpenStack
Identity service catalog. If the service is present, the following command
returns endpoints related to the object-store service:
openstack catalog show object-store
If the object-store service is not present in the OpenStack Identity
service catalog, consult your cloud operator to confirm that the Object
Store service is enabled in the kind:OpenStackDeployment resource
controlling your OpenStack installation. The following element must be
present in the configuration:
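The snippet below is a sketch of what such an element typically looks like in the OpenStackDeployment custom resource; verify the exact structure against your product version before applying it:
spec:
  features:
    services:
      - object-storage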
The S3 API utilizes the AWS authorization protocol, which is not directly
compatible with the OpenStack Identity service (Keystone) by default.
To access the MOSK Object Storage service using the S3
API, you should create EC2 credentials within the OpenStack Identity service:
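For example:
openstack ec2 credentials create
The command output contains the access and secret fields referenced below.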
When accessing the Object Storage service through the S3 API, take note of the
access and secret fields. These values serve as respective equivalents
for the access_key and secret_access_key options, or similarly named
parameters, within the S3-specific tools.
To interact seamlessly with OpenStack Object Storage through the S3 API,
familiarize yourself with essential S3-specific tools, such as
s3cmd, the AWS Command Line Interface (CLI), and Boto3 SDK for
Python.
This section provides concise yet comprehensive configuration examples for
utilizing these S3-specific tools, allowing users to interact with
Amazon S3 and other cloud storage providers that employ the S3 protocol.
S3cmd is a free command-line client
designed for uploading, retrieving, and managing data across various cloud
storage service providers that utilize the S3 protocol, including Amazon S3.
Example of a minimal s3cfg configuration:
[default]
# use 'access' value from "openstack ec2 credentials create"
access_key = a354a74e0fa3434e8039d0425f7a0b59
# use 'secret' value from "openstack ec2 credentials create"
secret_key = d7c2ca9488dd4c8ab3cff2f1aad1c683
# use hostname of the "openstack-store" service, without protocol
host_base = openstack-store.it.just.works
# important, leave empty
host_bucket =
When configured, you can use s3cmd as usual:
s3cmd -c s3cfg ls                                         # list buckets
s3cmd -c s3cfg mb s3://my-bucket                          # create a bucket
s3cmd -c s3cfg put myfile.txt s3://my-bucket              # upload file to bucket
s3cmd -c s3cfg get s3://my-bucket/myfile.txt myfile2.txt  # download file
s3cmd -c s3cfg rm s3://my-bucket/myfile.txt               # delete file from bucket
s3cmd -c s3cfg rb s3://my-bucket                          # delete bucket
The AWS CLI stands as the
official and powerful command-line interface provided by Amazon Web Services
(AWS). It serves as a versatile tool that enables users to interact with
AWS services directly from the command line. Offering a wide range of
functionalities, the AWS CLI facilitates diverse operations, including but
not limited to resource provisioning, configuration management, deployment,
and monitoring across various AWS services.
To start using the AWS CLI:
Set the authorization values as shell variables:
# use "access" field from created ec2 credentialsexportAWS_ACCESS_KEY_ID=a354a74e0fa3434e8039d0425f7a0b59
# use "secret" field from created ec2 credentialsexportAWS_SECRET_ACCESS_KEY=a354a74e0fa3434e8039d0425f7a0b59
Explicitly provide the --endpoint-url set to the endpoint
of the openstack-store service to every aws CLI command:
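For example, reusing the host name from the s3cfg example above (replace it with the endpoint of your own cloud):
aws --endpoint-url https://openstack-store.it.just.works s3api list-buckets
aws --endpoint-url https://openstack-store.it.just.works s3 cp myfile.txt s3://my-bucket/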
Boto3
is the official Python3 SDK (Software Development Kit) specifically designed
for Amazon Web Services (AWS), providing comprehensive support for various
AWS services, including the S3 API for object storage. It offers extensive
functionality and tools for developers to interact programmatically with
AWS services, facilitating tasks such as managing, accessing, and manipulating
data stored in Amazon S3 buckets.
Presuming that you have configured the environment with the same environment
variables as in the example for AWS CLI, you can create an S3 client
in Python as follows:
import boto3, os

# high-level "resource" interface
s3 = boto3.resource("s3", endpoint_url=os.getenv("S3_API_URL"))
for bucket in s3.buckets.all():
    # returns rich objects
    print(bucket.name)

# low-level "client" interface
s3 = boto3.client("s3", endpoint_url=os.getenv("S3_API_URL"))
buckets = s3.list_buckets()  # returns raw JSON-like dictionaries
MOSK enables users to configure and run Windows guests
on OpenStack, which allows for optimization of cloud infrastructure for
diverse workloads. This section delves into the nuances of achieving seamless
integration between the Windows operating system and MOSK
clouds.
Also, you have the option to set up Windows guests in a way that supports
UEFI Secure Boot and includes an emulated virtual Trusted Platform Module
(TPM). This configuration enhances security features for your Windows
virtual machines within the OpenStack environment.
Note
Windows 11 imposes a security system requirement, necessitating
the activation of UEFI Secure Boot and ensuring that TPM version 2.0
is enabled.
Configuration example for the image with Windows 11:
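The following is an illustrative sketch that sets the commonly used UEFI, Secure Boot, and vTPM image properties through the openstack CLI; the image name is a placeholder, and your deployment may require a different set of properties:
openstack image set <WINDOWS11_IMAGE> \
  --property hw_firmware_type=uefi \
  --property os_secure_boot=required \
  --property hw_tpm_version=2.0 \
  --property hw_tpm_model=tpm-crb \
  --property hw_machine_type=q35 \
  --property os_type=windows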
You can configure the UEFI Secure Boot support through flavor extra specs or
image metadata properties. For x86_64 hosts, enabling secure boot also
necessitates configuring the use of the Q35 machine type.
MOSK enables you to configure this on a per-guest basis
using the hw_machine_type image metadata property.
Configuration example for the image that meets both requirements:
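For example, an illustrative image configuration enabling Secure Boot together with the Q35 machine type (the image name is a placeholder):
openstack image set <IMAGE> \
  --property hw_firmware_type=uefi \
  --property os_secure_boot=required \
  --property hw_machine_type=q35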
A vTPM can be requested for a server through either flavor extra specs or
image metadata properties. There are two supported TPM versions: 1.2 and 2.0,
along with two models: TPM Interface Specification (TIS) and Command-Response
Buffer (CRB). Notably, the CRB model is only supported with version 2.0.
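For example, a sketch of requesting a TPM 2.0 device with the CRB model through flavor extra specs (the flavor name is a placeholder):
openstack flavor set <FLAVOR> \
  --property hw:tpm_version=2.0 \
  --property hw:tpm_model=tpm-crb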
The Dynamic Resource Balancer (DRB) service automatically moves OpenStack
instances around to achieve more optimal resource usage in a
MOSK cluster.
Consult your cloud administrator to determine if this service is enabled
in your cloud and which mode is used. The DRB service relies on the OpenStack
live migration mechanism to ensure that instances can be seamlessly moved
to another hypervisor of the service's choice.
Note
The live migration mode supports local block storage. The live
migration mechanism automatically determines whether Nova should
migrate using a local block storage or a shared storage.
Depending on the nature of your workload and configuration of the
MOSK cluster, you can explicitly configure the
DRB service in two ways:
Opt out of automatic instance migration if your instances are sensitive
to live migration. For example, if the instances rely on special local
resources such as SR-IOV-based virtual NICs or cannot tolerate CPU
throttling.
Opt in to have your instance placement optimized at any time. If the
DRB service is configured not to move instances by default,
this allows your applications to be relocated away from noisy neighbors
that consume excessive shared resources on a hypervisor.
If the DRB service in your MOSK cluster is configured to
auto-migrate all instances by default, you, as the owner of the instance,
can opt out of such automated migrations.
To achieve this, tag your instances with lcm.mirantis.com:no-drb.
For example, using the openstack CLI:
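# Illustrative only: replace <SERVER_NAME_OR_ID> with the name or ID of your instance
openstack server set --tag lcm.mirantis.com:no-drb <SERVER_NAME_OR_ID>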
Successful execution of the command above produces no output. To revert to
the default behavior, use the openstack server unset command in
a similar way.
Note
OpenStack instance tags are distinct from metadata (key-value pairs).
Therefore, use instance tagging explicitly for this purpose.
If the DRB service in your MOSK cloud is configured to
move only specific instances, in order for the placement of your instances
to get automatically optimized, you need to explicitly tag each instance with
lcm.mirantis.com:drb. For example, using the openstack CLI:
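# Illustrative only: replace <SERVER_NAME_OR_ID> with the name or ID of your instance
openstack server set --tag lcm.mirantis.com:drb <SERVER_NAME_OR_ID>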
Successful execution of the command above produces no output. To revert
to the default behavior, use the openstack server unset command in
a similar way.
Note
OpenStack instance tags are distinct from metadata (key-value pairs).
Therefore, use instance tagging explicitly for this purpose.
MOSK provides the capability to perform instance migrations
for the non-administrative users of the OpenStack cloud.
Consult your cloud administrator to ensure this functionality is available
to you. If it is, you may have access to cold migration, live migration,
or both. Refer to Instance migration to learn more about these migration
types in MOSK.
If migration is available to you as a non-administrative user, it is,
by default, a completely scheduler-controlled type of migration. As a user,
you do not have the option to select the target host for your instance.
Instead, the Compute service scheduler automatically selects the best-suited
target host, if one is available.
To perform migrations, you can use any preferred method, including direct
API interactions, CLI tools (the openstack client), or
the OpenStack Dashboard service.
This tutorial provides step-by-step instructions on how to use the Neutron
Trunk extension in your project infrastructure. By following this guide,
you will learn how to configure trunk ports in OpenStack Neutron, enabling
efficient network segmentation and traffic management.
The Neutron Trunk extension allows a single virtual machine (VM) to connect
to multiple networks using a single port. This is achieved by designating
one port as the parent port, which handles untagged IP packets, while
additional subports receive tagged packets through the IEEE 802.1Q VLAN
protocol.
The Neutron Trunk extension is enabled by default.
We create trunk_subport using the same MAC address as
its parent trunk_port. Neutron developers recommend this approach
to avoid issues with ARP spoof protection and the native OVS firewall driver.
By following this tutorial, you have successfully configured a trunk port in
OpenStack Neutron. VM3 can now communicate with both net_A and net_B
through a single interface using VLAN segmentation. This setup enables
efficient network management and reduces the number of required ports,
simplifying your infrastructure.
For further customization, refer to the official OpenStack Neutron
documentation on trunk port configurations.
MOSK enables cloud users to mark their instances for LCM to
handle them individually during host maintenance operations, such as host
reboots or data plane restarts. This can be useful if some instances in your
environment are sensitive to live migration, allowing you, as a cloud user,
to communicate your requirements effectively to the cloud operator and
streamline cluster maintenance.
To mark the instances that require individual handling during host
maintenance, assign the
openstack.lcm.mirantis.com:maintenance_action=<ACTION-TAG>
tag to them using the Nova API:
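# Illustrative only: replace the placeholders with an actual action tag and instance name or ID
openstack server set --tag openstack.lcm.mirantis.com:maintenance_action=<ACTION-TAG> <SERVER_NAME_OR_ID>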
Below is the table that describes the supported tag values.
Maintenance action tags for instance migration handling¶
Tag value
Description
poweroff
The instance can be gracefully powered off during a host reboot.
This is equivalent to the skip mode in
openstack.lcm.mirantis.com/instance_migration_mode for instance
migration configuration for hosts.
live_migrate
The instance can be live-migrated during maintenance.
This is equivalent to the live mode in
openstack.lcm.mirantis.com/instance_migration_mode for instance
migration configuration for hosts.
notify
The user must be explicitly notified about planned host maintenance.
This is equivalent to the manual mode in
openstack.lcm.mirantis.com/instance_migration_mode for instance
migration configuration for hosts.
This guide provides recommendations on how to effectively use product
capabilities to harden the security of a Mirantis OpenStack for Kubernetes
(MOSK) deployment.
Note
The guide is under development and will be updated
with new sections in future releases of the product documentation.
MOSK services can emit notifications in the Cloud Auditing
Data Federation (CADF) format, which is
a standardized format for event data. The information contained in such
notifications describes every action users perform in the cloud and is
commonly used by organizations to perform security audits and intrusion
detection.
Currently, the following MOSK services support the emission
of CADF notifications:
Compute service (OpenStack Nova)
Block Storage service (OpenStack Cinder)
Images service (OpenStack Glance)
Networking service (OpenStack Neutron)
Orchestration service (OpenStack Heat)
DNS service (OpenStack Designate)
Bare Metal service (OpenStack Ironic)
Load Balancing service (OpenStack Octavia)
CADF notifications are enabled in the features:logging:cadf section of
the OpenStackDeployment custom resource. For example:
spec:
  features:
    logging:
      cadf:
        enabled: true
The way the notification messages get delivered to the consumers is
controlled by the notification driver setting. The following options are
supported:
messagingv2 - Default
Messages get posted to the notifications.info queue in
the MOSK message bus, which is RabbitMQ
log
Messages get posted to a standard log output and then collected
by Mirantis StackLight
MOSK generates all credentials used internally, including
two types of credentials generated during the OpenStack deployment:
Credentials for admin users provide unlimited access and enable the
initial configuration of cloud entities. Three sets of such credentials
are generated for accessing the following services:
OpenStack database
OpenStack APIs (OpenStack admin identity account)
OpenStack messaging
Credentials for OpenStack service users are generated for each deployed
OpenStack service. To operate successfully, OpenStack services require
three sets of credentials for accessing the following services:
OpenStack database
OpenStack APIs (OpenStack service identity account)
OpenStack messaging
To enhance the information security level, Mirantis recommends changing
the passwords of internally used credentials periodically. We suggest
changing the credentials every month. MOSK
includes an automated routine for changing credentials, which must
be triggered manually.
Restarting OpenStack services is necessary to apply new credentials.
Therefore, it is crucial to have a smooth transition period to minimize
the downtime for the OpenStack control plane. To achieve this, perform
the credential rotation as described in Rotate OpenStack credentials.
Ceph monitors use their node host networks to interact with Ceph daemons.
Ceph daemons communicate with each other over a specified cluster network
and provide endpoints over the public network.
The messenger V2 (msgr2) or earlier V1 (msgr) protocols are used for
communication between Ceph daemons.
Ceph daemon
Network
Protocol
Port
Description
Consumers
Manager (mgr)
Cluster network
msgr/msgr2
6800,
9283
Listens on the first available port of the 6800-7300 range.
Uses 9283 port for exporting metrics.
csi-rbdplugin,
csi-rbdprovisioner,
rook-ceph-mon
Metadata server (mds)
Cluster network
msgr/msgr2
6800
Listens on the first available port of the 6800-7300 range
csi-cephfsplugin,
csi-cephfsprovisioner
Monitor (mon)
LCM host network
msgr/msgr2
msgr:6789,
msgr2:3300
Monitor has separate ports for msgr and msgr2
Ceph clients
rook-ceph-osd,
rook-ceph-rgw
Ceph OSD (osd)
Cluster network
msgr/msgr2
6800-7300
Binds to the first available port from the 6800-7300 range
Ceph Controller uses the NetworkPolicy objects for each Ceph daemon.
Each NetworkPolicy is applied to a pod with defined labels in the
rook-ceph namespace. It only allows the use of the ports specified in the
NetworkPolicy spec. Any other port is prohibited.
Ceph daemon
Pod label
Allowed ports
Manager (mgr)
app=rook-ceph-mgr
6800-7300,
9283
Monitor (mon)
app=rook-ceph-mon
3300,
6789
Ceph OSD (osd)
app=rook-ceph-osd
6800-7300
Metadata server (mds)
app=rook-ceph-mds
6800-7300
Ceph Object Storage (rgw)
app=rook-ceph-rgw
Value from spec.cephClusterSpec.objectStorage.rgw.gateway.port,
Value from spec.cephClusterSpec.objectStorage.rgw.gateway.securePort
Communications between Mirantis OpenStack for Kubernetes (MOSK)
components are provided by the Calico networking. All internal communications
occur through the Calico tunnel through the VXLAN or WireGuard protocols.
Note
Since Container Cloud 2.29.0 (Cluster releases 17.4.0 and 16.4.0),
WireGuard is deprecated. If you still require the feature, contact
Mirantis support for further information.
Caution
These ports are only used for in-cluster communications. Open them
only to a trusted network and never at a perimeter firewall.
Component
Protocol
Port
Description
Calico VXLAN
UDP
4792
Calico networking with VXLAN enabled
Calico WireGuard
UDP
51820
Calico networking with IPv4 WireGuard enabled
In-cluster communications between MetalLB speaker components are done using
the LCM network. MetalLB components also provide metrics to be collected
by StackLight.
Caution
These ports are only used for in-cluster communications. Open
them only to a trusted network and never at a perimeter firewall.
OpenStack provides operators with fine-grained control over access
to API endpoints and actions through access policies. These policies
allow cloud administrators to restrict or grant access based on roles
and the current request context, such as the project, domain, or system.
OpenStack services come with a set of default policy rules that are
generally sufficient for most users. However, for specific use cases,
these policies may need to be modified.
MOSK enables you to define custom policies through
the OpenStackDeployment custom resource. For configuration details,
refer to features:policies.
With the legacy default policies, only the admin role has a dedicated
meaning. Granting this role to a user in any context provides global
administrative access to the service APIs.
Any other role within the project grants the user standard access, enabling
them to create resources as well.
The new default policies are based on the enhanced capabilities of updated
OpenStack Keystone. They incorporate the hierarchical default roles, such
as reader, member, and admin, as well as system scope.
If a policy rule is explicitly defined by the deployment or by the cloud
operator through the OpenStackDeployment custom resource, only that
rule is enforced.
If no explicit API access rule is set, MOSK applies both
the legacy and new policy sets simultaneously. Each API access request is
checked by both sets, and access is granted if either of the policy sets
allows it. This behavior is controlled by the
[oslo_policy] enforce_new_defaults configuration option, which is set
individually for each OpenStack service. Setting this option to True
ensures that API access to this service is evaluated only against
the new default policies.
Caution
Mirantis does not recommend enforcing the new default policies.
Our test results indicate that these policies are not yet consistently
reliable across all services. Additionally, as of the OpenStack Antelope
release, the new default policies have not undergone extensive testing
in the upstream development.
Enforcing or using the new default policies may lead to unexpected
consequences potentially affecting LCM operations such as running
Tempest tests, performing automatic live migrations during node
maintenance, and so on.
MOSK deploys OpenStack with certain upstream policies
customized and additional fine-grained policies that are not present in
upstream. The following list provides details on these policies.
A user with this role can decode any Barbican secret in any project.
This role is specifically granted to the service user performing
automatic instance live migrations during node maintenance. Granting this role to the service user
enables them to live migrate instances that use encrypted volumes.
By default, upstream policies restrict secret decryption to either
the user who created the secret or the administrator of the corresponding
project.
A user that created an order can also delete that order¶
Available since MOSK 23.1
A user can automatically clean up orders, preventing them from accumulating
and causing the Barbican database to grow uncontrollably.
By default, upstream policies allow a user with the creator role to create
orders. However, they restrict order deletion to the project administrator.
By default, upstream policies restrict live migration to administrative users
only, without the ability to distinguish between different types of live
migration.
A cloud operator can define flexible rules to control assignment and removal
of specific server tags to and from OpenStack instances. These rules allow
the operator to restrict tag assignment and removal based on their value.
Per-tag server tag policies include the following:
os_compute_api:os-server-tags:update:{tag_name}
Restricts access to the APIs for creating instances, adding tags, and
replacing tags
os_compute_api:os-server-tags:delete:{tag_name}
Restricts access to the APIs for deleting a single tag, deleting all tags,
and replacing existing tags
For example, to ensure that only administrators can exclude specific
instances from migration by the DRB service,
the operator can configure the following policies through the
OpenStackDeployment custom resource:
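The snippet below is an illustrative sketch of such a configuration; the exact structure under features:policies and the rule expressions may differ in your product version, so treat the keys and values as assumptions to verify against the reference documentation:
spec:
  features:
    policies:
      nova:
        "os_compute_api:os-server-tags:update:lcm.mirantis.com:no-drb": "rule:context_is_admin"
        "os_compute_api:os-server-tags:delete:lcm.mirantis.com:no-drb": "rule:context_is_admin"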
MOSK offers various mechanisms to ensure data integrity
and confidentiality. This section provides an overview of the data protection
capabilities available in MOSK.
This section provides an overview of the data protection capabilities
available in MOSK, focusing primarily on data
encryption. You will gain insights into different data encryption
features of MOSK, understand the type of data they
protect, where encryption occurs concerning cloud boundaries, and
whether these mechanisms are available by default or require explicit
enablement by the cloud operator or cloud user.
Live migration enables the seamless movement of a running instance to another
node within the cluster, ensuring uninterrupted access to the virtual workload.
In MOSK, the native TLS encryption feature is available
for QEMU and libvirt, securing all data transports, including disks not on
shared storage. Additionally, the libvirt daemon exclusively listens to
TLS connections.
To establish a TLS environment, encompassing CA, server, and client
certificates, the relevant compute nodes automatically generate these
components. By default, these certificates are encrypted with a 2048-bit
RSA private key and are valid for 3650 days.
You can easily enable live migration over TLS by configuring the
features:nova:libvirt:tls parameter in the OpenStackDeployment custom
resource. For reference, see Configuring live migration.
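An illustrative sketch of the corresponding OpenStackDeployment fragment; verify the exact parameter path against the referenced configuration procedure:
spec:
  features:
    nova:
      libvirt:
        tls:
          enabled: true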
Caution
Instances started before enabling secure live migration will not
support live migration.
The issue arises due to the SSL certificates for live migration with QEMU
native TLS being generated during the service update. Thus, these
certificates do not exist in the libvirt container when existing instances
were started. Consequently, QEMU processes of those instances lack
the required SSL certificate information, leading to migration failures
with an internal error:
internal error: unable to execute QEMU command ‘object-add’: Unable to access credentials /etc/pki/qemu/ca-cert.pem: No such file or directory
As a workaround, stop and then start the instances that failed to live
migrate. This process will create new QEMU processes within the libvirt
container, ensuring the availability of TLS certificate details.
In a cloud infrastructure, the components comprising the cloud control plane
exchange messages that may contain sensitive information, such as cloud
configuration details, application and cloud user credentials, and other
essential data that an attacker can use to hijack the cloud.
Encrypting the control plane traffic is crucial for data confidentiality
and overall security of the cloud.
MOSK offers the ability to encrypt its control plane
communication by means of encapsulating the in-cluster traffic of the
underlying Kubernetes into a WireGuard mesh network built across its nodes.
Note
Since Container Cloud 2.29.0 (Cluster releases 17.4.0 and 16.4.0),
WireGuard is deprecated. If you still require the feature, contact
Mirantis support for further information.
When an attacker is able to intercept the traffic between the nodes of
a MOSK cluster but does not have access to the nodes
themselves, WireGuard ensures the following:
Data confidentiality
Any intercepted traffic remains unreadable, especially the traffic of those
components of the MOSK control plane that do not enable
SSL/TLS encryption on the application level and rather rely on the
underlying networking layer.
Data integrity
Alterations in traffic are detectable, ensuring that no tampering has
occurred during transit.
Authentication
Only machines with valid cryptographic credentials can join the network
and exchange data.
The following control plane components can have their communications protected
with the WireGuard encryption:
OpenStack database (MariaDB)
OpenStack message bus (RabbitMQ)
OpenStack internal API
OpenStack services interacting with auxiliary components, such as memcached,
RedisDB, and PowerDNS
Interaction between StackLight internal components, including collection
of metrics from OpenStack, Ceph, and other subsystems
Tungsten Fabric auxiliary components that include ZooKeeper, Kafka, Cassandra
database, Redis database, and RabbitMQ
Communications not protected by WireGuard encryption¶
All components of the cloud control plane that require explicit firewall rule
configuration, as per the MOSK firewall configuration guide, utilize the
Kubernetes host network mode for their pods and, therefore, cannot be
protected by WireGuard.
By default, the WireGuard encryption of the control plane
communications is not enabled in MOSK. However, it is
possible to enable the encryption upon initial deployment or later.
When enabling WireGuard, make sure to configure the Calico MTU size correctly.
It must be at least 60 bytes smaller than the interface MTU size of the
workload network.
WireGuard uses public-private ECDH key pairs for secure handshake between
the nodes of the cluster. Each node obtains its unique pair, with the
public key shared across other nodes. A key pair persists indefinitely
unless the node is reprovisioned and re-added to the cluster.
The handshake procedure establishes symmetric keys used for traffic
encryption and automatically re-occurs every few minutes to ensure
data security.
While WireGuard is designed for efficiency, enabling encryption introduces
some overhead.
Caution
The impact can vary depending on the cloud scale and usage
profile.
You may experience the following:
A slight increase in CPU utilization on the MOSK
cluster nodes.
Less than 30% loss of network throughput, which, given the cluster is
designed according to Mirantis recommendations, does not impact control
plane communications of an average cloud.
This section provides insights into the standards and regulatory requirements
that MOSK adheres to, ensuring a secure and compliant
environment that you can trust.
Federal Information Processing Standard Publication (FIPS) outlines security
requirements for cryptographic modules used by the US government and
its contractors to protect sensitive and valuable information. It categorizes
the level of security provided by these modules, ranging from level 1 to level
4, with each level having progressively stringent security measures.
The FIPS mode within OpenStack verifies that its cryptographic algorithms and
modules strictly conform to approved standards. This is crucial for several
reasons:
Regulatory compliance
Many government agencies and industries dealing with sensitive data,
such as finance and healthcare, require FIPS-140 compliance as a regulatory
mandate. Ensuring compliance enables organizations to operate within legal
boundaries and meet industry standards.
Data security
FIPS-140 compliance ensures a higher level of security for cryptographic
functions, protecting sensitive information from unauthorized access and
manipulation. FIPS-compliant environments have a high level of security
for data encryption, digital signatures, and the integrity of communication
channels.
Interoperability
FIPS-140 compliance can enhance interoperability by ensuring that systems
and cryptographic modules across different platforms or vendors meet
a standard set of security requirements. This is essential, especially
in multi-cloud or interconnected environments.
MOSK ensures that the user-to-cloud communications are
always protected in compliance with FIPS 140-2. The capability is implemented
as an SSL/TLS proxy injected into the MOSK underlying Kubernetes
ingress networking that performs the data encryption using a FIPS-validated
cryptographic module.
Container Cloud uses policy-controller for signature validation of
pod images. It verifies that images used by the Container Cloud and
Mirantis OpenStack for Kubernetes (MOSK) controllers are signed by a
trusted authority. The policy-controller inspects defined image policies
that list image registries and authorities for signature validation.
The policy-controller validates only pods with image references from
the Container Cloud content delivery network (CDN). Other registries are
ignored by the controller.
The policy-controller supports two modes of image policy validation for
Container Cloud and MOSK images:
warn
Default. Allows controllers to use untrusted images, but a warning message
is logged in the policy-controller logs and sent as an admission
response.
enforce
Experimental. Blocks pod creation and update operations if a pod image
does not have a valid Mirantis signature. If a pod creation or update is
blocked in the enforce mode, send the untrusted artifact to
Mirantis support for further
inspection. To unblock pod operations, switch to the warn mode.
Warning
The enforce mode is still under development and is available
as an experimental option. Mirantis does not recommend enabling this option
for production deployments. The full support for this option will be
announced separately in one of the following Container Cloud releases.
In case of unstable connections from the policy-controller to Container
Cloud CDN that disrupt pod creation and update operations, you can disable
the controller by setting enabled: false in the configuration.
The policy-controller configuration is located in the Cluster object:
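A heavily simplified sketch of disabling the controller, assuming it is configured as a Helm release entry in the Cluster object; the exact location of the policy-controller values in your Cluster object may differ:
spec:
  providerSpec:
    value:
      helmReleases:
        - name: policy-controller
          values:
            enabled: false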
This section provides answers to common questions about
MOSK, and it is designed to help you quickly find
the information you need. We have included answers to the most common
questions and uncertainties that our users encounter, along with helpful
tips and references to step-by-step instructions where required.
The questions with answers in this section are organized by topic.
If you cannot find the information you are looking for in this section,
search in the whole documentation set. Also, do not hesitate to contact us
through the Feedback button. We are always available to answer
your questions and provide you with the assistance you need to use our product
effectively.
What is the difference between patch and major release versions?¶
Both major and patch release versions incorporate solutions for security
vulnerabilities and known product issues. The primary distinction between
these two release types lies in the fact that major release versions
introduce new functionalities, whereas patch release versions predominantly
offer minor product enhancements.
Patch releases strive to considerably reduce the timeframe for delivering
CVE resolutions in images to your deployments, aiding in the mitigation
of cyber threats and data breaches.
Content
Major release
Patch release
Version update and upgrade of the major product components including
but not limited to OpenStack, Tungsten Fabric, Kubernetes, Ceph, and
StackLight
Container runtime changes including Mirantis Container Runtime and
containerd updates
Changes in public API
Changes in the Container Cloud and MOSK lifecycle management including
but not limited to machines, clusters, Ceph OSDs
Host machine changes including host operating system and kernel updates
Patch version bumps of MKE and Kubernetes
Fixes for Common Vulnerabilities and Exposures (CVE) in images
A product release series is a series of consecutive releases that starts with
a major release and includes a number of patch releases built on top of
the major release.
For example, the 23.1 series includes the 23.1 major release and 23.1.1,
23.1.2, 23.1.3, and 23.1.4 patch releases.
What is the difference between the new and old update schemes¶
Apply patch updates only if you want to receive security fixes as soon as they
become available and you are prepared to update your cluster often,
approximately once in three weeks.
Otherwise, you can skip patch releases and update only between major releases.
Each subsequent major release includes patch release updates of the previous
major release.
When planning the update path for your cluster, take into account the release
support status included in Release Compatibility Matrix.
Can I skip patch releases within a single series since 24.1 series?¶
Yes.
Can I skip patch releases within a single series before 24.1 series?¶
Yes.
Additionally, before the MOSK 24.1 series, it is technically
not possible to update to any intermediate release version if the newer patch
version has been released. You can update only to the latest available patch
version in the series, which contains the updates from all the preceding
versions. For example, if your cluster is running MOSK
23.1 and the latest available patch version is MOSK 23.1.2,
you must update to 23.1.2 receiving the product updates from 23.1.1 and 23.1.2
at one go.
Moreover, if between the two major releases you apply at
least one patch version belonging to the N series, you have to obtain
the last patch release in the series to be able to update to the N+1
major release version.
How do I update to a patch version within the same series?¶
When updating between the patches of the same series, follow the
Update to a patch version procedure.
When updating between the series, follow the Cluster update procedure.
How do I update to the next major version
if I started receiving patches of the previous series?¶
Caution
This answer applies only to MOSK 24.1 series.
Starting from MOSK 24.1.5, Mirantis introduces a new
update scheme allowing for the update path flexibility.
Firstly, if you started receiving patch updates from the previous release
series, update your cluster to the latest patch release in that series as
described in the Update to a patch version procedure.
After, follow the Cluster update procedure to update from the latest
patch release in the series to the next major release. It is technically
impossible to receive a major release while on any patch release in the
previous series other than the last one.
This document provides a high-level overview of new features, known issues,
and bug fixes included in the latest MOSK release.
It also includes lists of release artifacts and fixed Common Vulnerabilities
and Exposures (CVEs). The content is intended to help product users, operators,
and administrators stay informed about key changes and improvements to the
platform.
In addition to release highlights, the document includes update notes for
each MOSK release. These notes support a smooth and
informed cluster update process by outlining the update impact, critical
pre- and post-update steps, and other relevant information.
Introduced IP address capacity monitoring, enabling cloud operators to
better manage routable IP addresses. By providing insights into capacity usage,
this monitoring capability helps predict future cloud needs, prevent service
disruptions, and optimize the allocation of external IP address pools.
Synchronization of local MariaDB backups with remote S3 storage¶
TechPreview
Implemented the capability to synchronize local MariaDB backups with a remote
S3 storage ensuring data safety through secure authentication and server-side
encryption for stored archives.
Implemented support for introspective instance monitor in the Instance High
Availability (HA) service to improve the reliability and availability of
OpenStack environments by continuously monitoring virtual machines
for critical failure events. These include operating system crashes,
kernel panics, unresponsive states, and so on.
Restricting tag assignments on OpenStack instances¶
Implemented the capability that enables cloud operators to define flexible
rules to control assignment and removal of specific tags to and from
OpenStack instances. The per-tag server tag policies allow the operator
to restrict tag assignment and removal based on tag values.
Implemented the capability that enables the cloud users to mark instances
that should be handled individually during host maintenance operations,
such as host reboots or data plane restarts. This provides greater
flexibility during cluster updates, especially for workloads that are
sensitive to live migration.
To mark the instances that require individual handling during host
maintenance, one of the following values for the
openstack.lcm.mirantis.com:maintenance_action=<ACTION-TAG> server tag
can be used: poweroff, live_migrate, or notify.
Enabled cloud operators to configure Message of the Day (MOTD) in the
MOSK Dashboard (OpenStack Horizon). This feature allows
cloud operators to communicate critical information, such as infrastructure
issues, scheduled maintenance, and other important events, directly to users.
Added the capability for cloud users to specify the type of the volume to be
created when launching instances using Image (with
Create New Volume selected) as a boot source through the
MOSK Dashboard (OpenStack Horizon). The default selection
is the default volume type as returned by the Cinder API.
This enhancement provides greater control and an improved user experience
for instance configuration through the web UI.
Enhanced cloud security by providing the capability to enable encryption
of OpenStack database backups, both local and remote, using the OpenSSL
aes-256-cbc encryption through the OpenStackDeployment custom
resource.
The OpenStack Controller, which is the central component of
MOSK and is responsible for the life cycle management of
OpenStack services running in Kubernetes containers, has been open-sourced
under the new name Rockoon and will be maintained as an independent
open-source project going forward.
As part of this transition, all openstack-controller pods are now
named rockoon across the MOSK documentation and
deployments. This change does not affect functionality, but users should
update any references to the previous pod names accordingly.
Introduced automatic Cassandra database repairs for Tungsten Fabric through
the tf-dbrepair-job CronJob. This enhancement allows users to enable
scheduled repairs, ensuring the health and consistency of their Cassandra
clusters with minimal manual intervention.
Reworked the following agent-related and service-related alerts from the
cluster-wide to the host-wide scope, including the corresponding changes in the
inhibition rules:
CinderServiceDown
NeutronAgentDown
NovaServiceDown
This enhancement allows the operator to better operate environments on a large
scale.
Reworked monitoring of RabbitMQ by implementing the following changes:
Switched from the obsolete prometheus-rabbitmq-exporter job to the
rabbitmq-prometheus-plugin one, which is based on the native
RabbitMQ Prometheus plugin, ensuring reliable and direct metric collection.
Introduced the RabbitMQ Overview Grafana dashboard and reworked
all alert rules to utilize metrics from the RabbitMQ Prometheus plugin. This
dashboard replaces the deprecated RabbitMQ dashboard, which will
be removed in one of the following releases.
Introduced the RabbitMQ Erlang Grafana dashboard to further
enhance RabbitMQ monitoring capabilities.
Reworked RabbitMQ alerts:
Added the RabbitMQTargetDown alert.
Renamed RabbitMQNetworkPartitionsDetected to
RabbitMQUnreachablePeersDetected.
Deprecated RabbitMQDown and RabbitMQExporterTargetDown. They will
be removed in one of the following releases.
Warning
If you use deprecated RabbitMQ metrics in customizations such as
alerts and dashboards, switch to the new metrics and dashboards within the
course of the MOSK 25.1 series to prevent issues once the
deprecated metrics and dashboard are removed.
Hiding sensitive ingress data of Ceph public endpoints¶
Introduced the ability to securely store ingress Transport Layer Security (TLS)
certificates for Ceph Object Gateway public endpoints in a secret object. This
feature leverages the tlsSecretRefName field in the Ceph cluster spec,
enhancing security by preventing the exposure of sensitive data associated with
Ceph public endpoints.
On existing clusters, Mirantis recommends updating the Ceph cluster spec
by replacing fields containing TLS certificates with tlsSecretRefName as
described in Hide sensitive ingress data for Ceph public endpoints.
Note
Since MOSK 25.1, the ingress field of the
Ceph cluster spec is automatically replaced with the ingressConfig
field.
Added support for Rook 1.14.10 along with support for Ceph CSI v3.11.0.
The updated Rook version contains the following new features included
in the Ceph Controller API:
Introduced the ability to define a custom monitor endpoint using the
monitorIP field located in the nodes section of the
KaasCephCluster CR. This field allows defining the monitor IP address
from the Ceph public network range. For example:
roles: ["mon", "mgr"]
monitorIP: "196.168.13.1"
Added support for balancer mode for the Ceph Manager balancer module
using the settings.balancerMode field in the KaasCephCluster CR.
For example:
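A minimal sketch, assuming that the settings section resides under cephClusterSpec and using one of the standard Ceph balancer modes (upmap):
spec:
  cephClusterSpec:
    settings:
      balancerMode: upmap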
To allow the operator to use the GitOps approach, implemented the
BareMetalHostInventory resource, which must be used instead of
BareMetalHost for adding and modifying the configuration of bare metal servers.
The BareMetalHostInventory resource monitors and manages the state of a
bare metal server and is created for each Machine with all information
about the machine hardware configuration.
Each BareMetalHostInventory object is synchronized with an automatically
created BareMetalHost object, which is now used for internal purposes of
the Container Cloud private API.
Caution
Any change in the BareMetalHost object will be overwritten by
BareMetalHostInventory.
For any existing BareMetalHost object, a BareMetalHostInventory object
is created automatically during cluster update.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed only for
m:kaas@management-admin. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
Introduced automatic pausing of a MOSK cluster update using
the UpdateAutoPause object. The operator can now define specific StackLight
alerts that trigger auto-pause of an update phase execution. The feature
enhances update management of MOSK clusters by preventing
harmful changes from propagating across the entire cloud.
Granular cluster update through the Container Cloud web UI¶
Implemented the ability to granularly update a MOSK cluster
in the Container Cloud web UI using the ClusterUpdatePlan object. The
feature introduces a convenient way to perform and control every step of a
MOSK cluster update.
MOSK 25.1 introduces switching of the default container
runtime for the underlying Kubernetes cluster from Docker to containerd on
greenfield deployments. The use of containerd allows for better Kubernetes
performance and component updates without pod restarts when applying fixes for
CVEs.
On existing deployments, perform the mandatory migration from Docker to
containerd in the scope of MOSK 25.1.x. Otherwise, the
management cluster update to Container Cloud 2.30.0 will be blocked.
Important
Container runtime migration involves machine cordoning and
draining.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related Container Cloud release, refer
to Mirantis Container Cloud: Release Notes.
OpenStack¶
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
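For example, a dummy change can be an arbitrary annotation added to the affected service; the namespace, service name, and annotation key below are placeholders:
kubectl -n <namespace> annotate service <service-name> force-reconcile="$(date +%s)" --overwrite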
Tungsten Fabric¶
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
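For example, assuming the default Tungsten Fabric namespace and Cassandra pod naming (both are placeholders here):
kubectl -n tf exec -it <tf-cassandra-config-or-analytics-pod> -c cassandra -- nodetool status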
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
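A minimal sketch of these steps, where the namespace, pod names, and the Host ID of the outdated node are placeholders:
kubectl -n tf delete pod <cassandra-pod-on-replaced-node>
kubectl -n tf exec -it <healthy-cassandra-pod> -c cassandra -- nodetool status          # note the Host ID of the Down node with the outdated IP
kubectl -n tf exec -it <healthy-cassandra-pod> -c cassandra -- nodetool removenode <host-id-of-outdated-node>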
To work around the issue, manually adjust the affected dashboards to
restore their custom appearance.
[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce library and the MOSK implementation of session
refresh in sf-notifier results in uncontrolled growth of new logins and a lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly
in the Deployment object. This change is not persistent because the direct
change in the Deployment object will be reverted or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
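For example, to obtain the current image string, assuming the Deployment and container are both named sf-notifier:
kubectl -n stacklight get deployment sf-notifier -o jsonpath='{.spec.template.spec.containers[0].image}'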
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
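For example, assuming the container in the Deployment is named sf-notifier and using the original registry as a placeholder:
kubectl -n stacklight set image deployment/sf-notifier sf-notifier=<original-registry>/sf-notifier:v0.4-20240828023015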
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
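A minimal sketch of such an alert rule based on the kube_pod_container_info metric from kube-state-metrics; the alert name and the affected tags are placeholders:
- alert: SfNotifierAffectedImageTag
  expr: kube_pod_container_info{namespace="stacklight", container="sf-notifier", image=~".*sf-notifier:(<affected-tag-1>|<affected-tag-2>)"} > 0
  for: 15m
  labels:
    severity: warning
  annotations:
    summary: sf-notifier is running an image tag affected by the Salesforce relogin issue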
Container Cloud web UI¶
[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding any label to the control plane
machines or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and contains
different access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser in five minutes after the
project creation.
Update known issues¶
[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
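For example, assuming the default Tungsten Fabric namespace and Cassandra pod naming (placeholders), delete the pods one by one and wait for each pod to become Ready before proceeding to the next replica:
kubectl -n tf delete pod <tf-cassandra-config-pod-0>
kubectl -n tf get pod <tf-cassandra-config-pod-0>        # wait until the pod is Running and Ready
kubectl -n tf delete pod <tf-cassandra-config-pod-1>
# repeat for the remaining replicas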
Now, all other services in the Tungsten Fabric namespace should be in
the Active state.
[49705] Cluster update is stuck due to unhealthy tf-vrouter-agent-dpdk pods¶
During a MOSK cluster update, the tf-vrouter-agent-dpdk pods may become
unhealthy due to a failed LivenessProbe, causing the update process to get
stuck. The issue may only affect major updates when the cluster dataplane
components are restarted.
To work around the issue, manually remove the tf-vrouter-agent-dpdk
pods.
[43058] Resolved the issue with the CronJob for MariaDB that prevented
the OpenStackDeployment custom resource from transitioning to the
APPLYING state after changes.
[47269] Resolved the issue that prevented instances from live-migrating.
[48890] Resolved the issue that caused an extremely high load on the
gateway nodes.
[49678] Resolved the issue that caused the flapping
status (Configure → Ready → Configure → Ready) of machines
where any HostOSConfiguration object was targeted and migration to
containerd was applied.
[49340] Resolved the issue that caused failure
of tag-based log filtering using the tag_include parameter for
logging.externalOutputs when output_kind:audit is selected.
[45215] Resolved the performance issue in the
OpenStack PortProber Grafana dashboard that occurred when handling large
amounts of metrics over time ranges exceeding one hour.
Implemented recording rules and updated the dashboard to leverage them,
resulting in significant performance improvements. Note that
the updated dashboard displays only data collected after the cluster update.
To access older data, use the OpenStack PortProber [Deprecated]
dashboard that will be removed in one of the following releases due to being
unreliable when querying extended time ranges in high-load clusters.
[42660] Resolved the issue that caused the
Nova - Hypervisor Overview Grafana dashboard to display the load
average (per vCPU), allocated memory, and allocated disk (allocated by VMs)
instead of real CPU, memory, and disk utilization with data collected from
node-exporter and OpenStack Nova.
[39368] Resolved the issue that caused the
DockerSwarmNodeFlapping alert to fire during cluster update.
It is still expected to see the DockerSwarmNodeFlapping and
DockerSwarmServiceReplicasFlapping alerts firing during the cluster update to
Container Cloud 2.29.0, but only before the StackLight component is updated.
[39077] Resolved the issue that caused the
TelegrafGatherErrors alert for telegraf-docker-swarm to fire during
cluster update.
Reworked the TelegrafGatherErrors alert and replaced it with the
TelegrafSMARTGatherErrors and TelegrafDockerSwarmGatherErrors alerts.
It is still expected to see the TelegrafGatherErrors alert firing during
the cluster update to Container Cloud 2.29.0, but only before the StackLight
component is updated.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 25.1. Consider this information as a supplement to the generic
update procedure published in Operations Guide: Update a
MOSK cluster.
You can update to the 25.1 version from the following
cluster versions:
24.3 (released on October 16, 2024)
24.3.2 (released on February 03, 2025)
Important
Be advised that updating to version 25.1 will not be possible
from at least the upcoming 24.3.3 and 24.3.4 patches. For the detailed
cluster update schema, refer to Managed cluster update schema.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of the North-South
connectivity, depending on the type of virtual routers used by
a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
Host OS components
No impact
Instance network connectivity interruption up to 5 minutes
Host OS kernel
No impact
Restart of instances due to the hypervisor reboot [0]
[0] The host operating system needs to be rebooted for the kernel update
to be applied. Configure live migration of workloads to avoid the impact on the
instances running on a host.
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Since Ubuntu 20.04 reaches end-of-life in April 2025, MOSK
25.1 does not support the Cluster release update of the Ubuntu 20.04-based
clusters, and Ubuntu 22.04 becomes the only supported version of the host
operating system.
Therefore, ensure that all your MOSK clusters are running
Ubuntu 22.04 to unblock update of the management cluster to the Cluster release
16.4.1. For the Ubuntu upgrade procedure, refer to Upgrade an operating system distribution.
Caution
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users are
fully responsible for ensuring the compatibility of such custom components
with the latest supported Ubuntu version.
In MOSK 25.1 and Container Cloud 2.29.0, Grafana is updated
to version 11 where the following deprecated Angular-based plugins are
automatically migrated to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels before updating to MOSK 25.1
to prevent custom appearance issues after plugin migration.
Note
All Grafana dashboards provided by StackLight are also migrated to
React automatically. For the list of default dashboards, see
View Grafana dashboards.
Warning
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For managed clusters, you can perform the backup after the Container Cloud
2.29.0 release date but before updating them to MOSK
25.1.
Post-update actions¶
Hide sensitive ingress data for Ceph public endpoints¶
Since MOSK 25.1, you can hide ingress TLS certificates for
Ceph Object Gateway public endpoint in a secret object and use
tlsSecretRefName in the Ceph cluster spec. This configuration prevents
exposing sensitive data of Ceph public endpoints.
On existing clusters, Mirantis recommends updating the Ceph cluster spec
by replacing fields containing TLS certificates with tlsSecretRefName:
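For example, a minimal sketch of such a change; the exact nesting of the ingress configuration in the Ceph cluster spec may differ in your cluster, and the secret name is a placeholder:
spec:
  cephClusterSpec:
    ingressConfig:
      tlsSecretRefName: rgw-ingress-tls-secret
      # previously, the TLS certificate and key were specified inline in the ingress configuration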
Update the Alertmanager API v1 integrations to v2¶
Note
This step applies if you use the Alertmanager API v1 in your
integrations and configurations. Otherwise, skip this step.
In MOSK 25.1 and Container Cloud 2.29.0, the Alertmanager
API v1 is deprecated and will be removed in one of the upcoming
MOSK and Container Cloud releases. For details, see
Deprecation Notes.
Therefore, if you use API v1, update your integrations and configurations to
use the API v2 ensuring compatibility with new versions of Alertmanager.
This step applies if log forwarding to external destinations is
enabled. Otherwise, skip this step.
In the following major MOSK and Container Cloud releases,
the Fluentd plugin out_elasticsearch will be updated to the version that
no longer supports external output to opensearch.
Therefore, if you use opensearch as an external destination for logging and
used the elasticsearch value for the logging.externalOutputs[].type
parameter, change it to opensearch in the scope of Container Cloud 2.29.x
and MOSK 25.1.x release series. For the configuration
procedure, see Enable log forwarding to external destinations.
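For example, a minimal sketch of the corresponding StackLight configuration change; all fields except type are placeholders and may differ from your actual externalOutputs definition:
logging:
  externalOutputs:
    - name: my-external-logs
      type: opensearch        # previously: elasticsearch
      host: opensearch.example.com
      port: 9200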
MOSK 25.1 introduces several enhancements for monitoring of
RabbitMQ by StackLight, which include deprecation of some RabbitMQ metrics,
alerts, and dashboard. For details, see RabbitMQ monitoring rework.
If you use deprecated RabbitMQ metrics in customizations such as alerts and
dashboards, switch to the new metrics and dashboards within the course of the
MOSK 25.1 series to prevent issues once the deprecated
metrics and dashboard are removed.
Start using BareMetalHostInventory instead of BareMetalHost¶
MOSK 25.1 introduces the BareMetalHostInventory resource
that must be used instead of BareMetalHost for adding and modifying
configuration of bare metal servers. Therefore, if you need to modify an
existing or create a new configuration of a bare metal host, use
BareMetalHostInventory.
Each BareMetalHostInventory object is synchronized with an automatically
created BareMetalHost object, which is now used for internal purposes of
the Container Cloud private API.
Caution
Any change in the BareMetalHost object will be overwritten by
BareMetalHostInventory.
For any existing BareMetalHost object, a BareMetalHostInventory object
is created automatically during cluster update.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed only for
m:kaas@management-admin. This limitation is lifted once the
management cluster is updated to the Cluster release 16.4.1 or later.
Migrate container runtime from Docker to containerd¶
MOSK 25.1 introduces switching of the default container
runtime for the underlying Kubernetes cluster from Docker to containerd on
greenfield deployments.
On existing deployments, perform the mandatory migration from Docker to
containerd in the scope of MOSK 25.1.x. Otherwise, the
management cluster update to Container Cloud 2.30.0 will be blocked.
Important
Container runtime migration involves machine cordoning and
draining.
In total, since the MOSK 24.3 major release, 285 Common
Vulnerabilities and Exposures (CVEs) have been fixed in 25.1:
10 of critical and 275 of high severity.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 24.3.2 patch. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.29.0: Security notes.
Implemented full support for OpenStack Caracal with Open vSwitch and Tungsten
Fabric networking backends for greenfield deployments and for an upgrade from
OpenStack Antelope. To upgrade an existing cloud from OpenStack Antelope
to Caracal, follow the Upgrade OpenStack procedure.
Highlights from upstream OpenStack supported by
MOSK deployed on Caracal
Horizon:
Horizon added TOTP authentication support, allowing users to enhance their
security by authenticating with Time-based One-Time Passwords.
Manila:
Manila shares and access rules can now be locked against deletion.
A generic resource locks framework has been introduced to facilitate this.
Users can also hide sensitive fields of access rules with this feature.
Neutron:
Limit the rate at which instances can query the metadata service in order
to protect the OpenStack deployment from DoS or misbehaved instances.
A new API allows defining a set of security group rules to be used
automatically in every new default and/or custom security group created
for any project.
Nova:
It is now possible to define different authorization policies for
migration with and without a target host.
Since the OpenStack Caracal release, the Tungsten Fabric Horizon plugin
has been deprecated and removed. This change impacts the
Networking panel in OpenStack Horizon, which previously
allowed for managing Network IPAMs and Network policies. With the removal
of this plugin, Horizon no longer supports these features.
As a result, cloud operators may encounter Tungsten Fabric service networks
with snat-si in their names. These networks will be visible in
the network tabs and during the creation of ports or instances. Mirantis
advises cloud operators not to interact with these networks, as doing so
may cause system malfunctions.
Implemented full support for Ubuntu 22.04 LTS (Jammy Jellyfish) as the default
host operating system in MOSK clusters, including greenfield
deployments and update from Ubuntu 20.04 to 22.04 on existing clusters.
Ubuntu 20.04 is unsupported for greenfield deployments and is considered
deprecated during the MOSK 24.3 release cycle for existing
clusters.
Note
Since Container Cloud 2.27.0 (Cluster release 16.2.0), existing
MOSK management clusters were automatically updated to
Ubuntu 22.04 during cluster upgrade. Greenfield deployments of management
clusters are also based on Ubuntu 22.04.
Instance migration for non-administrative OpenStack users¶
Implemented the capability for non-administrative OpenStack users to migrate
instances, including both live and cold migrations. This functionality is
useful when performing different maintenance tasks including cloud updates,
handling noisy neighbors, and other operational needs.
Implemented the capability to connect external OpenID Connect (OIDC)
identity providers to MOSK Identity service (OpenStack
Keystone) directly through the OpenStackDeployment custom resource.
Introduced the ability to configure custom volume backends for
MOSK Block Storage service (OpenStack Cinder), enhancing
flexibility in storage management. Users can now define and deploy their own
backend configurations through the OpenStackDeployment custom resource.
Implemented the capability to automatically power off the guest instances
during the compute node shutdown or reboot through the ACPI power event.
This ensures the integrity of disk filesystems and prevents damage to running
applications during cluster updates.
Introduced general availability support for the MOSK
Shared Filesystems service (OpenStack Manila), allowing cloud users
to create and manage virtual file shares. This enables applications
to store data using common network file-sharing protocols such as CIFS,
NFS, and more.
Implemented the Diagnostic Controller that performs cluster self-diagnostics
to help the operator to easily understand, troubleshoot, and resolve potential
issues against the major cluster components, including OpenStack, Tungsten
Fabric, Ceph, and StackLight.
Running self-diagnostics is essential to ensure the overall health and optimal
performance of a cluster. Mirantis recommends running self-diagnostics
before cluster update, node replacement, or any other significant changes
in the cluster to prevent potential issues and optimize the maintenance window.
Implemented etcd as a backend for TaskFlow within Octavia, offering a scalable,
consistent, and fault-tolerant solution for persisting and managing task
states. This ensures that Octavia reliably handles distributed load balancing
tasks in a Kubernetes cluster.
Separated the vRouter provisioner from other Tungsten Fabric components.
Now, the vRouter provisioner is deployed as a separate DaemonSet
tf-vrouter-provisioner to allow for better control over
the vRouter components.
Automatic conversion to Tungsten Fabric Operator API v2¶
Implemented the automatic conversion of the Tungsten Fabric cluster
configuration API (TFOperator) v1alpha1 to the v2 version during update
to MOSK 24.3.
Since MOSK 24.3, the v2 TFOperator custom resource
should be used for any updates. The v1alpha1 TFOperator custom resource
will remain in the cluster but will no longer be reconciled and will be
automatically removed with the next cluster update.
Implemented monitoring of orphaned allocations in the MOSK
Compute service (OpenStack Nova). This feature simplifies the detection and
troubleshooting of orphaned resource allocations, ensuring that resources
are correctly assigned and utilized within the cloud infrastructure.
Implemented health monitoring of the PowerDNS backend for
MOSK DNS service (OpenStack Designate) using StackLight that
allows detecting and preventing PowerDNS issues. Started scraping a set of
metrics to monitor PowerDNS networking and detect server errors, failures, and
outages. Based on these metrics, added the dedicated
OpenStack PowerDNS Grafana dashboard and several alerts to notify
the operator of any detected issues.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related Container Cloud release, refer
to Mirantis Container Cloud: Release Notes.
OpenStack¶
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
During the upgrade to OpenStack Caracal, the masakari-db-sync Kubernetes
Job fails, preventing the Masakari API pods from initializing. The failure
occurs during the migration of the Masakari database from the legacy SQLAlchemy
Migrate to Alembic due to a misconfigured alembic_table.
Workaround:
Before upgrading to Caracal, pin the masakari_db_sync image to
the updated Caracal image by adding the following content
to the OpenStackDeployment custom resource:
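A minimal sketch of such an override, assuming the per-service image override structure of the OpenStackDeployment resource; the exact path and image reference are placeholders:
spec:
  services:
    instance-ha:
      masakari:
        values:
          images:
            tags:
              masakari_db_sync: <registry>/<updated-caracal-masakari-image>:<tag>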
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Tag-based filtering of logs using the tag_include parameter does not work
for the logging.externalOutputs feature when output_kind:audit is
selected.
For example, if the user wants to send only logs from the sudo
program and sets tag_include:sudo, none of the logs will be sent to an
external destination.
To work around the issue, allow forwarding of all audit logs in addition to
sudo, which include logs from sshd,
systemd-logind, and su. Instead of tag_include:sudo,
specify tag_include:'{sudo,systemd-audit}'.
Once the fix is applied in MOSK 25.1, filtering starts working automatically.
Update known issues¶
[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The designate-zone-setup Kubernetes job in the openstack namespace
fails during update to MOSK 24.3 with the following error present in the
logs of the job pod:
The issue occurs when the DNS service (OpenStack Designate) has any TLDs
created, but test is not among them. Because DNS service monitoring
was added in MOSK 24.3, it attempts to create a test zone test-zone.test
in the Designate service, which fails if the test TLD is missing.
To work around the issue, verify whether any TLDs are present
in the DNS service:
openstack tld list -f value -c name
If there are TLDs present and test is not one of them, create it:
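For example, using the OpenStack CLI:
openstack tld create --name test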
Warning
Do not create the test TLD if no TLDs were present
in the DNS service initially. In this case, the issue is caused by
a different factor, and creating the test TLD when none existed
before may disrupt users of both the DNS and Networking services.
[49705] Cluster update is stuck due to unhealthy tf-vrouter-agent-dpdk pods¶
During a MOSK cluster update, the tf-vrouter-agent-dpdk pods may become
unhealthy due to a failed LivenessProbe, causing the update process to get
stuck. The issue may only affect major updates when the cluster dataplane
components are restarted.
To work around the issue, manually remove the tf-vrouter-agent-dpdk
pods.
Container Cloud web UI¶
[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding any label to the control plane
machines or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and contains
different access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser in five minutes after the
project creation.
The following issues have been addressed in the MOSK
24.3 release:
[43966][Antelope] Improved parallel image downloading when Glance
is configured with Cinder backend.
[44813][Antelope] Resolved the issue that caused disruption
on trunk ports.
[40900][Tungsten Fabric] Resolved the issue that caused
Cassandra database to enter an infinite table creation or changing
state.
[46220][Tungsten Fabric] Resolved the issue that caused
subsequent cluster maintenance requests to get stuck on clusters running
Tungsten Fabric with API v2, after updating from MOSK 24.2 to 24.2.1.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 24.3. Consider this information as a supplement to the generic
update procedure published in Operations Guide: Update a
MOSK cluster.
According to the new cluster update schema introduced in the product in the
MOSK 24.2 series, you can update to the 24.3 version from
the following cluster versions:
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of the North-South
connectivity, depending on the type of virtual routers used by
a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
Host OS components
No impact
Instance network connectivity interruption up to 5 minutes
Host OS kernel
No impact
Restart of instances due to the hypervisor reboot [0]
[0] The host operating system needs to be rebooted for the kernel update
to be applied. Configure live migration of workloads to avoid the impact on the
instances running on a host.
To properly plan the update maintenance window, use the following
documentation:
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Pay special attention to [47602] Failed designate-zone-setup job blocks cluster update. Before performing the cluster
update, verify the DNS service (OpenStack Designate) for any created Top-Level
Domains (TLDs). If TLDs are present but the test TLD is missing, create
test according to the known issue description.
MOSK 24.3 release series is the last one to support
Ubuntu 20.04 as the host operating system. Ubuntu 20.04 reaches
end-of-life in April 2025. Therefore, Mirantis encourages all
MOSK users to upgrade their clusters to Ubuntu 22.04 as soon
as possible after getting to MOSK 24.3.
A host operating system upgrade requires reboot of the servers and can be
performed in small batches. For the detailed procedure of the Ubuntu upgrade,
refer to Upgrade an operating system distribution.
Warning
Update of management or MOSK clusters running
Ubuntu 20.04 will not be possible in the following major product version.
Caution
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users are
fully responsible for ensuring the compatibility of such custom components
with the latest supported Ubuntu version.
Start using new API for Tungsten Fabric configuration (TFOperator v2)¶
Since MOSK 24.3, the v2 TFOperator custom resource
becomes the default and the only way to manage the configuration of
Tungsten Fabric cluster. During update to MOSK 24.3,
the old v1alpha1 TFOperator custom resource will get automatically
converted to version v2.
Note
The v1alpha1 TFOperator custom resource remains in the cluster
but is no longer reconciled and will be automatically removed with
the next major cluster update.
In MOSK 25.1 and Container Cloud 2.29.0, Grafana will be
updated to version 11 where the following deprecated Angular-based plugins will
be automatically migrated to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels during the course of MOSK 24.3
and Container Cloud 2.28.x (Cluster releases 17.3.x and 16.3.x) to prevent
custom appearance issues after plugin migration in Container Cloud 2.29.0 and
MOSK 25.1.
Note
All Grafana dashboards provided by StackLight are also migrated to
React automatically. For the list of default dashboards, see
View Grafana dashboards.
Warning
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For managed clusters, you can perform the backup after the Container Cloud
2.29.0 release date but before updating them to MOSK
25.1.
In total, since the MOSK 24.2 major release, 1071 Common
Vulnerabilities and Exposures (CVEs) have been fixed in 24.3:
79 of critical and 992 of high severity.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 24.2.2 patch. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.28.0: Security notes.
The MOSK 24.3.1 patch includes the following updates:
Update of Mirantis Kubernetes Engine (MKE) to 3.7.17.
Update of Mirantis Container Runtime (MCR) to 23.0.15
(with containerd 1.6.36).
Important
As a result of the MCR version update, downtimes during
cluster updates are expected to be similar to those experienced
during a major version update. To accurately plan the cluster update,
refer to Update notes.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
release version. The common CVEs are issues addressed across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.28.4: Security notes.
The following issues have been addressed in the MOSK
24.3.1 release:
[47602][Update] Resolved the issue with
the designate-zone-setup job that blocked cluster update.
[47603][OpenStack] Resolved the issue that caused Masakari failure
during the OpenStack upgrade to Caracal.
[48160][OpenStack] Resolved the issue that caused instances
to fail booting when using a VFAT-formatted config drive.
[47174][Tungsten Fabric] Adjusted generation of affinity rules for
Redis for the clusters where analytics services are disabled.
[47717][Tungsten Fabric] Resolved the issue with the invalid
BgpAsn setting in tungstenfabric-operator.
[48153][Tungsten Fabric] Resolved the issue with OpenStack
generating duplicate floating IP addresses for Tungsten Fabric within
the same floating IP network, assigning them different IDs.
This section lists MOSK known issues with workarounds for
the MOSK release 24.3.1. For the known issues in the related
Container Cloud release, refer to Mirantis Container Cloud: Release Notes.
OpenStack¶
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Tag-based filtering of logs using the tag_include parameter does not work
for the logging.externalOutputs feature when output_kind:audit is
selected.
For example, if the user wants to send only logs from the sudo
program and sets tag_include:sudo, none of the logs will be sent to an
external destination.
To work around the issue, allow forwarding of all audit logs in addition to
sudo, which include logs from sshd,
systemd-logind, and su. Instead of tag_include:sudo,
specify tag_include:'{sudo,systemd-audit}'.
Once the fix is applied in MOSK 25.1, filtering starts working automatically.
[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce library and the MOSK implementation of session
refresh in sf-notifier results in uncontrolled growth of new logins and a lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly
in the Deployment object. This change is not persistent because the direct
change in the Deployment object will be reverted or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
Update known issues¶
[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The cluster is affected if orphaned containers with the k8s_ prefix are
present on the affected nodes:
docker ps -a --format '{{ .Names }}' | grep '^k8s_'
Workaround:
Inspect recent Ansible logs at /var/log/lcm/* and make sure that the
only failed task during migration is Delete running pods. If so, proceed
to the next step. Otherwise, contact Mirantis support for further
information.
Stop and remove orphaned containers with the k8s_ prefix.
Note
This action has no impact on the cluster because the nodes are
already cordoned and drained as part of the maintenance window.
[49678] The Machine status is flapping after migration to containerd¶
On cluster machines where any HostOSConfiguration object is targeted and
migration to containerd is applied, the machine status may be flapping
(Configure → Ready → Configure → Ready) with the
HostOSConfiguration-related Ansible tasks constantly restarting. This
occurs due to the HostOSConfiguration object state items being constantly
added and then removed from related LCMMachine objects.
To work around the issue, temporarily disable all HostOSConfiguration
objects until the issue is resolved. The expected Container Cloud release with
the issue resolution is targeted to Container Cloud 2.29.0, after the
management cluster update to the Cluster release 16.4.0.
To disable HostOSConfiguration objects:
In the machineSelector:matchLabels section of every
HostOSConfiguration object, remove the corresponding label selectors for
cluster machines.
Wait for each HostOSConfiguration object status to be updated
and the machinesStates field to be absent:
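For example, assuming the project namespace and object name below are placeholders; the output should be empty once the field is absent:
kubectl -n <project-namespace> get hostosconfiguration <object-name> -o yaml | grep machinesStates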
Once the issue is resolved in the target release, re-enable all objects using
the same procedure.
Container Cloud web UI¶
[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding any label to the control plane
machines or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and contains
different access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser in five minutes after the
project creation.
Although MOSK 24.3.1 is classified as a patch release, as a
cloud operator, you will be performing a major update regardless of the upgrade
path: whether you are upgrading from patch 24.2.5 or major version 24.3. This
is because of the Mirantis Container Runtime update to 23.0.15.
This section describes the specific actions you need to complete to accurately
plan and successfully perform the update. Consider this information as
a supplement to the generic update procedure published in
Operations Guide: Update a MOSK cluster.
Optional migration of container runtime from Docker to containerd¶
MOSK 24.3.1 enables optional migration of container runtime
from Docker to containerd. Usage of containerd introduces a major enhancement
for cloud workloads running on top of MOSK, as it helps
minimize the network connectivity downtime caused by underlying infrastructure
updates.
To minimize the number of required maintenance windows, Mirantis recommends
that cloud operators switch to the containerd runtime on the nodes
simultaneously with their upgrade to Ubuntu 22.04 if the Ubuntu upgrade has
not been completed before.
Note
Migration from Docker to containerd will become mandatory in
MOSK 25.1.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of the North-South
connectivity, depending on the type of virtual routers used by
a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
Host OS components
No impact
Instance network connectivity interruption up to 5 minutes
Host OS kernel
No impact
Restart of instances due to the hypervisor reboot [0]
[0] The host operating system needs to be rebooted for the kernel update
to be applied. Configure live migration of workloads to avoid the impact on the
instances running on a host.
To properly plan the update maintenance window, use the following
documentation:
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Post-update actions¶
Upgrade Ubuntu to 22.04 and migrate to containerd runtime¶
MOSK 24.3 release series is the last release series
to support Ubuntu 20.04 as the host operating system. Ubuntu 20.04 reaches
end-of-life in April 2025. Therefore, Mirantis encourages all
MOSK users to upgrade their clusters to Ubuntu 22.04
as soon as possible after getting to the 24.3 series. A host operating
system upgrade requires reboot of the nodes and can be performed in small
batches.
Also, MOSK 24.3.1 introduces the new container runtime for
the underlying Kubernetes cluster - containerd. The migration from Docker to
containerd in 24.3.1 is optional and requires node cordoning and draining.
Therefore, if you decide to start using containerd and have not yet upgraded
to Ubuntu 22.04, Mirantis highly recommends applying these two changes
simultaneously to every node to minimize downtime for cloud workloads and
users.
In MOSK 24.3.x, the default container runtime
remains Docker for greenfield deployments. Support for greenfield
deployments based on containerd will be announced in one of the following
releases.
Warning
Update of management or MOSK clusters running
Ubuntu 20.04 or Docker runtime will not be possible in the following product
series.
Caution
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users are
fully responsible for ensuring the compatibility of such custom components
with the latest supported Ubuntu version.
In MOSK 25.1 and Container Cloud 2.29.0, Grafana will be
updated to version 11 where the following deprecated Angular-based plugins will
be automatically migrated to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels during the course of MOSK 24.3
and Container Cloud 2.28.x (Cluster releases 17.3.x and 16.3.x) to prevent
custom appearance issues after plugin migration in Container Cloud 2.29.0 and
MOSK 25.1.
Note
All Grafana dashboards provided by StackLight are also migrated to
React automatically. For the list of default dashboards, see
View Grafana dashboards.
Warning
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For managed clusters, you can perform the backup after the Container Cloud
2.29.0 release date but before updating them to MOSK
25.1.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
release version. The common CVEs are issues addressed across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.28.5: Security notes.
The following issues have been addressed in the MOSK
24.3.2 release:
[47115][OpenStack] Resolved the issue that prevented the virtual
machine with a floating IP address assigned to it from reaching the gateway
node located on the same hypervisor.
[48274][OpenStack] Resolved the issue that caused the failure
of live migrations if multiple networks existed on an instance.
[48571][OpenStack] Resolved the issue that caused Keystone and
DNS downtimes during the OpenStack controller replacement.
[48614][OpenStack] Resolved the issue that prevented the PXE boot
from downloading the iPXE file due to restrictive permissions on the
tftpboot directory.
This section lists MOSK known issues with workarounds for
the MOSK release 24.3.2. For the known issues in the related
Container Cloud release, refer to Mirantis Container Cloud: Release Notes.
OpenStack¶[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned, whereas the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Tag-based filtering of logs using the tag_include parameter does not work
for the logging.externalOutputs feature when output_kind:audit is
selected.
For example, if the user wants to send only logs from the sudo
program and sets tag_include:sudo, none of the logs will be sent to an
external destination.
To work around the issue, allow forwarding of all audit logs, which, in
addition to sudo, include logs from sshd, systemd-logind, and su.
Instead of tag_include:sudo, specify tag_include:'{sudo,systemd-audit}'.
Once the fix is delivered in MOSK 25.1, filtering will start working
automatically.
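For illustration, a minimal sketch of the corresponding StackLight
logging.externalOutputs snippet; the output name and the destination-specific
parameters are placeholders, and only output_kind and tag_include reflect the
workaround above:
logging:
  externalOutputs:
    audit-to-siem:                          # example output name
      output_kind: audit
      # tag_include: sudo                   # affected: nothing gets forwarded
      tag_include: '{sudo,systemd-audit}'   # workaround: forward all audit logs
      # ... destination-specific parameters (type, host, port, and so on)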
[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce and the MOSK implementation of session refresh
in sf-notifier results in the uncontrolled growth of new logins and lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly in the
Deployment object. This change is not persistent because it will be reverted
or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
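The following is a minimal illustrative sketch of such an alerting rule,
assuming that kube-state-metrics exposes the kube_pod_container_info metric
and that the container is named sf-notifier; the alert name, the affected-tag
placeholders, and the severity are examples only, and the exact place to
define the rule is described in alert-configuration:
- alert: SfNotifierAffectedImageTag
  annotations:
    summary: sf-notifier is running an affected image tag
    description: Replace the sf-notifier image tag as described in the known issue 51524.
  expr: >-
    kube_pod_container_info{namespace="stacklight",container="sf-notifier",
    image=~".*:(<affected-tag-1>|<affected-tag-2>)$"} > 0
  for: 10m
  labels:
    severity: warning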
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The cluster is affected if orphaned containers with the k8s_ prefix are
present on the affected nodes:
docker ps -a --format '{{ .Names }}' | grep '^k8s_'
Workaround:
Inspect recent Ansible logs at /var/log/lcm/* and make sure that the
only failed task during migration is Delete running pods. If so, proceed
to the next step. Otherwise, contact Mirantis support for further
information.
Stop and remove orphaned containers with the k8s_ prefix.
Note
This action has no impact on the cluster because the nodes are
already cordoned and drained as part of the maintenance window.
[49678] The Machine status is flapping after migration to containerd¶
On cluster machines where any HostOSConfiguration object is targeted and
migration to containerd is applied, the machine status may be flapping
(Configure → Ready → Configure → Ready) with the
HostOSConfiguration-related Ansible tasks constantly restarting. This
occurs due to the HostOSConfiguration object state items being constantly
added and then removed from related LCMMachine objects.
To work around the issue, temporarily disable all HostOSConfiguration
objects until the issue is resolved. The resolution is expected in
Container Cloud 2.29.0, after the management cluster update to the Cluster
release 16.4.0.
To disable HostOSConfiguration objects:
In the machineSelector:matchLabels section of every
HostOSConfiguration object, remove the corresponding label selectors for
cluster machines. See the example after this procedure.
Wait for each HostOSConfiguration object status to be updated
and the machinesStates field to be absent:
Once the issue is resolved in the target release, re-enable all objects using
the same procedure.
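For illustration, a hypothetical HostOSConfiguration fragment showing step 1
of the procedure above; the API version, object name, and label are examples
and must be taken from your actual objects:
apiVersion: kaas.mirantis.com/v1alpha1   # assumed API group for HostOSConfiguration
kind: HostOSConfiguration
metadata:
  name: example-host-tuning              # example object name
spec:
  machineSelector:
    matchLabels: {}                      # label selectors removed to disable the object
    # matchLabels:                       # original selector, restore it to re-enable:
    #   hostos-config: enabled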
[49705] Cluster update is stuck due to unhealthy tf-vrouter-agent-dpdk pods¶
During a MOSK cluster update, the tf-vrouter-agent-dpdk pods may become
unhealthy due to a failed LivenessProbe, causing the update process to get
stuck. The issue may only affect major updates when the cluster dataplane
components are restarted.
To work around the issue, manually remove the tf-vrouter-agent-dpdk
pods.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane:false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser five minutes after the
project creation.
This section describes the specific actions you need to complete to accurately
plan and successfully perform the update. Consider this information as
a supplement to the generic update procedures published in
Operations Guide: Cluster update.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
Major update impact and maintenance windows planning¶
The following table provides details on the update impact on
a MOSK cluster.
Host operating system needs to be rebooted for the kernel update
to be applied. Configure live-migration of workloads to avoid the impact on the instances running
on a host.
To properly plan the update maintenance window, use the following
documentation:
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Post-update actions¶Upgrade Ubuntu to 22.04 and migrate to containerd runtime¶
MOSK 24.3 release series is the last release series
to support Ubuntu 20.04 as the host operating system. Ubuntu 20.04 reaches
end-of-life in April 2025. Therefore, Mirantis encourages all
MOSK users to upgrade their clusters to Ubuntu 22.04
as soon as possible after getting to the 24.3 series. A host operating
system upgrade requires reboot of the nodes and can be performed in small
batches.
Also, MOSK 24.3.1 introduces the new container runtime for
the underlying Kubernetes cluster - containerd. The migration from Docker to
containerd in 24.3.2 is still optional and requires node cordoning and
draining. Therefore, if you decide to start using containerd and have not
yet upgraded to Ubuntu 22.04, Mirantis highly recommends that these two
changes be applied simultaneously to every node to minimize downtime for
cloud workloads and users.
In MOSK 24.3.x, the default container runtime
remains Docker for greenfield deployments. Support for greenfield
deployments based on containerd will be announced in one of the following
releases.
Warning
Update of management or MOSK clusters running
Ubuntu 20.04 or Docker runtime will not be possible in the following product
series.
Usage of third-party software, which is not part of Mirantis-supported
configurations, for example, the use of custom DPDK modules, may block
upgrade of an operating system distribution. Users are fully responsible
for ensuring the compatibility of such custom components with the latest
supported Ubuntu version.
If you have not upgraded the operating system distribution on your machines
to 22.04 yet, Mirantis recommends migrating machines from Docker to containerd
on managed clusters together with distribution upgrade to minimize the
maintenance window. In this case, ensure that all cluster machines are
updated at once during the same maintenance window to prevent machines
from running different container runtimes.
In MOSK 25.1 and Container Cloud 2.29.0, Grafana will be
updated to version 11 where the following deprecated Angular-based plugins will
be automatically migrated to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels during the course of MOSK 24.3
and Container Cloud 2.28.x (Cluster releases 17.3.x and 16.3.x) to prevent
custom appearance issues after plugin migration in Container Cloud 2.29.0 and
MOSK 25.1.
Note
All Grafana dashboards provided by StackLight are also migrated to
React automatically. For the list of default dashboards, see
View Grafana dashboards.
Warning
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For managed clusters, you can perform the backup after the Container Cloud
2.29.0 release date but before updating them to MOSK
25.1.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
release version. The common CVEs are issues addressed across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.29.1: Security notes.
This section lists MOSK known issues with workarounds for
the MOSK release 24.3.3. For the known issues in the related
Container Cloud release, refer to Mirantis Container Cloud: Release Notes.
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The cluster is affected if orphaned containers with the k8s_ prefix are
present on the affected nodes:
docker ps -a --format '{{ .Names }}' | grep '^k8s_'
Workaround:
Inspect recent Ansible logs at /var/log/lcm/* and make sure that the
only failed task during migration is Delete running pods. If so, proceed
to the next step. Otherwise, contact Mirantis support for further
information.
Stop and remove orphaned containers with the k8s_ prefix.
Note
This action has no impact on the cluster because the nodes are
already cordoned and drained as part of the maintenance window.
[49678] The Machine status is flapping after migration to containerd¶
On cluster machines where any HostOSConfiguration object is targeted and
migration to containerd is applied, the machine status may be flapping
(Configure → Ready → Configure → Ready) with the
HostOSConfiguration-related Ansible tasks constantly restarting. This
occurs due to the HostOSConfiguration object state items being constantly
added and then removed from related LCMMachine objects.
To work around the issue, temporarily disable all HostOSConfiguration
objects until the issue is resolved. The resolution is expected in
Container Cloud 2.29.0, after the management cluster update to the Cluster
release 16.4.0.
To disable HostOSConfiguration objects:
In the machineSelector:matchLabels section of every
HostOSConfiguration object, remove the corresponding label selectors for
cluster machines.
Wait for each HostOSConfiguration object status to be updated
and the machinesStates field to be absent:
OpenStack¶[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts.
Workaround:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned, whereas the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Tag-based filtering of logs using the tag_include parameter does not work
for the logging.externalOutputs feature when output_kind:audit is
selected.
For example, if the user wants to send only logs from the sudo
program and sets tag_include:sudo, none of the logs will be sent to an
external destination.
To work around the issue, allow forwarding of all audit logs, which, in
addition to sudo, include logs from sshd, systemd-logind, and su.
Instead of tag_include:sudo, specify tag_include:'{sudo,systemd-audit}'.
Once the fix is delivered in MOSK 25.1, filtering will start working
automatically.
[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce and the MOSK implementation of session refresh
in sf-notifier results in the uncontrolled growth of new logins and lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly in the
Deployment object. This change is not persistent because it will be reverted
or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
Container Cloud web UI¶[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser five minutes after the
project creation.
[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane:false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
This section describes the specific actions you need to complete to accurately
plan and successfully perform the update. Consider this information as
a supplement to the generic update procedures published in
Operations Guide: Cluster update.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
Major update impact and maintenance windows planning¶
The following table provides details on the update impact on
a MOSK cluster.
Host operating system needs to be rebooted for the kernel update
to be applied. Configure live-migration of workloads to avoid the impact on the instances running
on a host.
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Pre-update actions¶Update MOSK clusters to Ubuntu 22.04¶
Management cluster update to Container Cloud 2.29.1 will be blocked if at least
one node of any related MOSK cluster is running Ubuntu
20.04.
Therefore, ensure that every node of your MOSK clusters
is running Ubuntu 22.04 to unblock the management cluster update to Container
Cloud 2.29.1 and the MOSK cluster update to 24.3.3.
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users
are fully responsible for ensuring the compatibility of such custom
components with the latest supported Ubuntu version.
Migrate container runtime from Docker to containerd¶
Since 24.3.1, MOSK introduces the new container runtime for
the underlying Kubernetes cluster - containerd. The migration from Docker to
containerd in 24.3.3 is still optional and requires node cordoning and
draining.
If you decide to start using containerd and have not yet upgraded to Ubuntu
22.04, Mirantis highly recommends that these two changes be applied
simultaneously to every node to minimize downtime for cloud workloads
and users. In this case, ensure that all cluster machines are
updated at once during the same maintenance window to prevent machines
from running different container runtimes.
In MOSK 24.3.x, the default container runtime
remains Docker for greenfield deployments. Support for greenfield
deployments based on containerd is added in Container Cloud 2.29.0
(Cluster release 16.4.0) for management clusters and in
MOSK 25.1 for MOSK clusters.
In MOSK 25.1 and Container Cloud 2.29.0, Grafana will be
updated to version 11 where the following deprecated Angular-based plugins will
be automatically migrated to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels during the course of MOSK 24.3
and Container Cloud 2.28.x (Cluster releases 17.3.x and 16.3.x) to prevent
custom appearance issues after plugin migration in Container Cloud 2.29.0 and
MOSK 25.1.
Note
All Grafana dashboards provided by StackLight are also migrated to
React automatically. For the list of default dashboards, see
View Grafana dashboards.
Warning
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For MOSK clusters, you can perform the backup after the
Container Cloud 2.29.0 release date but before updating them to
MOSK 25.1.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
release version. The common CVEs are issues addressed across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.29.2: Security notes.
This section lists MOSK known issues with workarounds for
the MOSK release 24.3.4. For the known issues in the related
Container Cloud release, refer to Mirantis Container Cloud: Release Notes.
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The cluster is affected if orphaned containers with the k8s_ prefix are
present on the affected nodes:
docker ps -a --format '{{ .Names }}' | grep '^k8s_'
Workaround:
Inspect recent Ansible logs at /var/log/lcm/* and make sure that the
only failed task during migration is Delete running pods. If so, proceed
to the next step. Otherwise, contact Mirantis support for further
information.
Stop and remove orphaned containers with the k8s_ prefix.
Note
This action has no impact on the cluster because the nodes are
already cordoned and drained as part of the maintenance window.
[49678] The Machine status is flapping after migration to containerd¶
On cluster machines where any HostOSConfiguration object is targeted and
migration to containerd is applied, the machine status may be flapping
(Configure → Ready → Configure → Ready) with the
HostOSConfiguration-related Ansible tasks constantly restarting. This
occurs due to the HostOSConfiguration object state items being constantly
added and then removed from related LCMMachine objects.
To work around the issue, temporarily disable all HostOSConfiguration
objects until the issue is resolved. The resolution is expected in
Container Cloud 2.29.0, after the management cluster update to the Cluster
release 16.4.0.
To disable HostOSConfiguration objects:
In the machineSelector:matchLabels section of every
HostOSConfiguration object, remove the corresponding label selectors for
cluster machines.
Wait for each HostOSConfiguration object status to be updated
and the machinesStates field to be absent:
OpenStack¶[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts.
Workaround:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned, whereas the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Tag-based filtering of logs using the tag_include parameter does not work
for the logging.externalOutputs feature when output_kind:audit is
selected.
For example, if the user wants to send only logs from the sudo
program and sets tag_include:sudo, none of the logs will be sent to an
external destination.
To work around the issue, allow forwarding of all audit logs, which, in
addition to sudo, include logs from sshd, systemd-logind, and su.
Instead of tag_include:sudo, specify tag_include:'{sudo,systemd-audit}'.
Once the fix is delivered in MOSK 25.1, filtering will start working
automatically.
[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce and the MOSK implementation of session refresh
in sf-notifier results in the uncontrolled growth of new logins and lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly in the
Deployment object. This change is not persistent because it will be reverted
or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
Container Cloud web UI¶[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser five minutes after the
project creation.
[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane:false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
This section describes the specific actions you need to complete to accurately
plan and successfully perform the update. Consider this information as
a supplement to the generic update procedures published in
Operations Guide: Cluster update.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
Major update impact and maintenance windows planning¶
The following table provides details on the update impact on
a MOSK cluster.
Host operating system needs to be rebooted for the kernel update
to be applied. Configure live-migration of workloads to avoid the impact on the instances running
on a host.
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Post-update actions¶Mandatory migration of container runtime from Docker to containerd¶
Migration of container runtime from Docker to containerd, which is implemented
for existing management and managed clusters, becomes mandatory in the scope
of Container Cloud 2.29.x. Otherwise, the management cluster update to 2.30.0
will be blocked.
The use of containerd allows for better Kubernetes performance and component
update without pod restart when applying fixes for CVEs. For the migration
procedure, refer to Migrate container runtime from Docker to containerd.
Important
Container runtime migration involves machine cordoning and
draining.
Implemented the technical preview support for OpenStack Caracal for
greenfield deployments.
To start experimenting with the new functionality, set openstack_version to
caracal in the OpenStackDeployment custom resource during the cloud
deployment.
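For example, a minimal OpenStackDeployment fragment that enables the preview
(all other mandatory fields of the custom resource are omitted):
spec:
  openstack_version: caracal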
Completely removed the OpenStackDeploymentSecret custom resource, which was
previously used to aggregate cloud confidential settings.
Sensitive information within the OpenStackDeployment object can be hidden
using the value_from directive. This enhancement was introduced in
MOSK 23.1 and allows for better management of confidential
data without the need for a separate custom resource.
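As an illustration, a sketch of hiding a certificate value through
value_from, assuming the secret_key_ref form described in the MOSK
configuration reference; the feature path, Secret name, and key below are
examples only:
spec:
  features:
    ssl:
      public_endpoints:
        api_cert:
          value_from:
            secret_key_ref:
              name: osh-dev-hidden   # example Secret in the OpenStack namespace
              key: api_cert          # example key inside the Secret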
Implemented the possibility to specify the raw option for the image storage
backend through the spec.features section of the OpenStackDeployment
custom resource.
Network port availability monitoring (Portprober)¶
TechPreview
Added support for the network port availability monitoring service
(Portprober). The service is implemented as an extension to the
OpenStack Neutron service and gets enabled automatically together with
the floating IP address availability monitoring service (Cloudprober).
Portprober is available for the MOSK clusters
running OpenStack Antelope or a newer version and using the Neutron
OVS backend for networking.
Implemented the technical preview support for Dynamic Resource Balancer
(DRB) service that enables cloud operators to continuously ensure optimal
placement of their workloads.
Introduced full support for Tungsten Fabric Operator API v2. All greenfield
deployments now use v2 by default. After the update, existing deployments
can convert their existing v1alpha1 TFOperator objects to v2.
The new API version aligns with the OpenStack Controller API and provides
a better interface for advanced configurations. The Tungsten Fabric
configuration documentation provides configuration examples for both API
v1alpha1 and API v2.
Implemented the capability to disable spoof checking on the SR-IOV enabled
ports of some virtual network functions. This enhancement streamlines
the control over SR-IOV spoof check within Tungsten Fabric, offering
cloud operators a more seamless experience.
Implemented the capability to specify the target nodes for hosting Tungsten
Fabric SNAT and load balancer namespaces. This feature helps preserve
resources on compute nodes running highly sensitive workloads.
Upgraded Ceph major version from Quincy 17.2.7 to Reef 18.2.3 with an automatic
upgrade of Ceph components during the Cluster version update.
Ceph Reef delivers a new version of RocksDB, which provides better IO
performance. This version also supports RGW multisite re-sharding and
contains overall security improvements.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
OpenStack¶[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts.
Workaround:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned, whereas the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Occasionally, after the credential rotation, OpenStack Controller Exporter
fails to scrub the metrics. To work around the issue, restart the pod
with openstack-controller-exporter.
[43058] [Antelope] Cronjob for MariaDB is not created¶
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines where the MTU of the network
interfaces of the guest operating system is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the respective image
override to the OpenStackDeployment custom resource.
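A hypothetical sketch of the general shape of such an override follows; the
image tag keys and the image reference are placeholders, and the actual
values for the affected release are provided by Mirantis:
spec:
  services:
    networking:
      neutron:
        values:
          images:
            tags:
              # Placeholder keys and image reference only
              neutron_server: <pinned-neutron-image>
              neutron_l3: <pinned-neutron-image>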
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
During initial deployment of a Tungsten Fabric cluster, there is a known issue
where the Cassandra database may enter an infinite table creation or changing
state. This results in the Tungsten Fabric configuration pods failing to reach
the Ready state.
The root cause of this issue is a schema mismatch within Cassandra.
To verify whether the cluster is affected:
The symptoms of this issue can be observed by verifying the Tungsten Fabric
configuration pods:
The events above indicate that the configuration services remain in the
initializing state after deployment due to inability to connect to
the database. As a result, liveness and readiness probes fail, and
the pods continuously restart.
Additionally, each node of Cassandra configuration database logs similar
errors:
After the pod is restarted, monitor the status of other Tungsten Fabric pods.
If they become Ready within two minutes, the issue is resolved. Otherwise,
inspect the latest Cassandra logs in other pods and restart any other pods
exhibiting the same pattern of errors:
Repeat the process until all affected pods become Ready.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
This section lists the update known issues with workarounds for the
MOSK release 24.2.
[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
This section lists the Container Cloud web UI known issues with workarounds
for the MOSK release 24.2.
[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane:false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser five minutes after the
project creation.
The following issues have been addressed in the MOSK
24.2 release:
[OpenStack][36524] Resolved the issue causing etcd to enter
a panic state after replacement of the controller node.
[OpenStack][39768] Resolved the issue that caused the OpenStack
controller exporter to fail to initialize within the default timeout on
large (500+ compute nodes) clusters.
[OpenStack][42390] Resolved the issue that caused the absence of
caching for PowerDNS.
[Ceph][42903] Resolved the issue that prevented ceph-controller
from correct handling of missing pools.
[Update][41810] Resolved the cluster update issue caused by the
OpenStack controller flooding.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 24.2.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of the North-South
connectivity, depending on the type of virtual routers used by
a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
Host OS components
No impact
Instance network connectivity interruption up to 5 minutes
Host OS kernel
No impact
Restart of instances due to the hypervisor reboot
Host operating system needs to be rebooted for the kernel update
to be applied. Configure live-migration of workloads to avoid the impact on the instances running
on a host.
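For example, a workload can be moved off a host before the reboot with a live migration; the command below is a generic OpenStack CLI sketch with a placeholder instance ID, not a MOSK-specific procedure:
openstack server migrate --live-migration <instance-ID>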
To properly plan the update maintenance window, use the following
documentation:
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Known issues.
Pre-update actions¶Verify that OpenStackDeploymentSecret is not in use¶
MOSK 24.2 includes the updated patch version of
the Cassandra database. With the cluster update, Cassandra is updated
from 3.11.10 to 3.11.17.
Additionally, the connectivity method between the Tungsten Fabric
services and Cassandra database clusters changes from Thrift to
Cassandra Query Language (CQL) protocol.
Therefore, Mirantis highly recommends backing up your Cassandra database
before updating a MOSK cluster with Tungsten Fabric
to 24.2. For the procedure, refer to Back up TF databases.
Post-update actions¶Convert v1alpha1 TFOperator custom resource to v2¶
In MOSK 24.2, the Tungsten Fabric API v2 becomes default
for new deployments and includes the ability to convert existing v1alpha1
TFOperator to v2.
Conversion of TFOperator causes recreation of the Tungsten Fabric service
pods. Therefore, Mirantis recommends performing the conversion within a
maintenance window during or after the update. The conversion is optional
in MOSK 24.2.
If your cluster runs Tungsten Fabric analytics services and you want to obtain
a more lightweight setup, you can disable these services through the custom
resource of the Tungsten Fabric Operator. For the details, refer to the
Tungsten Fabric analytics services deprecation notice.
With the MOSK 24.2 series, the OpenStack Yoga version
is being deprecated. Therefore, Mirantis encourages you to upgrade
to Antelope to start benefitting from the enhanced functionality and
new features of the newer OpenStack release.
MOSK allows for direct upgrade from Yoga to
Antelope, without the need to upgrade to the intermediate Zed release.
To upgrade the cloud, complete the Upgrade OpenStack procedure.
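At a high level, the upgrade comes down to switching the OpenStack version in the OpenStackDeployment custom resource, for example to antelope; the snippet below is only an illustrative sketch of that change, so follow the linked procedure for the authoritative steps and prerequisites:
spec:
  openstack_version: antelope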
Important
There are several known issues affecting MOSK clusters running
OpenStack Antelope that can disrupt the network connectivity of the cloud
workloads.
If your cluster is still running OpenStack Yoga, update to the MOSK 24.2.1
patch release first and only then upgrade to OpenStack Antelope. If you
have not been applying patch releases previously and would prefer to switch
back to major releases-only mode, you will be able to do this when MOSK 24.3
is released.
If you have updated your cluster to OpenStack Antelope, apply the
workarounds described in Release notes: OpenStack known issues for the following issues:
[45879] [Antelope] Incorrect packet handling between instance and
its gateway
[44813] Traffic disruption observed on trunk ports
In total, since the MOSK 24.1 major release, 771 Common Vulnerabilities
and Exposures (CVEs) have been fixed in 24.2:
39 of critical and 732 of high severity.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 24.1.5 patch. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.27.0: Security notes.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.2.1 patch:
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.27.3: Security notes.
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
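A hedged sketch of these steps, assuming the openstack namespace, an affected mariadb-server-2 Pod, and the application=mariadb label (adjust names and labels to your deployment; kubectl cp also requires tar inside the container):
kubectl -n openstack cp mariadb-server-2:/var/lib/mysql ./mariadb-server-2-mysql-backup
kubectl -n openstack get pods -l application=mariadb
kubectl -n openstack exec mariadb-server-2 -- rm -f /var/lib/mysql/galera.cache
kubectl -n openstack delete pod mariadb-server-2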
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
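For example, a dummy change can be as simple as adding or refreshing a throwaway annotation on the affected Service; the namespace and service name below are placeholders:
kubectl -n openstack annotate service <service-name> dummy-update="$(date +%s)" --overwrite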
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
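Assuming the pod name starts with rockoon (verify it with kubectl get pods), the restart is a plain pod deletion:
kubectl -n osh-system get pods | grep rockoon
kubectl -n osh-system delete pod <rockoon-pod-name>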
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
On clusters running Tungsten Fabric with API v2, after updating from MOSK 24.2
to 24.2.1, subsequent cluster maintenance requests may get stuck. The root cause
of the issue is a version mismatch within the internal structures of
the Tungsten Fabric Operator.
Output similar to the one below indicates that the Tungsten Fabric
ClusterWorkloadLock remains in the active state indefinitely preventing
further LCM operations with other components:
apiVersion: lcm.mirantis.com/v1alpha1
kind: ClusterWorkloadLock
metadata:
  creationTimestamp: "2024-08-30T13:50:33Z"
  generation: 1
  name: tf-openstack-tf
  resourceVersion: "4414649"
  uid: 582fc558-c343-4e96-a445-a2d1818dcdb2
spec:
  controllerName: tungstenfabric
status:
  errorMessage: cluster is not in ready state
  release: 17.2.4+24.2.2
  state: active
Additionally, the LCM controller logs may contain errors similar to:
{"level":"info","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/maintenance.go:178","msg":"ClusterWorkloadLock is inactive cwl {{ClusterWorkloadLock lcm.mirantis.com/v1alpha1} {ceph-clusterworkloadlock a45eca91-cd7b-4d68-9a8e-4d656b4308af 3383288 1 2024-08-30 13:15:14 +0000 UTC <nil> <nil> map[] map[miraceph-ready:true] [{v1 Namespace ceph-lcm-mirantis 43853f67-9058-44ed-8287-f650dbeac5d7 <nil> <nil>}][] [{ceph-controller Update lcm.mirantis.com/v1alpha1 2024-08-30 13:25:53 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:miraceph-ready\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"43853f67-9058-44ed-8287-f650dbeac5d7\\\"}\":{}}},\"f:spec\":{\".\":{},\"f:controllerName\":{}}} } {ceph-controller Update lcm.mirantis.com/v1alpha1 2024-09-02 10:48:27 +0000 UTC FieldsV1 {\"f:status\":{\".\":{},\"f:release\":{},\"f:state\":{}}} status}]} {ceph} {inactive 17.2.4+24.2.2}}","ns":"child-ns-tf","name":"child-cl"}{"level":"info","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/maintenance.go:178","msg":"ClusterWorkloadLock is inactive cwl {{ClusterWorkloadLock lcm.mirantis.com/v1alpha1} {openstack-osh-dev 7de2b86f-d247-4cee-be8d-dcbcf5e1e11b 3382535 1 2024-08-30 13:50:54 +0000 UTC <nil> <nil> map[] map[] [] [] [{pykube-ng Update lcm.mirantis.com/v1alpha1 2024-08-30 13:50:54 +0000 UTC FieldsV1 {\"f:spec\":{\".\":{},\"f:controllerName\":{}}} } {pykube-ng Update lcm.mirantis.com/v1alpha1 2024-09-02 10:47:29 +0000 UTC FieldsV1 {\"f:status\":{\".\":{},\"f:release\":{},\"f:state\":{}}} status}]} {openstack} {inactive 17.2.4+24.2.2}}","ns":"child-ns-tf","name":"child-cl"}{"level":"info","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/maintenance.go:173","msg":"ClusterWorkloadLock is still active cwl {{ClusterWorkloadLock lcm.mirantis.com/v1alpha1} {tf-openstack-tf 582fc558-c343-4e96-a445-a2d1818dcdb2 3382495 1 2024-08-30 13:50:33 +0000 UTC <nil> <nil> map[] map[] [] [] [{maintenance-ctl Update lcm.mirantis.com/v1alpha1 2024-08-30 13:50:33 +0000 UTC FieldsV1 {\"f:spec\":{\".\":{},\"f:controllerName\":{}}} } {maintenance-ctl Update lcm.mirantis.com/v1alpha1 2024-09-02 10:47:25 +0000 UTC FieldsV1 {\"f:status\":{\".\":{},\"f:errorMessage\":{},\"f:release\":{},\"f:state\":{}}} status}]} {tungstenfabric} {active cluster is not in ready state 17.2.4+24.2.2}}","ns":"child-ns-tf","name":"child-cl"}{"level":"error","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/lcmcluster_controller.go:388","msg":"","ns":"child-ns-tf","name":"child-cl","error":"following ClusterWorkloadLocks in cluster child-ns-tf/child-cl are still active - tf-openstack-tf: InProgress not all ClusterWorkloadLocks are inactive 
yet","stacktrace":"sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster.(*ReconcileLCMCluster).updateCluster\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster/lcmcluster_controller.go:388\nsigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster.(*ReconcileLCMCluster).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster/lcmcluster_controller.go:223\nsigs.k8s.io/cluster-api-provider-openstack/pkg/service.(*reconcilePanicCatcher).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/service/reconcile.go:98\nsigs.k8s.io/cluster-api-provider-openstack/pkg/service.(*reconcileContextEnricher).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/service/reconcile.go:78\nsigs.k8s.io/cluster-api-provider-openstack/pkg/service.(*reconcileMetrics).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/service/reconcile.go:136\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.3/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.3/pkg/internal/controller/controller.go:31:
To work around the issue, set the actual version of
Tungsten Fabric Operator in the TFOperator custom resource:
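A minimal sketch, assuming the TFOperator object resides in the tf namespace; the resource name and the exact field that stores the operator version depend on your TFOperator API version, so treat them as placeholders and align the value with the version of the running Tungsten Fabric Operator:
kubectl -n tf get tfoperators
kubectl -n tf edit tfoperator <name>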
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding labels to the control plane machines
or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the
project creation.
To improve user update experience and make the update path more flexible,
MOSK is introducing a new scheme of updating between
cluster releases. For the details and possible update paths, refer to
24.1.5 update notes: Cluster update scheme.
You can update to 24.2.1 either from 24.1.7 (major update) or 24.2 (patch
update).
The update from the 24.1.7 version is a major update between the series.
Therefore, the expected impact and maintenance window for the major update
to the 24.2 series apply. For details, refer to the 24.2 update notes.
Expected impact when updating within the 24.2 series¶
The following table provides details on the impact of a MOSK
cluster update to a patch release within the 24.2 series.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.2.2 patch.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.27.4: Security notes.
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
On clusters running Tungsten Fabric with API v2, after updating from MOSK 24.2
to 24.2.1, subsequent cluster maintenance requests may get stuck. The root cause
of the issue is a version mismatch within the internal structures of
the Tungsten Fabric Operator.
Output similar to the one below indicates that the Tungsten Fabric
ClusterWorkloadLock remains in the active state indefinitely preventing
further LCM operations with other components:
apiVersion: lcm.mirantis.com/v1alpha1
kind: ClusterWorkloadLock
metadata:
  creationTimestamp: "2024-08-30T13:50:33Z"
  generation: 1
  name: tf-openstack-tf
  resourceVersion: "4414649"
  uid: 582fc558-c343-4e96-a445-a2d1818dcdb2
spec:
  controllerName: tungstenfabric
status:
  errorMessage: cluster is not in ready state
  release: 17.2.4+24.2.2
  state: active
Additionally, the LCM controller logs may contain errors similar to:
{"level":"info","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/maintenance.go:178","msg":"ClusterWorkloadLock is inactive cwl {{ClusterWorkloadLock lcm.mirantis.com/v1alpha1} {ceph-clusterworkloadlock a45eca91-cd7b-4d68-9a8e-4d656b4308af 3383288 1 2024-08-30 13:15:14 +0000 UTC <nil> <nil> map[] map[miraceph-ready:true] [{v1 Namespace ceph-lcm-mirantis 43853f67-9058-44ed-8287-f650dbeac5d7 <nil> <nil>}][] [{ceph-controller Update lcm.mirantis.com/v1alpha1 2024-08-30 13:25:53 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:miraceph-ready\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"43853f67-9058-44ed-8287-f650dbeac5d7\\\"}\":{}}},\"f:spec\":{\".\":{},\"f:controllerName\":{}}} } {ceph-controller Update lcm.mirantis.com/v1alpha1 2024-09-02 10:48:27 +0000 UTC FieldsV1 {\"f:status\":{\".\":{},\"f:release\":{},\"f:state\":{}}} status}]} {ceph} {inactive 17.2.4+24.2.2}}","ns":"child-ns-tf","name":"child-cl"}{"level":"info","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/maintenance.go:178","msg":"ClusterWorkloadLock is inactive cwl {{ClusterWorkloadLock lcm.mirantis.com/v1alpha1} {openstack-osh-dev 7de2b86f-d247-4cee-be8d-dcbcf5e1e11b 3382535 1 2024-08-30 13:50:54 +0000 UTC <nil> <nil> map[] map[] [] [] [{pykube-ng Update lcm.mirantis.com/v1alpha1 2024-08-30 13:50:54 +0000 UTC FieldsV1 {\"f:spec\":{\".\":{},\"f:controllerName\":{}}} } {pykube-ng Update lcm.mirantis.com/v1alpha1 2024-09-02 10:47:29 +0000 UTC FieldsV1 {\"f:status\":{\".\":{},\"f:release\":{},\"f:state\":{}}} status}]} {openstack} {inactive 17.2.4+24.2.2}}","ns":"child-ns-tf","name":"child-cl"}{"level":"info","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/maintenance.go:173","msg":"ClusterWorkloadLock is still active cwl {{ClusterWorkloadLock lcm.mirantis.com/v1alpha1} {tf-openstack-tf 582fc558-c343-4e96-a445-a2d1818dcdb2 3382495 1 2024-08-30 13:50:33 +0000 UTC <nil> <nil> map[] map[] [] [] [{maintenance-ctl Update lcm.mirantis.com/v1alpha1 2024-08-30 13:50:33 +0000 UTC FieldsV1 {\"f:spec\":{\".\":{},\"f:controllerName\":{}}} } {maintenance-ctl Update lcm.mirantis.com/v1alpha1 2024-09-02 10:47:25 +0000 UTC FieldsV1 {\"f:status\":{\".\":{},\"f:errorMessage\":{},\"f:release\":{},\"f:state\":{}}} status}]} {tungstenfabric} {active cluster is not in ready state 17.2.4+24.2.2}}","ns":"child-ns-tf","name":"child-cl"}{"level":"error","ts":"2024-09-02T16:22:16Z","logger":"entrypoint.lcmcluster-controller.req:5520","caller":"lcmcluster/lcmcluster_controller.go:388","msg":"","ns":"child-ns-tf","name":"child-cl","error":"following ClusterWorkloadLocks in cluster child-ns-tf/child-cl are still active - tf-openstack-tf: InProgress not all ClusterWorkloadLocks are inactive 
yet","stacktrace":"sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster.(*ReconcileLCMCluster).updateCluster\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster/lcmcluster_controller.go:388\nsigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster.(*ReconcileLCMCluster).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/controller/lcmcluster/lcmcluster_controller.go:223\nsigs.k8s.io/cluster-api-provider-openstack/pkg/service.(*reconcilePanicCatcher).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/service/reconcile.go:98\nsigs.k8s.io/cluster-api-provider-openstack/pkg/service.(*reconcileContextEnricher).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/service/reconcile.go:78\nsigs.k8s.io/cluster-api-provider-openstack/pkg/service.(*reconcileMetrics).Reconcile\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/service/reconcile.go:136\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.3/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.3/pkg/internal/controller/controller.go:31:
To work around the issue, set the actual version of
Tungsten Fabric Operator in the TFOperator custom resource:
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
Now, all other services in the Tungsten Fabric namespace should be in
the Active state.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding labels to the control plane machines
or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the
project creation.
To improve user update experience and make the update path more flexible,
MOSK is introducing a new scheme of updating between
cluster releases. For the details and possible update paths, refer to
24.1.5 update notes: Cluster update scheme.
Expected impact when updating within the 24.2 series¶
The following table provides details on the impact of a MOSK
cluster update to a patch release within the 24.2 series.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.2.3 patch, if any.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.28.1: Security notes.
The following issues have been addressed in the MOSK
24.2.3 release:
[46220][Tungsten Fabric] Resolved the issue that caused
subsequent cluster maintenance requests to get stuck on clusters running
Tungsten Fabric with API v2, after updating from MOSK 24.2 to 24.2.1.
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The designate-zone-setup Kubernetes job in the openstack namespace
fails during update to MOSK 24.3 with the following error present in the
logs of the job pod:
The issue occurs when the DNS service (OpenStack Designate) has any TLDs
created, but test is not among them. Since DNS service monitoring
was added to MOSK 24.3, it attempts to create a test zone test-zone.test
in the Designate service, which fails if the test TLD is missing.
To work around the issue, verify that there are created TLDs present
in the DNS service:
openstack tld list -f value -c name
If there are TLDs present and test is not one of them, create it:
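For example, with admin credentials sourced, the missing TLD can be created as follows:
openstack tld create --name test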
Warning
Do not create the test TLD if no TLDs were present
in the DNS service initially. In this case, the issue is caused by
a different factor, and creating the test TLD when none existed
before may disrupt users of both the DNS and Networking services.
StackLight¶[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce library and the MOSK implementation of session
refresh in sf-notifier results in uncontrolled growth of new logins and lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly in the
Deployment object. This change is not persistent because it will be
reverted or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
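One way to apply the change, assuming the Deployment and its container are both named sf-notifier in the stacklight namespace and <registry> is the registry already used by the Deployment:
kubectl -n stacklight set image deployment/sf-notifier sf-notifier=<registry>/sf-notifier:v0.4-20240828023015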
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
Container Cloud web UI¶[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding labels to the control plane machines
or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the
project creation.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.2.4 patch, if any.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.28.2: Security notes.
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The designate-zone-setup Kubernetes job in the openstack namespace
fails during update to MOSK 24.3 with the following error present in the
logs of the job pod:
The issue occurs when the DNS service (OpenStack Designate) has any TLDs
created, but test is not among them. Since DNS service monitoring
was added to MOSK 24.3, it attempts to create a test zone test-zone.test
in the Designate service, which fails if the test TLD is missing.
To work around the issue, verify that there are created TLDs present
in the DNS service:
openstack tld list -f value -c name
If there are TLDs present and test is not one of them, create it:
Warning
Do not create the test TLD if no TLDs were present
in the DNS service initially. In this case, the issue is caused by
a different factor, and creating the test TLD when none existed
before may disrupt users of both the DNS and Networking services.
StackLight¶[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce library and the MOSK implementation of session
refresh in sf-notifier results in uncontrolled growth of new logins and lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly in the
Deployment object. This change is not persistent because it will be
reverted or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image is affected, it has to be replaced. Otherwise, your cluster
is not affected.
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
Container Cloud web UI¶[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
because the web UI does not allow adding labels to the control plane machines
or changing dedicatedControlPlane:false.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the
project creation.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue
50566
that may affect the maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.2.5 patch, if any.
The table below contains the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.28.3: Security notes.
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
Sometimes, after changing the OpenStackDeployment custom resource,
it does not transition to the APPLYING state as expected.
To work around the issue, restart the rockoon pod in the osh-system
namespace.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
Update known issues¶[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster.
To work around the issue, restart the RabbitMQ pods in the Tungsten
Fabric cluster.
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten
Fabric namespace by deleting the affected pods sequentially to establish
the connection between them:
The designate-zone-setup Kubernetes job in the openstack namespace
fails during update to MOSK 24.3 with the following error present in the
logs of the job pod:
The issue occurs when the DNS service (OpenStack Designate) has any TLDs
created, but test is not among them. Since DNS service monitoring
was added to MOSK 24.3, it attempts to create a test zone test-zone.test
in the Designate service, which fails if the test TLD is missing.
To work around the issue, verify that there are created TLDs present
in the DNS service:
openstack tld list -f value -c name
If there are TLDs present and test is not one of them, create it:
Warning
Do not create the test TLD if no TLDs were present
in the DNS service initially. In this case, the issue is caused by
a different factor, and creating the test TLD when none existed
before may disrupt users of both the DNS and Networking services.
StackLight¶[51524] sf-notifier creates a large number of relogins to Salesforce¶
The incompatibility between the newly implemented session refresh in the
upstream simple-salesforce library and the MOSK implementation of session
refresh in sf-notifier results in uncontrolled growth of new logins and lack
of session reuse. The issue applies to both MOSK and management clusters.
Workaround:
The workaround is to change the sf-notifier image tag directly in the
Deployment object. This change is not persistent because it will be
reverted or overridden by:
Container Cloud version update (for management clusters)
Cluster release version update (for MOSK cluster)
Any sf-notifier-related operation (for all clusters):
Disable and enable
Credentials change
IDs change
Any configuration change for resources, node selector, tolerations, and
log level
Once applied, this workaround must be re-applied whenever one of the
above operations is performed in the cluster.
Compare the sf-notifier image tag with the list of affected tags.
If the image tag is in the list, replace it. Otherwise, your cluster
is not affected.
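For example, assuming the sf-notifier Deployment runs in the stacklight
namespace and sf-notifier is its only container, you can print the currently
used image as follows:
kubectl -n stacklight get deployment sf-notifier -o jsonpath='{.spec.template.spec.containers[0].image}'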
In the resulting string, replace only the tag of the affected image with
the desired v0.4-20240828023015 tag. Keep the registry the same as
in the original Deployment object.
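A sketch of the replacement, assuming the container inside the Deployment is
also named sf-notifier; keep the original registry and repository path and
change only the tag:
kubectl -n stacklight set image deployment/sf-notifier sf-notifier=<original-image-repository>:v0.4-20240828023015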
Wait until the pod with the updated image is created, and check the logs.
Verify that there are no errors in the logs:
kubectl logs pod/<sf-notifier pod> -n stacklight
As this change is not persistent and can be reverted by the cluster update
operation or any operation related to sf-notifier, periodically check all
clusters and if the change has been reverted, re-apply the workaround.
Optionally, you can add a custom alert that will monitor the current tag of
the sf-notifier image and will fire the alert if the tag is present in
the list of affected tags. For the custom alert configuration details,
refer to alert-configuration.
Example of a custom alert to monitor the current tag of the sf-notifier
image:
Container Cloud web UI¶[50181] Failure to deploy a compact cluster using the Container Cloud web UI¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines along with
the inability to change dedicatedControlPlane:false using the web UI.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project through the Container Cloud web UI¶
A newly created project does not display all available tabs and shows
various access denied errors during the first five minutes after
creation.
To work around the issue, refresh the browser five minutes after the
project creation.
~1% of read operations on cloud API resources may fail
~8% of create and update operations on cloud API resources may fail
Open vSwitch networking - interruption of North-South connectivity,
depending on the type of virtual routers used by a workload:
Distributed (DVR) routers - no interruption
Non-distributed routers, High Availability (HA) mode - interruption up
to 1 minute, usually less than 5 seconds
Non-distributed routers, non-HA mode - interruption up to 10 minutes
Tungsten Fabric networking - no impact
Ceph
~1% of read operations on object storage API may fail
IO performance degradation for Ceph-backed virtual storage devices.
Pay special attention to the known issue 50566 that may affect the
maintenance window.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The primary distinction between major and patch product versions lies in
the fact that major release versions introduce new functionalities,
whereas patch release versions predominantly offer minor product
enhancements, mostly CVE resolutions for your clusters.
Depending on your deployment needs, you can either update only between
major releases or apply patch updates between major releases. Choosing
the latter option, which includes patch updates, ensures you receive
security fixes as soon as they become available. However, be prepared
to update your cluster frequently, approximately once every three weeks.
Refer to the following documentation for the details about the release
content, update schemes, and so on:
Pre-update inspection of pinned product artifacts in a Cluster object¶
To ensure that Container Cloud clusters remain consistently updated with the
latest security fixes and product improvements, the Admission Controller
has been enhanced. Now, it actively prevents the utilization of pinned
custom artifacts for Container Cloud components. Specifically, it blocks
a management or managed cluster release update, or any cluster configuration
update, for example, adding public keys or proxy, if a Cluster object
contains any custom Container Cloud artifacts with global or image-related
values overwritten in the helm-releases section, until these values are
removed.
Normally, the Container Cloud clusters do not contain pinned artifacts,
which eliminates the need for any pre-update actions in most deployments.
However, if the update of your cluster is blocked with the
invalid HelmReleases configuration error, refer to
Update notes: Pre-update actions for details.
Note
In rare cases, if the image-related or global values should be
changed, you can use the ClusterRelease or KaaSRelease objects
instead. But make sure to update these values manually after every major
and patch update.
Note
The pre-update inspection applies only to images delivered by
Container Cloud that are overwritten. Any custom images unrelated to the
product components are not verified and do not block cluster update.
Added full support for OpenStack Antelope with Open vSwitch and Tungsten Fabric
21.4 networking backends.
Starting from 24.1, MOSK deploys all new clouds with
OpenStack Antelope by default. To upgrade an existing cloud from OpenStack Yoga
to Antelope, follow the Upgrade OpenStack procedure.
Highlights from upstream OpenStack supported by
MOSK deployed on Antelope
Designate:
Ability to share Designate zones across multiple projects. This not only
allows two or more projects to manage recordsets in the zone but also
enables “Classless IN-ADDR.ARPA delegation” (RFC 2317) in Designate.
“Classless IN-ADDR.ARPA delegation” permits IP address DNS PTR record
assignment in smaller blocks without creating a DNS zone for each address.
Manila:
Feature parity between the native client and OSC.
Capability for users to specify metadata when creating their share
snapshots. The behavior should be similar to Manila shares, allowing
users to query snapshots filtering them by metadata, and update or delete
the metadata of the given resources.
Neutron:
Capability for managing network traffic based on packet rate by
implementing the QoS (Quality of Service) rule type “packet per second”
(pps).
Nova:
Improved behavior for Windows guests by adding new Hyper-V enlightenments on
all libvirt guests by default.
Ability to unshelve an instance to a specific host (admin only).
With microversion 2.92, the capability to only import a public key and not
generate a keypair. Also, the capability to use an extended name pattern.
Octavia:
Support for notifications about major events of the life cycle of a load
balancer. Only loadbalancer.[create|update|delete].end events are
emitted.
Implemented the capability to enable SPICE remote console through the
OpenStackDeployment custom resource as a method to interact with
OpenStack virtual machines through the CLI and desktop client as well
as MOSK Dashboard (OpenStack Horizon).
The usage of the SPICE remote console is an alternative to using the
noVNC-based VNC remote console.
Implemented the capability to configure and run Windows guests on OpenStack,
which allows for optimization of cloud infrastructure for diverse workloads.
Introduced support for the Virtual Graphics Processing Unit (vGPU) feature
that allows for leveraging the power of virtualized GPU resources to enhance
performance and scalability of cloud deployments.
Introduced the technical preview support for the API v2 for the Tungsten
Fabric Operator. This API version aligns with the OpenStack Controller
API and provides a better interface for advanced configurations.
In MOSK 24.1, the API v2 is available only for the
greenfield product deployments with Tungsten Fabric. The Tungsten Fabric
configuration documentation provides configuration examples for both
API v1alpha1 and API v2.
Removed from support Tungsten Fabric analytics services, which were primarily
designed for collecting various metrics from the Tungsten Fabric services.
Despite the initial implementation, user demand for this feature has been
minimal. As a result, Tungsten Fabric analytics services are no longer supported
in the product.
All greenfield deployments starting from MOSK 24.1
do not include Tungsten Fabric analytics services and use StackLight capabilities
instead by default. The existing deployments updated to 24.1 and newer versions
still include Tungsten Fabric analytics services as well as the ability to
disable them.
Removal of the StackLight telegraf-openstack plugin¶
Removed StackLight telegraf-openstack plugin and replaced it with
osdpl-exporter.
All valuable Telegraf metrics used by StackLight components have been
reimplemented in osdpl-exporter, and all dependent StackLight alerts
and dashboards now use the new metrics.
Implemented more restrictive network policies for Kubernetes pods running
OpenStack services.
As part of the enhancement, added NetworkPolicy objects for all types of
Ceph daemons. These policies allow only specified ports to be used by the
corresponding Ceph daemon pods.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[39768] OpenStack Controller exporter fails to start¶
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
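A minimal sketch of such a dummy change, applied as a label update that makes
MetalLB re-evaluate the Service object (the namespace, service name, and label
are placeholders):
kubectl -n <namespace> label service <SERVICE-NAME> metallb-workaround=reapplied --overwrite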
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
This section lists the Tungsten Fabric (TF) known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
24.1. For TF limitations, see Tungsten Fabric known limitations.
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
The following issues have been addressed in the MOSK
24.1 release:
[OpenStack][Antelope][37678] Resolved the issue that
prevented instance live-migration due to CPU incompatibility.
[OpenStack][38629] Optimized resource allocation to enable
designate-api to scale up its operation.
[OpenStack][38792] Resolved the issue that prevented
MOSK from creating instances and volumes from
images stored in Pure Storage.
[OpenStack][39069] Resolved the issue that caused the
logging of false alerts about 401 responses from OpenStack endpoints.
[StackLight][36211] Resolved the issue that caused the
deprecated dashboards NGINX Ingress controller and
Ceph Nodes to be displayed in Grafana. These dashboards are
now removed. Therefore, Mirantis recommends switching to the following
dashboards:
OpenStack Ingress controller instead of
NGINX Ingress controller
For Ceph:
Ceph Cluster dashboard for Ceph stats
System dashboard for resource utilization, which includes
filtering by Ceph node labels, such as ceph_role_osd,
ceph_role_mon, and ceph_role_mgr
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 24.1.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
The host operating system needs to be rebooted for the kernel update
to be applied. Configure live migration of workloads to avoid the impact on the instances running
on a host.
To properly plan the update maintenance window, use the following
documentation:
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
Pre-update actions¶Unblock cluster update by removing any pinned product artifacts¶
If any pinned product artifacts are present in the Cluster object of a
management or managed cluster, the update will be blocked by the Admission
Controller with the invalid HelmReleases configuration error until such
artifacts are removed. The update process does not start and any changes in
the Cluster object are blocked by the Admission Controller except the
removal of fields with pinned product artifacts.
Therefore, verify that the following sections of the Cluster objects
do not contain any image-related (tag, name, pullPolicy,
repository) and global values inside Helm releases:
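A sketch of the inspection, assuming the Helm releases of the Cluster object
are defined under spec:providerSpec:value:helmReleases (the exact path may
differ depending on your provider and release):
kubectl -n <project-namespace> get cluster <cluster-name> -o yaml | grep -A 20 helmReleases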
The custom pinned product artifacts are inspected and blocked by the
Admission Controller to ensure that Container Cloud clusters remain
consistently updated with the latest security fixes and product improvements.
Note
The pre-update inspection applies only to images delivered by
Container Cloud that are overwritten. Any custom images unrelated to the
product components are not verified and do not block cluster update.
Post-update actions¶Upgrade OpenStack to Antelope¶
With 24.1, MOSK is rolling out OpenStack Antelope support
for both Open vSwitch and Tungsten Fabric-based deployments.
Mirantis encourages you to upgrade to Antelope to start benefitting from the
enhanced functionality and new features of this OpenStack release.
MOSK allows for direct upgrade from Yoga to
Antelope, without the need to upgrade to the intermediate Zed release.
To upgrade the cloud, complete the Upgrade OpenStack procedure.
Important
There are several known issues affecting MOSK clusters running
OpenStack Antelope that can disrupt the network connectivity of the cloud
workloads.
If your cluster is still running OpenStack Yoga, update to the MOSK 24.2.1
patch release first and only then upgrade to OpenStack Antelope. If you
have not been applying patch releases previously and would prefer to switch
back to major releases-only mode, you will be able to do this when MOSK 24.3
is released.
If you have updated your cluster to OpenStack Antelope, apply the
workarounds described in Release notes: OpenStack known issues for the following issues:
[45879] [Antelope] Incorrect packet handling between instance and
its gateway
[44813] Traffic disruption observed on trunk ports
If your cluster runs Tungsten Fabric analytics services and you want to obtain
a more lightweight setup, you can disable these services through the custom
resource of the Tungsten Fabric Operator. For the details, refer to the
Tungsten Fabric analytics services deprecation notice.
In total, since the MOSK 23.3 major release, 327 Common Vulnerabilities and
Exposures (CVE) have been fixed in 24.1:
15 of critical and 312 of high severity.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 23.3.4. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.26.0: Security notes.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.1 patch.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
Introduced the ability to update Ubuntu packages including kernel minor
version update, when available in a product release, to address CVE issues
on a host operating system.
On management clusters, the update of Ubuntu mirror along with the update
of minor kernel version occurs automatically with cordon-drain and reboot
of machines.
On MOSK clusters, the update of Ubuntu mirror along with the update of
minor kernel version applies during a manual cluster update without automatic
cordon-drain and reboot of machines. After a managed cluster update, all
cluster machines have the reboot is required notification.
The kernel update is not obligatory on MOSK clusters.
Though, if you prefer obtaining the latest CVE fixes for Ubuntu, update the
kernel by manually rebooting machines during a convenient maintenance window
using GracefulRebootRequest.
In MOSK 24.1.1, the kernel version has been updated to
5.15.0-97-generic.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.26.1: Security notes.
The following issues have been addressed in the MOSK
24.1.1 release:
[40036] Resolved the issue causing nodes to remain in the Kubernetes
cluster when the corresponding machine is marked as disabled during cluster
update.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[39768] OpenStack Controller exporter fails to start¶
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.2 patch.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 24.1.1. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.26.2: Security notes.
The following issues have been addressed in the MOSK
24.1.2 release:
[39663] Resolved the issue in OpenStack notifications for the Keystone
component where the "event_type":"identity.user.deleted" field lacked
the ID of the request initiator (req_initiator) in the CADF payload.
[39768] Resolved the issue that caused the OpenStack controller exporter
to fail to initialize within the default timeout on large (500+ compute
nodes) clusters.
[40712] Resolved the issue that caused nova-manage image_property
to show traces.
[40740] Resolved the issue that caused the OpenStack sos report logs
collection failure.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[41810] Cluster update is stuck due to the OpenStack Controller flooding¶
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.3 patch.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 24.1.2. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.26.3: Security notes.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.4 patch.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.26.4: Security notes.
The following issues have been addressed in the MOSK
24.1.4 release:
[40897][Tungsten Fabric]
Resolved the issue caused by the absence of a check for error 500
while checking the Tungsten Fabric services.
[41613][Tungsten Fabric]
Resolved the issue caused by inconsistencies in the Database
Management Tool (the db_manage.py script) that checks, heals, and
cleans up inconsistent database entries.
[41784][OpenStack]
Resolved the issue caused by the missing dependency between
the FIP association and the addition of a router interface to the subnet with a server.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The MOSK 24.1.4 patch provides the following updates:
Support for MKE 3.7.8
Update of minor kernel version from 5.15.0-102-generic to
5.15.0-105-generic
Security fixes for CVEs in images
Resolved product issues
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.5 patch:
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.26.5: Security notes.
The following issues have been addressed in the MOSK
24.1.5 release:
[42375][OpenStack] Resolved the issue with the OpenStack Controller
releasing more NodeWorkloadLock objects than allowed during creation of
concurrent NodeMaintenanceRequest objects.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
In rare cases, when ceph-controller cannot confirm the existence of
MOSK pools, instead of denying action and raising errors,
it proceeds to recreate the Cinder Ceph client. Such behavior may
potentially cause issues with OpenStack workloads.
Workaround:
In spec.cephClusterSpec of the KaaSCephCluster custom resource,
remove the external section.
Wait for the Not all mgrs are running: 1/2 message to disappear from the
KaaSCephCluster status.
Verify that the nova Ceph client that is integrated to
MOSK has the same keyring as in the Ceph cluster.
Keyring verification for the Ceph nova client
Compare the keyring used in the nova-compute and libvirt pods
with the one from the Ceph cluster:
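A sketch of the comparison, assuming the keyring is mounted into the
OpenStack pods at /etc/ceph/ceph.client.nova.keyring (the path and pod name
are assumptions, adjust them to your deployment):
kubectl -n openstack exec -it <libvirt-or-nova-compute-pod> -- cat /etc/ceph/ceph.client.nova.keyring
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get-key client.nova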
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.nova -o /tmp/nova.key
vi /tmp/nova.key
# in the editor, change the "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/nova.key
Verify that the client.nova keyring of the Ceph cluster matches the
one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.cinder -o /tmp/cinder.key
vi /tmp/cinder.key
# in the editor, change the "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/cinder.key
Verify that the client.cinder keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.glance -o /tmp/glance.key
vi /tmp/glance.key
# in the editor, change the "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/glance.key
Verify that the client.glance keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
To improve the user update experience and make the update path more flexible,
MOSK is introducing a new scheme of updating between
cluster releases. More specifically, MOSK intends to
ultimately provide the possibility to update to any newer patch version within
a single series at any point in time. The patch version downgrade is not
supported.
However, in some cases, Mirantis may request an update to a specific
patch version in the series to be able to update to the next major series.
This may be necessary due to the specifics of the technical content already
released or planned for the release.
Note
The management cluster update scheme remains the same.
A management cluster obtains the new product version automatically
after release.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.6 patch:
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.27.1: Security notes.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
After upgrade to OpenStack Antelope, the virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is higher by at least 4 bytes, the cluster
is not affected by the issue.
The cluster contains virtual machines in which the MTU of the guest operating
system network interfaces is larger than the value of
the global_physnet_mtu parameter above minus 50 bytes.
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
In rare cases, when ceph-controller cannot confirm the existence of
MOSK pools, instead of denying action and raising errors,
it proceeds to recreate the Cinder Ceph client. Such behavior may
potentially cause issues with OpenStack workloads.
Workaround:
In spec.cephClusterSpec of the KaaSCephCluster custom resource,
remove the external section.
Wait for the Not all mgrs are running: 1/2 message to disappear from the
KaaSCephCluster status.
Verify that the nova Ceph client that is integrated to
MOSK has the same keyring as in the Ceph cluster.
Keyring verification for the Ceph nova client
Compare the keyring used in the nova-compute and libvirt pods
with the one from the Ceph cluster:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.nova -o /tmp/nova.key
vi /tmp/nova.key
# in the editor, change the "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/nova.key
Verify that the client.nova keyring of the Ceph cluster matches the
one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.cinder -o /tmp/cinder.key
vi /tmp/cinder.key
# in the editor, change the "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/cinder.key
Verify that the client.cinder keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.glance -o /tmp/glance.key
vi /tmp/glance.key
# in the editor, change the "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/glance.key
Verify that the client.glance keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
To improve the user update experience and make the update path more flexible,
MOSK is introducing a new scheme of updating between
cluster releases. More specifically, MOSK intends to
ultimately provide the possibility to update to any newer patch version within
a single series at any point in time. The patch version downgrade is not
supported.
However, in some cases, Mirantis may request an update to a specific
patch version in the series to be able to update to the next major series.
This may be necessary due to the specifics of the technical content already
released or planned for the release.
Note
The management cluster update scheme remains the same.
A management cluster obtains the new product version automatically
after release.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
The patch release notes contain the description of product enhancements,
the list of updated artifacts and Common Vulnerabilities and Exposures
(CVE) fixes as well as description of the addressed product issues
for the MOSK 24.1.7 patch:
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component compared to the previous
patch release version. The common CVEs are issues addressed across several
images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.27.2: Security notes.
After provisioning the controller node, the etcd pod initiates before the
Kubernetes networking is fully operational. As a result, the pod encounters
difficulties resolving DNS and establishing connections with other members,
ultimately leading to a panic state for the etcd service.
Workaround:
Delete the PVC related to the replaced controller node:
kubectl -n openstack delete pvc <PVC-NAME>
Delete pods related to the crashing etcd service on the replaced controller
node:
kubectl -n openstack delete pods <ETCD-POD-NAME>
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
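For example, adding or updating an arbitrary annotation is usually enough to trigger reconciliation; the namespace and annotation name below are placeholders:
kubectl -n openstack annotate service <SERVICE-NAME> force-reconcile="$(date +%s)" --overwrite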
After upgrading to OpenStack Antelope, clusters with configured trunk ports
experience traffic flow disruptions that block the cluster updates.
To work around the issue, pin the MOSK Networking service
(OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
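A sketch of such pinning through the Helm values of the Neutron service; the image keys under values:images:tags follow the OpenStack-Helm chart layout and are an assumption, and the image reference is a placeholder for the one recommended by Mirantis for your release:
spec:
  services:
    networking:
      neutron:
        values:
          images:
            tags:
              neutron_server: <PINNED-NEUTRON-IMAGE>
              neutron_openvswitch_agent: <PINNED-NEUTRON-IMAGE>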
After the upgrade to OpenStack Antelope, virtual machines experience
connectivity disruptions when sending data over the virtual networks.
Network packets with full MTU are dropped.
The issue affects the MOSK clusters with Open vSwitch as the networking
backend and with the following specific MTU settings:
The MTU configured on the tunnel interface of compute nodes is equal
to the value of the
spec:services:networking:neutron:values:conf:neutron:DEFAULT:global_physnet_mtu
parameter of the OpenStackDeployment custom resource (if not specified,
default is 1500 bytes).
If the MTU of the tunnel interface is at least 4 bytes higher, the cluster
is not affected by the issue.
The cluster contains virtual machines whose guest operating system network
interfaces have an MTU larger than the value of the global_physnet_mtu
parameter above minus 50 bytes (for example, larger than 1450 bytes with the
default of 1500 bytes).
To work around the issue, pin the MOSK Networking
service (OpenStack Neutron) container image by adding the following content
to the OpenStackDeployment custom resource:
Remove the pinning after updating to MOSK 24.2.1 or
later patch or major release.
Tungsten Fabric¶
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
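For example, for the first replica of the config cluster, the check may look as follows; the pod and container names are assumptions. Nodes that have not joined the ring appear in the DN (Down/Normal) state or are missing from the output:
kubectl -n tf exec -it tf-cassandra-config-dc1-rack1-0 -c cassandra -- nodetool status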
[40032] tf-rabbitmq fails to start after rolling reboot¶
Occasionally, RabbitMQ instances in tf-rabbitmq pods fail to enable
the tracking_records_in_ets during the initialization process.
To work around the problem, restart the affected pods manually.
[42896] Cassandra cluster contains extra node
with outdated IP after replacement of TF control node¶
After replacing a failed Tungsten Fabric controller node as described in
Replace a failed TF controller node, the first restart of the Cassandra
pod on this node may cause an issue if the Cassandra node with the outdated
IP address has not been removed from the cluster. Subsequent Cassandra pod
restarts should not trigger this problem.
To verify if your Cassandra cluster is affected, run the
nodetool status command specifying the config or analytics cluster
and the replica number:
An extra node will appear in the cluster with an outdated IP address
(the IP of the terminated Cassandra pod) in the Down state.
To work around the issue, after replacing the Tungsten Fabric
controller node, delete the Cassandra pod on the replaced node and remove
the outdated node from the Cassandra cluster using nodetool:
In rare cases, when ceph-controller cannot confirm the existence of
MOSK pools, instead of denying action and raising errors,
it proceeds to recreate the Cinder Ceph client. Such behavior may
cause issues with OpenStack workloads.
Workaround:
In spec.cephClusterSpec of the KaaSCephCluster custom resource,
remove the external section.
Wait for the Not all mgrs are running: 1/2 message to disappear from the
KaaSCephCluster status.
Verify that the nova Ceph client that is integrated to
MOSK has the same keyring as in the Ceph cluster.
Keyring verification for the Ceph nova client
Compare the keyring used in the nova-compute and libvirt pods
with the one from the Ceph cluster:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.nova -o /tmp/nova.key
vi /tmp/nova.key
# in the editor, change "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/nova.key
Verify that the client.nova keyring of the Ceph cluster matches the
one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.cinder -o /tmp/cinder.key
vi /tmp/cinder.key
# in the editor, change "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/cinder.key
Verify that the client.cinder keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.glance -o /tmp/glance.key
vi /tmp/glance.key
# in the editor, change "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/glance.key
Verify that the client.glance keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
To improve user update experience and make the update path more flexible,
MOSK is introducing a new scheme of updating between
cluster releases. For the details and possible update paths, refer to
24.1.5 update notes: Cluster update scheme.
You can bypass updating components of the cloud data plane to avoid
the network downtime during Update to a patch version. By using
this technique, you accept the risk that some security fixes may
not be applied.
Provided the technical preview support for OpenStack Antelope with Neutron OVS
and Tungsten Fabric 21.4 for greenfield deployments.
To start experimenting with the new functionality, set openstack_version to
antelope in the OpenStackDeployment custom resource during the cloud
deployment.
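For example, a minimal OpenStackDeployment snippet, assuming the top-level openstack_version field of the spec:
spec:
  openstack_version: antelope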
Implemented the capability to automatically collect logs and generate support
dumps that provide valuable insights for troubleshooting OpenStack-related
problems through the osctl sos report tool present within
the openstack-controller image.
Introduced FIPS-compatible encryption into the API of all
MOSK cloud services, ensuring data security and regulatory
compliance with the FIPS 140-2 standard.
Introduced support for Mirantis Kubernetes Engine (MKE) 3.7 with Kubernetes
1.27. MOSK clusters are updated to the latest supported
MKE version during the cluster update.
Implemented the OpenStack Usage Efficiency dashboard for Grafana
that provides information about requested (allocated) CPU and memory usage
efficiency on a per-project and per-flavor basis.
This dashboard aims to identify flavors that specific projects are not
effectively using, with allocations significantly exceeding actual usage.
Also, it evaluates per-instance underuse for specific projects.
Implemented the following monitoring improvements for Ceph:
Optimized the following Ceph dashboards in Grafana: Ceph Cluster,
Ceph Pools, Ceph OSDs.
Removed the redundant Ceph Nodes Grafana dashboard. You can view
its content using the following dashboards:
Ceph stats through the Ceph Cluster dashboard.
Resource utilization through the System dashboard, which now
includes filtering by Ceph node labels, such as ceph_role_osd,
ceph_role_mon, and ceph_role_mgr.
Removed the rook_cluster alert label.
Removed the redundant CephOSDDown alert.
Renamed the CephNodeDown alert to CephOSDNodeDown.
Implemented an online calculator for quick calculation of the approximate
time required to update your MOSK cluster that uses Open
vSwitch as a networking backend.
Published a procedure that instructs on how to orchestrate Tungsten Fabric
objects through Heat templates to ensure repeatability and consistency
across deployments.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
This section lists the OpenStack known issues with workarounds for the
Mirantis OpenStack for Kubernetes release 23.3.
[31186,34132] Pods get stuck during MariaDB operations¶
Due to the upstream MariaDB issue,
during MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
Create a backup of the /var/lib/mysql directory on the
mariadb-server Pod.
Verify that other replicas are up and ready.
Remove the galera.cache file for the affected mariadb-server Pod.
Remove the affected mariadb-server Pod or wait until it is automatically
restarted.
After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes
and restores the quorum.
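A hedged sketch of these steps; the namespace, pod name mariadb-server-0, container name mariadb, and the label selector are placeholders or assumptions to adjust for your management cluster:
# Back up the data directory of the affected replica outside the Pod
kubectl cp <NAMESPACE>/mariadb-server-0:/var/lib/mysql ./mysql-backup -c mariadb
# Verify that the other replicas are up and ready
kubectl -n <NAMESPACE> get pods -l application=mariadb
# Remove the galera.cache file of the affected replica
kubectl -n <NAMESPACE> exec mariadb-server-0 -c mariadb -- rm /var/lib/mysql/galera.cache
# Delete the affected Pod or wait until it is automatically restarted
kubectl -n <NAMESPACE> delete pod mariadb-server-0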
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
This section lists the Tungsten Fabric (TF) known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
23.3. For TF limitations, see Tungsten Fabric known limitations.
The Cassandra containers of the tf-cassandra-analytics service are
experiencing high CPU and memory utilization. This is happening because
Cassandra Analytics is running out of memory, causing restarts of both
Cassandra and the Tungsten Fabric control plane services.
To work around the issue, use the custom images from the Mirantis public
repository:
Specify the image for config-api in the TFOperator custom resource:
To apply the changes, restart the vRouters manually.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
During update, the ingress pods that have not been updated yet adopt
the configuration meant for the updated pods, causing disruptions.
This occurs because the ingress pods are updated sequentially, which can make
the cloud public API inaccessible for unpredictable periods until all ingress
pods are updated.
To mitigate this issue, Mirantis recommends updating ingress pods in larger
batches, preferably half of all pods at a time. This approach minimizes
downtime for the public API.
Workaround:
Before you start updating to MOSK 23.3:
Increase maxUnavailable for the ingress DaemonSet to 50% of replicas
by patching directly the DaemonSet:
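A minimal sketch, assuming the ingress DaemonSet is named ingress and resides in the openstack namespace; adjust the names to your deployment:
kubectl -n openstack patch daemonset ingress --type merge \
  -p '{"spec":{"updateStrategy":{"rollingUpdate":{"maxUnavailable":"50%"}}}}'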
In certain scenarios, the change may trigger an immediate restart of half
of the ingress pods. Therefore, after patching the ingress, wait until
all ingress pods become ready, taking into account that there might be
occasional failures in public API calls.
To verify that the patch has been applied successfully:
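For example, assuming the same DaemonSet name and namespace as above, the command should return 50%:
kubectl -n openstack get daemonset ingress \
  -o jsonpath='{.spec.updateStrategy.rollingUpdate.maxUnavailable}'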
The following issues have been addressed in the MOSK
23.3 release:
[OpenStack][34897] Resolved the issue that caused the unavailability
of machines from the nodes with DPDK after update of OpenStack from Victoria
to Wallaby.
[OpenStack][34411] Resolved the issue with an incorrect port value
for RabbitMQ after update.
[OpenStack][25124] Improved performance while sending data between
instances affected by the Multiprotocol Label Switching over Generic Routing
Encapsulation (MPLSoGRE) throughput limitation.
[TF][30738] Fixed the issue that caused the tf-vrouter-agent
readiness probe failure (No Configuration for self).
[Update][35111] Resolved the issue that caused the
openstack-operator-ensure-resources job getting stuck
in CrashLoopBackOff.
[WireGuard][35147] Resolved the issue that prevented the WireGuard
interface from having the IPv4 address assigned.
[Bare metal][34342] Resolved the issue that caused a failure of the
etcd pods due to the simultaneous deployment of several pods on a single
node. To ensure that etcd pods are always placed on different nodes,
MOSK now deploys etcd with
the requiredDuringSchedulingIgnoredDuringExecution policy.
[StackLight][35738] Resolved the issue with ucp-node-exporter.
It was unable to bind port 9100, causing the ucp-node-exporter start
failure. This issue was due to a conflict with the StackLight
node-exporter, which was also binding the same port.
The resolution of the issue involves an automatic change of the port for the
StackLight node-exporter from 9100 to 19100. No manual port update is
required.
If your cluster uses a firewall, add an additional firewall rule that
grants the same permissions to port 19100 as those currently assigned
to port 9100 on all cluster nodes.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 23.3.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Update known issues.
In the 23.3 release series, MOSK stops supporting
Ubuntu 18.04. Therefore, upgrade the operating system on your cluster
machines to Ubuntu 20.04 before you update to MOSK
23.3. Otherwise, the Cluster release update for the cluster running
on Ubuntu 18.04 becomes impossible.
It is not mandatory to upgrade all machines at once. You can upgrade them
one by one or in small batches, for example, if the maintenance window is
limited in time.
MOSK supports the OpenStack Victoria version until
September 2023. MOSK 23.2 was the last release version
where OpenStack Victoria packages were updated.
If you have not already upgraded your OpenStack version to Yoga, perform the
upgrade before cluster update.
While updating your cluster, the Instance High Availability service
(OpenStack Masakari) may not work as expected. Therefore, temporarily
disable the service by removing instance-ha from the service list
in the OpenStackDeployment custom resource.
Ensure running one etcd pod per OpenStack controller node¶
During the update, you may encounter the issue that causes a failure of the
etcd pods due to the simultaneous deployment of several pods on a single
node.
Therefore, before starting the update, ensure that each OpenStack controller
node runs only one etcd pod.
In total, since the MOSK 23.2 major release, 466
Common Vulnerabilities and Exposures (CVE) have been fixed in 23.3:
24 of critical and 442 of high severity.
The table below includes the total numbers of addressed unique and common
CVEs by MOSK-specific component since
MOSK 23.2.3. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.25.0: Security notes.
The patch release notes contain the list of updated artifacts
and Common Vulnerabilities and Exposures (CVE) fixes in images
as well as description of the addressed product issues for the
MOSK 23.3.1 patch.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
In total, since the MOSK 23.3 release, 157
Common Vulnerabilities and Exposures (CVE) have been fixed in 23.3.1:
5 of critical and 152 of high severity.
The table below includes the total numbers of addressed unique and common
CVEs by MOSK-specific component since
MOSK 23.2.3. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.25.1: Security notes.
The following issues have been addressed in the MOSK
23.3.1 release:
[37012] Resolved the issue that caused the cluster update failure due
to instances evacuation when they were not supposed to be evacuated.
[37083] Resolved the issue that caused Cloudprober to produce warnings
about a large number of targets.
[37185] Resolved the issue that caused the OpenStack Controller to fail
while applying the Manila Helm charts during the attempt to enable Manila
through the OpenStackDeployment custom resource.
The patch release notes contain the list of updated artifacts
and Common Vulnerabilities and Exposures (CVE) fixes in images
for the MOSK 23.3.2 patch.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 23.3.1. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.25.2: Security notes.
The patch release notes contain the lists of updated artifacts and
addressed product issues, as well as the details on Common
Vulnerabilities and Exposures (CVE) fixes in images for the
MOSK 23.3.3 patch.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 23.3.2. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.25.3: Security notes.
The patch release notes contain the lists of updated artifacts and
addressed product issues, as well as the details on Common
Vulnerabilities and Exposures (CVE) fixes in images for the
MOSK 23.3.4 patch.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The table below includes the total number of addressed unique and common
CVEs by MOSK-specific component since
MOSK 23.3.3. The common CVEs are issues addressed
across several images.
For the detailed list of fixed and present CVEs across the Mirantis
Container Cloud and MOSK products, refer to
Mirantis Security Portal.
Mirantis Container Cloud CVEs
For the number of fixed CVEs in the Mirantis Container Cloud-related
components including kaas core, bare metal, Ceph, and StackLight, refer to
Container Cloud 2.25.4: Security notes.
Implemented the capability to parallelize OpenStack, Ceph, and Tungsten Fabric
node update operations, significantly improving the efficiency of
MOSK deployments. The parallel node update feature applies
to any operation that utilizes the Node Maintenance API, such as cluster
updates or graceful node reboots.
Implemented the OpenStack workload monitoring feature through
the Cloudprober exporter.
After enablement and proper configuration, the exporter allows for
monitoring the availability of instance floating IP addresses
per OpenStack compute node and project, as well as viewing
the probe statistics for individual instance floating IP addresses
through the Openstack Instances Availability dashboard
in Grafana.
Introduced the Technology Preview support for the BGP dynamic routing
extension to the Networking service (OpenStack Neutron) that will be
particularly useful for the MOSK clouds where private
networks managed by cloud users need to be transparently integrated into the
networking of the data center.
Implemented the TLS encryption feature for QEMU and libvirt to secure all data
transports during live migration, including disks not on shared storage.
Tungsten Fabric graceful restart and long-lived graceful restart¶
Available since MOSK 23.2
for Tungsten Fabric 21.4 only. TechPreview
Added support for graceful restart and long-lived graceful restart
allowing for a more efficient and robust routing experience for
Tungsten Fabric. These features enhance the speed at which routing
tables converge, specifically when dealing with BGP router restarts or
failures.
Introduced support for Mirantis Kubernetes Engine (MKE) 3.6 with Kubernetes
1.24. MOSK clusters are updated to the latest supported
MKE version during the cluster update.
Added initial Technology Preview support for custom host names of cluster
machines. When enabled, any machine host name in a particular region matches
the related Machine object name.
Added initial Technology Preview support for the Linux Audit daemon
auditd to monitor activity of cluster processes that allow for
detection of potential malicious activity.
Added a tutorial to help you build your first cloud application and onboard
it to a MOSK cloud. It will guide you through the process of
deploying a simple application using the cloud web UI (OpenStack Horizon).
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE)
provides limited throughput when sending data between VMs, up to 38 Mbps as
per Mirantis tests.
As a workaround, switch the encapsulation type to VXLAN in the
OpenStackDeployment custom resource:
[31186,34132] Pods get stuck during MariaDB operations¶
Due to the upstream MariaDB issue,
during MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
Restart the neutron-ovs-agent on the affected nodes.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
This section lists the Tungsten Fabric (TF) known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
23.2. For TF limitations, see Tungsten Fabric known limitations.
The Cassandra containers of the tf-cassandra-analytics service are
experiencing high CPU and memory utilization. This is happening because
Cassandra Analytics is running out of memory, causing restarts of both
Cassandra and the Tungsten Fabric control plane services.
To work around the issue, use the custom images from the Mirantis public
repository:
Specify the image for config-api in the TFOperator custom resource:
Execution of the TF Heat Tempest test test_template_global_vrouter_config
can result in lost vRouter configuration. This causes the tf-vrouter pod
readiness probe to fail with the following error message:
"Readiness probe failed:vRouter is PRESENT contrail-vrouter-agent: initializing (No Configuration for self)"
As a result, vRouters may have an incomplete routing table making some
services, such as metadata, become unavailable.
Workaround:
Add the tf_heat_tempest_plugin tests with global configuration to the
exclude list in the OpenStackDeployment custom resource:
If you ran test_template_global_vrouter_config and tf-vrouter-agent
pods moved to the error state with the above error, re-create these pods
through deletion:
kubectl -n tf delete pod tf-vrouter-agent-*
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
Due to the upstream Calico
issue, on clusters
with WireGuard enabled, the WireGuard interface on a node may not have
the IPv4 address assigned. This leads to broken inter-Pod communication
between the affected node and other cluster nodes.
The node is affected if the IP address is missing on the WireGuard interface:
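For example, run the following command on the node; Calico typically names the interface wireguard.cali, which is an assumption to verify on your cluster. An affected interface shows no inet (IPv4) address in the output:
ip address show wireguard.cali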
During MOSK update to either 23.2 major release or
any patch release of the 23.2 release series, the
openstack-operator-ensure-resources job may get stuck in
the CrashLoopBackOff state with the following error:
Traceback (most recent call last):
  File "/usr/local/bin/osctl-ensure-shared-resources", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/openstack_controller/cli/ensure_shared_resources.py", line 61, in main
    obj.update()
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 165, in update
    self.patch(self.obj, subresource=subresource)
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 157, in patch
    self.api.raise_for_status(r)
  File "/usr/local/lib/python3.8/dist-packages/pykube/http.py", line 444, in raise_for_status
    raise HTTPError(resp.status_code, payload["message"])
pykube.exceptions.HTTPError: CustomResourceDefinition.apiextensions.k8s.io "redisfailovers.databases.spotahome.com" is invalid: spec.preserveUnknownFields: Invalid value: true: must be false in order to use defaults in the schema
As a workaround, delete the redisfailovers.databases.spotahome.com
CRD from your cluster:
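For example:
kubectl delete crd redisfailovers.databases.spotahome.com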
The following issues have been addressed in the MOSK
23.2 release:
[OpenStack][33006] Fixed the issue that prevented communication
between virtual machines on the same network.
[OpenStack][34208] Prevented the Masakari API pods from constant
restart.
[TF][32723] Fixed the issue that prevented a compiled vRouter
kmod from automatic refreshing with the new kernel.
[TF][32326] Fixed the issue that allowed for unauthorized access
to the Tungsten Fabric API.
[Ceph][30635] Fixed the issue with irrelevant error message
displaying in the osd-prepare Pod during the deployment of Ceph OSDs on
removable devices on AMD nodes. Now, the error message clearly states that
removable devices with hotplug enabled are not supported for deploying
Ceph OSDs.
[Ceph][31630] Fixed the issue that caused the Ceph cluster upgrade
to Pacific to be stuck with Rook connection failure.
[Ceph][31555] Fixed the issue with Ceph finding only 1 out of 2
mgr after update.
[Ceph][23292] Fixed the issue that caused the failure of the Ceph
rook-operator with FIPS kernel.
[Update][27797] Fixed the issue that stopped cluster kubeconfig
from working during the MKE minor version update.
[Update][32311] Fixed the issue with the tf-rabbit-exporter
ReplicaSet blocking the cluster update.
[StackLight][30867] Fixed the Instance Info panel
for RabbitMQ in Grafana.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 23.2.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
The update to MOSK 23.2 does not include any
version-specific impact on the cluster. To start planning a maintenance window,
use the Operations Guide: Update a MOSK cluster standard procedure.
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Cluster update known issues.
Pre-update actions¶
Disable the Instance High Availability service¶
While updating your cluster, the Instance High Availability service
(OpenStack Masakari) may not work as expected. Therefore, temporarily
disable the service by removing instance-ha from the service list
in the OpenStackDeployment custom resource.
In the next release series, MOSK will stop supporting
Ubuntu 18.04. Therefore, Mirantis highly recommends upgrading the operating
system on your cluster machines to Ubuntu 20.04 during the course of the
MOSK 23.2 series by rebooting cluster nodes.
It is not mandatory to reboot all machines at once. You can reboot them
one by one or in small batches, for example, if the maintenance window is
limited in time.
Otherwise, the Cluster release update for the cluster running on Ubuntu 18.04
will become impossible.
MOSK supports the OpenStack Victoria version until
September 2023. MOSK 23.2 is the last release version
where OpenStack Victoria packages are updated.
If you have not already upgraded your OpenStack version to Yoga, Mirantis
highly recommends doing this during the course of the MOSK
23.2 series.
Make the OpenStack notifications available in StackLight¶
After the update, the notifications from OpenStack become unavailable in
StackLight. On an attempt to establish a TCP connection to the RabbitMQ
server, the connection is refused with the following error:
In total, since the MOSK 23.1 major release, 1611
Common Vulnerabilities and Exposures (CVE) have been fixed in 23.2:
65 of critical and 1546 of high severity.
Among them, 689 CVEs that are listed in
Addressed CVEs - detailed have been fixed since the 23.1.4 patch release:
29 of critical and 660 of high severity. The fixes for the remaining CVEs
were released with the patch releases of the MOSK 23.1
series.
The full list of the CVEs present in the current Mirantis OpenStack for
Kubernetes (MOSK) release is available at the Mirantis Security Portal.
The Addressed CVEs - summary table includes the total number of
unique CVEs along with the total number of issues fixed across images.
Duplicate CVEs for packages in
the Addressed CVEs - detailed table can mean that they were
discovered in container images with the same names but different tags,
for example, openstack/barbican for OpenStack Victoria and Yoga
versions.
The patch release notes contain the list of artifacts and Common
Vulnerabilities and Exposures (CVE) fixes for the MOSK
23.2.1 patch released on August 29, 2023.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
This section lists the cluster update known issues with workarounds for the
Mirantis OpenStack for Kubernetes release 23.2.1.
[35111] openstack-operator-ensure-resources job stuck in CrashLoopBackOff¶
During MOSK update to either 23.2 major release or
any patch release of the 23.2 release series, the
openstack-operator-ensure-resources job may get stuck in
the CrashLoopBackOff state with the following error:
Traceback (most recent call last):
  File "/usr/local/bin/osctl-ensure-shared-resources", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/openstack_controller/cli/ensure_shared_resources.py", line 61, in main
    obj.update()
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 165, in update
    self.patch(self.obj, subresource=subresource)
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 157, in patch
    self.api.raise_for_status(r)
  File "/usr/local/lib/python3.8/dist-packages/pykube/http.py", line 444, in raise_for_status
    raise HTTPError(resp.status_code, payload["message"])
pykube.exceptions.HTTPError: CustomResourceDefinition.apiextensions.k8s.io "redisfailovers.databases.spotahome.com" is invalid: spec.preserveUnknownFields: Invalid value: true: must be false in order to use defaults in the schema
As a workaround, delete the redisfailovers.databases.spotahome.com
CRD from your cluster:
The patch release notes contain the list of artifacts and Common
Vulnerabilities and Exposures (CVE) fixes for the MOSK
23.2.2 patch released on September 14, 2023.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
The following issues have been addressed in the MOSK
23.2.2 release:
[34342] Resolved the issue that caused a failure of the etcd pods due
to the simultaneous deployment of several pods on a single node. To ensure
that etcd pods are always placed on different nodes, MOSK
now deploys etcd with the requiredDuringSchedulingIgnoredDuringExecution
policy.
[34276] Resolved the issue that caused the presence of stale namespaces
if the agent responsible for hosting the network was modified while
the agent was offline.
During the update, you may encounter the issue that causes a failure of the
etcd pods due to the simultaneous deployment of several pods on a single
node.
The workaround is to remove the PVC for one etcd pod.
[35111] openstack-operator-ensure-resources job stuck in CrashLoopBackOff¶
During MOSK update to either 23.2 major release or
any patch release of the 23.2 release series, the
openstack-operator-ensure-resources job may get stuck in
the CrashLoopBackOff state with the following error:
Traceback (most recent call last):
  File "/usr/local/bin/osctl-ensure-shared-resources", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/openstack_controller/cli/ensure_shared_resources.py", line 61, in main
    obj.update()
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 165, in update
    self.patch(self.obj, subresource=subresource)
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 157, in patch
    self.api.raise_for_status(r)
  File "/usr/local/lib/python3.8/dist-packages/pykube/http.py", line 444, in raise_for_status
    raise HTTPError(resp.status_code, payload["message"])
pykube.exceptions.HTTPError: CustomResourceDefinition.apiextensions.k8s.io "redisfailovers.databases.spotahome.com" is invalid: spec.preserveUnknownFields: Invalid value: true: must be false in order to use defaults in the schema
As a workaround, delete the redisfailovers.databases.spotahome.com
CRD from your cluster:
The patch release notes contain the list of artifacts and Common
Vulnerabilities and Exposures (CVE) fixes as well as description of the fixed
product issues for the MOSK 23.2.3 patch released on
September 26, 2023.
For the list of enhancements and bug fixes that relate to Mirantis Container
Cloud, refer to the Mirantis Container Cloud Release notes.
During the update, you may encounter the issue that causes a failure of the
etcd pods due to the simultaneous deployment of several pods on a single
node.
The workaround is to remove the PVC for one etcd pod.
[35111] openstack-operator-ensure-resources job stuck in CrashLoopBackOff¶
During MOSK update to either 23.2 major release or
any patch release of the 23.2 release series, the
openstack-operator-ensure-resources job may get stuck in
the CrashLoopBackOff state with the following error:
Traceback (most recent call last):
  File "/usr/local/bin/osctl-ensure-shared-resources", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/openstack_controller/cli/ensure_shared_resources.py", line 61, in main
    obj.update()
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 165, in update
    self.patch(self.obj, subresource=subresource)
  File "/usr/local/lib/python3.8/dist-packages/pykube/objects.py", line 157, in patch
    self.api.raise_for_status(r)
  File "/usr/local/lib/python3.8/dist-packages/pykube/http.py", line 444, in raise_for_status
    raise HTTPError(resp.status_code, payload["message"])
pykube.exceptions.HTTPError: CustomResourceDefinition.apiextensions.k8s.io "redisfailovers.databases.spotahome.com" is invalid: spec.preserveUnknownFields: Invalid value: true: must be false in order to use defaults in the schema
As a workaround, delete the redisfailovers.databases.spotahome.com
CRD from your cluster:
Dynamic configuration of resource oversubscription¶
Introduced a new default way to configure the resource oversubscription
in the cloud that enables the cloud operator to dynamically control the
oversubscription through the Compute service (OpenStack Nova) placement API.
The initial configuration is performed through the OpenStackDeployment
custom resource. By default, the following values are applied:
Starting from 23.1, MOSK deploys all new clouds using
Tungsten Fabric 21.4 by default. The existing OpenStack deployments using
Tungsten Fabric as a networking backend will obtain this new version
automatically during the cluster update to MOSK 23.1.
One of the key highlights of the Tungsten Fabric 21.4 release is the support
for configuring Maximum Transmission Unit for virtual networks. This
capability enables you to set the maximum packet size for your virtual
networks, ensuring that your network traffic is optimized for performance and
efficiency.
Enhanced load balancing as a service for Tungsten Fabric-enabled
MOSK clouds by adding support for Amphora instances
on top of the Tungsten Fabric networks.
Compared to the old implementation, which relied on the
Tungsten Fabric-controlled HAProxy, the new approach offers:
Implemented the new panels in the Grafana dashboards for OpenSearch
and Prometheus that provide details on the storage usage and allow
calculating the possible retention time based on provisioned storage and
average usage.
Added the capability to forward logs to external Elasticsearch and OpenSearch
servers as the fluentd-logs output. This enhancement also expands
existing configuration options for log forwarding to syslog.
Implemented the capability to hide sensitive fields from the
OpenStackDeployment object by adding reference to a secret
to this object using the value_from structure.
Implemented the functionality that enables cloud operators to periodically
rotate credentials of OpenStack admin and service users with minimized
impact on service availability and workload downtime.
Ensured better security for the noVNC client by allowing encryption of data
transfer between the instances and the noVNC proxy server using VeNCrypt
authentication scheme. You can enable this feature by defining
features:nova:console:novnc:tls:enabled in the OpenStackDeployment
custom resource.
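For example, following the parameter path named above, the feature can be enabled in the OpenStackDeployment custom resource as follows:
spec:
  features:
    nova:
      console:
        novnc:
          tls:
            enabled: true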
Technology Preview. Reworked the default MOSK access
policies to restrict the permissions of a project administrator role
exclusively to the scope of their project.
Implemented the capability to reboot several cluster nodes in one go by using
the Graceful reboot mechanism
provided by Mirantis Container Cloud. The mechanism restarts the selected
nodes one by one, honoring the instance migration policies.
Implemented the capability to identify the nodes requiring reboot through
both the Mirantis Container Cloud API and web UI:
API: reboot.required.true in
status:providerStatus of a Machine object
Web UI: the One or more machines require a reboot
notification on the Clusters and Machines pages
Published the tutorial to help you build your first cloud application and
onboard it to a MOSK cloud. The dedicated section
in the User Guide will guide you through the process of deploying and
managing a sample application using automation, and showcase the powerful
capabilities of OpenStack.
Published the instructions on how you can customize the functionality of
MOSK OpenStack services by installing custom system
or Python packages into their container images.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
This section lists the OpenStack known issues with workarounds for the
Mirantis OpenStack for Kubernetes release 23.1.
[25124] MPLSoGRE encapsulation has limited throughput¶
Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE)
provides limited throughput when sending data between VMs, up to 38 Mbps as
per Mirantis tests.
As a workaround, switch the encapsulation type to VXLAN in the
OpenStackDeployment custom resource:
This section lists the Tungsten Fabric (TF) known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
23.1. For TF limitations, see Tungsten Fabric known limitations.
Execution of the TF Heat Tempest test test_template_global_vrouter_config
can result in lost vRouter configuration. This causes the tf-vrouter pod
readiness probe to fail with the following error message:
"Readiness probe failed:vRouter is PRESENT contrail-vrouter-agent: initializing (No Configuration for self)"
As a result, vRouters may have an incomplete routing table making some
services, such as metadata, become unavailable.
Workaround:
Add the tf_heat_tempest_plugin tests with global configuration to the
exclude list in the OpenStackDeployment custom resource:
If you ran test_template_global_vrouter_config and tf-vrouter-agent
pods moved to the error state with the above error, re-create these pods
through deletion:
kubectl -n tf delete pod tf-vrouter-agent-*
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
The vRouter kernel module remains at
/usr/src/vrouter-<TF-VROUTER-IMAGE-VERSION>, even
if it was initially compiled for an older kernel version. This leads
to the reuse of compiled artifacts without recompilation. Consequently,
after upgrading to Mirantis OpenStack for Kubernetes 23.1, an outdated module gets
loaded onto the new kernel. This mismatch results in a failure that
triggers the CrashLoop state for the vRouter on the affected node.
Workaround:
On the affected node, move the old vRouter kernel module to another
directory. For example:
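For example, on the affected node; the backup directory is arbitrary, and the version placeholder must match the directory actually present on the node:
mkdir -p /root/vrouter-kmod-backup
mv /usr/src/vrouter-<TF-VROUTER-IMAGE-VERSION> /root/vrouter-kmod-backup/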
The deployment of Ceph OSDs fails with the following messages in the status
section of the KaaSCephCluster custom resource:
shortClusterInfo:
  messages:
  - Not all osds are deployed
  - Not all osds are in
  - Not all osds are up
To find out if your cluster is affected, verify if the devices on
the AMD hosts you use for the Ceph OSDs deployment are removable.
For example, if the sdb device name is specified in
spec.cephClusterSpec.nodes.storageDevices of the KaaSCephCluster
custom resource for the affected host, run:
# cat /sys/block/sdb/removable
1
The system output shows that the reason for the above messages in status
is the enabled hotplug functionality on the AMD nodes, which marks all drives
as removable. The hotplug functionality is not supported by Ceph in
MOSK.
As a workaround, disable the hotplug functionality in the BIOS settings
for disks that are configured to be used as Ceph OSD data devices.
[31630] Ceph cluster upgrade to Pacific is stuck with Rook connection failure¶
The KaaSCephCluster custom resource contains the following configuration
option in the rookConfig section:
spec:
  cephClusterSpec:
    rookConfig:
      ms_crc_data: "false" # or 'ms crc data: "false"'
As a workaround, remove the ms_crc_data (or ms crc data) configuration
key from the KaaSCephCluster custom resource and wait for the
rook-ceph-mon pods to restart on the MOSK cluster:
kubectl -n rook-ceph get pod -l app=rook-ceph-mon -w
[31555] Ceph can find only 1 out of 2 ‘mgr’ after update to MOSK 23.1¶
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.nova -o /tmp/nova.key
vi /tmp/nova.key
# in the editor, change "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/nova.key
Verify that the client.nova keyring of the Ceph cluster matches the
one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.cinder -o /tmp/cinder.key
vi /tmp/cinder.key
# in the editor, change "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/cinder.key
Verify that the client.cinder keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
If the keyring differs, change the one stored in Ceph cluster with
the key from the OpenStack pods:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph auth get client.glance -o /tmp/glance.key
vi /tmp/glance.key
# in the editor, change "key" value to the key obtained from the OpenStack pods
# then save and exit editing
ceph auth import -i /tmp/glance.key
Verify that the client.glance keyring of the Ceph cluster matches
the one obtained from the OpenStack pods:
During update of a Container Cloud management cluster, if the MKE minor
version is updated from 3.4.x to 3.5.x, access to the cluster using the
existing kubeconfig fails with the You must be logged in to the server
(Unauthorized) error due to OIDC settings being reconfigured.
As a workaround, during the Container Cloud cluster update, use the
admin kubeconfig instead of the existing one. Once the update
completes, you can use the existing cluster kubeconfig again.
On a cluster with Tungsten Fabric enabled, the cluster update is stuck with
the tf-rabbit-exporter deployment having a number of pods in the
Terminating state.
To verify whether your cluster is affected:
kubectl -n tf get pods | grep tf-rabbit-exporter
Example of system response on the affected cluster:
The following issues have been addressed in the MOSK
23.1 release:
[OpenStack][30450] Fixed the issue causing high CPU load of MariaDB.
[OpenStack][29501] Fixed the issue when Cinder periodic database
cleanup resets the state of volumes.
[OpenStack][27168] Fixed the issue that made
openvswitch-openvswitch-vswitchd-default and
neutron-ovs-agent-default pods stuck in the NotReady status
after restart.
[OpenStack][29539] Fixed the issue with missing network traffic for
a trunked port in OpenStack Yoga.
[OpenStack][Yoga][24067] Fixed the issue with inability to set
up a secondary DNS zone in OpenStack Yoga.
Note
The issue still affects OpenStack Victoria.
[TF][10096] Fixed the issue that prevented tf-control from
refreshing IP addresses of Cassandra pods.
[TF][28728] Fixed the issue when
tungstenFabricMonitoring.enabled was not enabled by default during
Tungsten Fabric deployment.
[TF][30449] Fixed the issue that resulted in losing connectivity
after the primary TF Controller node reboot.
[Ceph][28142] Added the ability to specify node affinity for
rook-discover pods through the ceph-operator Helm release.
[Ceph][26820] Fixed the issue when the status section in the
KaaSCephCluster.status custom resource did not reflect issues
during the process of a Ceph cluster deletion.
[StackLight][28372] Fixed the issue causing false-positive liveness
probe failures for fluentd-notifications.
[StackLight][29330] Fixed the issue that prevented tf-rabbitmq
from being monitored.
[Updates][29438] Fixed the issue that caused the cluster update
being stuck during the Tungsten Fabric Operator update.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 23.1.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
As part of the update to MOSK 23.1, Tungsten Fabric will
automatically get updated from version 2011 to version 21.4.
Note
For the compatibility matrix of the most recent MOSK
releases and their major components in conjunction with Container Cloud and
Cluster releases, refer to Release Compatibility Matrix.
The update to MOSK 23.1 does not include any
version-specific impact on the cluster. To start planning a maintenance window,
use the Operations Guide: Update a MOSK cluster standard procedure.
Before updating the cluster, be sure to review the potential issues that
may arise during the process and the recommended solutions to address
them, as outlined in Cluster update known issues.
Pre-update actions¶
Update the baremetal-provider image to 1.37.18¶
If your Container Cloud management cluster has updated to 2.24.1, to avoid
the issue with waiting for the lcm-agent to update
the currentDistribution field during the cluster update to
MOSK 23.1, replace the baremetal-provider image
1.37.15 tag with 1.37.18:
Open the kaasrelease object for editing:
kubectl edit kaasrelease kaas-2-24-1
Replace the 1.37.15 tag with 1.37.18 for the baremetal-provider
image:
Explicitly define the OIDCClaimDelimiter parameter¶
MOSK 23.1 introduces a new default value for the
OIDCClaimDelimiter parameter, which defines the delimiter to use when
setting multi-valued claims in the HTTP headers. See the MOSK 23.1 OpenStack
API Reference
for details.
Previously, the value of the OIDCClaimDelimiter parameter defaulted to
",". This value misaligned with the behavior expected by Keystone.
As a result, when creating federation mappings for Keystone, the cloud operator
was forced to write more complex rules. Therefore, in MOSK
22.4, Mirantis announced the change of the default value for the
OIDCClaimDelimiter parameter.
If your deployment is affected and you have not explicitly defined the
OIDCClaimDelimiter parameter, as Mirantis advised, after update to
MOSK 22.4 or 22.5, now would be a good time to do it.
Otherwise, you may encounter unforeseen consequences after the update to
MOSK 23.1.
Affected deployments
Proceed with the instruction below only if the following conditions are
true:
Keystone is set to use federation through the OpenID Connect protocol,
with Mirantis Container Cloud Keycloak in particular. The following
configuration is present in your OpenStackDeployment custom resource:
The new default value for the OIDCClaimDelimiter parameter
is ";". To find out whether your Keystone mappings will need
adjustment after changing the default value, set the parameter to
";" on your staging environment and verify the rules.
Verify that the KaaSCephCluster custom resource does not contain the
following entries. If they exist, remove them.
In the spec.cephClusterSpec section, the external section.
Caution
If the external section exists in the KaaSCephCluster
spec during the upgrade to MOSK 23.1, it will cause a Ceph
outage that leads to corruption of the Cinder volumes file system and
requires a lot of routine work to fix the affected Cinder volumes
one by one after the Ceph outage is resolved.
Therefore, make sure that the external section is removed from the
KaaSCephCluster spec right before starting cluster upgrade.
In the spec.cephClusterSpec.rookConfig section, the ms_crc_data or
ms crc data configuration key. After you remove the key, wait for
rook-ceph-mon pods to restart on the MOSK
cluster.
Caution
If the ms_crc_data key exists in the rookConfig section
of KaaSCephCluster during the upgrade to MOSK 23.1,
it breaks the connection between the Rook Operator and Ceph Monitors
during the Ceph version upgrade, leading to a stuck upgrade, and requires
that you manually disable the ms_crc_data key for all Ceph Monitors.
Therefore, make sure that the ms_crc_data key is removed from the
KaaSCephCluster spec right before starting cluster upgrade.
To prevent issues during graceful reboot of the OpenStack controller nodes,
temporarily remove Tempest from the OpenStackDeployment object:
spec:
  features:
    services:
    - tempest
Post-update actions¶
Remove sensitive information from cluster configuration¶
The OpenStackDeploymentSecret custom resource has been deprecated in
MOSK 23.1. The fields that store confidential settings
in OpenStackDeploymentSecret and OpenStackDeployment custom resources
need to be migrated to the Kubernetes secrets.
To ensure stability for production workloads, MOSK 23.1
changes the default value of RAM oversubscription on compute nodes to 1.0,
which is no oversubscription. In MOSK 22.5 and earlier,
the effective default value of RAM allocation ratio is 1.1.
This change will be applied only to the compute nodes added to the cloud
after update to MOSK 23.1. The effective RAM
oversubscription value for existing compute nodes will not automatically
change after updating to MOSK 23.1.
Use dynamic configuration for resource oversubscription¶
Since MOSK 23.1, the Compute service (OpenStack Nova)
enables you to control the resource oversubscription dynamically through
the placement API.
However, if your cloud already makes use of custom allocation ratios, the new
functionality will not become immediately available after update. Any compute
node configured with explicit values for the cpu_allocation_ratio,
disk_allocation_ratio, and ram_allocation_ratio configuration options
will continue to enforce those values in the placement service. Therefore, any
changes made through the placement API will be overridden by the values set in
those configuration options in the Compute service. To modify oversubscription,
you should adjust the values of these configuration options in the
OpenStackDeployment custom resource. This procedure should be performed
with caution as modifying these values may result in compute service restarts
and potential disruptions in the instance builds.
To enable the use of the new functionality, Mirantis recommends removing
explicit values for the cpu_allocation_ratio, disk_allocation_ratio,
and ram_allocation_ratio options from the OpenStackDeployment custom
resource. Instead, use the new configuration options as described in
Configuring initial resource oversubscription. Also, keep in mind that
the changes will only impact newly added compute nodes and will not be applied
to the existing ones.
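As an illustration only, the following commands show how oversubscription can
be inspected and adjusted dynamically through the placement API once the
explicit configuration options are removed. They require the osc-placement
CLI plugin; the provider UUID and the ratio value are placeholders:
# List the current inventory of a compute node resource provider
openstack resource provider inventory list <provider-uuid>
# Amend only the VCPU allocation ratio, leaving other inventory fields intact
openstack resource provider inventory set <provider-uuid> --resource VCPU:allocation_ratio=4.0 --amend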
The patch release notes contain the list of artifacts and Common
Vulnerabilities and Exposures (CVE) fixes for the MOSK
23.1.1 patch released on April 20, 2023.
The OpenStack upgrade to Yoga fails due to the delay in the Cinder start.
Workaround:
Follow the openstack-controller logs from the OpenStackDeployment
container. When the controller gets stuck on checking the health of any
OpenStack component, verify the Helm release statuses:
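A possible way to check the statuses, assuming the Helm v3 CLI is available
with access to the MOSK cluster (shown as an illustration, not taken from the
original procedure):
helm list --namespace openstack --all
# Releases stuck in the pending-install or pending-upgrade status point to
# the component that blocks the openstack-controller health check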
The following issues have been addressed in the MOSK
23.1.4 release:
[27031] Fixed the removal of objects marked as Deleted from the
Barbican database during the database cleanup.
[30224] Decreased the default weight for
build_failure_weight_multiplier to 2 to normalize instance spreading
across compute nodes.
[30673] Fixed the issue with duplicate tasks responses from the
ironic-python agent on the ironic-conductor side.
[30888] Adjusted the caching time for PowerDNS to fit Designate
timeouts.
[31021] Fixed the race in openstack-controller that could lead
to setting the default user names and passwords in configuration files
during initial deployment.
[31358] Configured the warning message about the world-readable
directory with fernet keys to be logged only once during startup.
[31711] Started to pass the autogenerated memcache_secret_key
to avoid its regeneration every time the Manila Helm chart gets updated.
This section describes the patch-related known issues with available
workarounds.
[32761] Bare-metal nodes stuck in the cleaning state¶
During the initial deployment of Mirantis Container Cloud, some nodes may
get stuck in the cleaning state. The workaround is to wipe disks manually
before initializing the Mirantis Container Cloud bootstrap.
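One possible way to wipe the disks manually, assuming you can boot the
affected node into a live or rescue environment and that <disk> is the device
to be cleaned (destructive, shown for illustration only):
# Remove all file system, RAID, and partition table signatures from the disk
wipefs --all --force /dev/<disk>
# Optionally also clear the GPT and MBR data structures
sgdisk --zap-all /dev/<disk>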
Added full support for OpenStack Yoga with Open vSwitch and Tungsten Fabric
2011 networking backends.
Starting from 22.5, MOSK deploys all new clouds using
OpenStack Yoga by default. To upgrade an existing cloud from OpenStack Victoria
to Yoga, follow the Upgrade OpenStack procedure.
Highlights from upstream supported by Mirantis OpenStack
deployed on Yoga
[Cinder] Removed the deprecated Block Storage API version 2.0. Instead,
use the Block Storage API version 3.0 that is fully compatible with the
previous version.
[Cinder] Removed the requirement for the request URLs to contain a project
ID in the Block Storage API making it more consistent with other
OpenStack APIs. For backward compatibility, legacy URLs containing
a project ID continue to be recognized.
[Designate] Added support for the CERT resource record type enabling
new use cases such as secure email and publication of certificate
revocation list through DNS.
[Horizon] Added support for the Network QoS Policy creation.
[Glance] Implemented /v2/images/<image-id>/tasks to get tasks
associated with an image.
[Ironic] Changed the default deployment boot mode from legacy BIOS to
UEFI.
[Masakari] Added support for disabling and enabling failover segments.
Now, cloud operators can put whole segments into the maintenance
mode.
[Neutron] Implemented the address-groups resource that can be used to
add groups of IP addresses to security group rules.
[Nova] Added support for the API microversion 2.90. It enables users
to configure the host name exposed through the Nova metadata service
during instance creation or rebuild.
[Octavia] Increased the performance and scalability of load balancers
that use the amphora provider when using amphora images built with
version 2.x of the HAProxy load balancing engine.
[Octavia] Improved the observability of load balancers by adding the
PROMETHEUS listeners that expose a Prometheus exporter endpoint.
The Octavia amphora provider exposes over 150 unique metrics.
Implemented the capability to securely expose part of a
MOSK cluster message bus (RabbitMQ) to the outside world.
This enables external consumers to subscribe to notification messages
emitted by the cluster services and can be helpful in several use cases:
Analysis of notification history for retrospective security audit
Real-time aggregation of notification messages to collect
statistics of cloud resource consumption for capacity planning or
charge-back
The external notification endpoint can be easily enabled and configured through
the OpenStackDeployment custom resource.
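A hedged sketch of what enabling the external notification endpoint may look
like in the OpenStackDeployment custom resource; the field names under
features:messaging and the topic value are assumptions to verify against the
product reference:
spec:
  features:
    messaging:
      notifications:
        external:
          enabled: true
          topics:
          - external-consumer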
Added MOSK support for the Shared Filesystems service
(OpenStack Manila), which enables cloud users to create and manage virtual
file shares, so that applications can store their data using common network
file sharing protocols, such as CIFS, NFS, and so on.
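Assuming the Shared Filesystems service is toggled through the services list
of the OpenStackDeployment custom resource, enabling it may look like the
sketch below; the shared-file-system key name is an assumption to verify
against the reference:
spec:
  features:
    services:
    - shared-file-system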
Implemented the ability to enable the BGP load-balancing mode for
MOSK underlying Kubernetes to allow distribution
of services providing OpenStack APIs across multiple independent
racks that have no L2 segments in common.
Automated configuration of public FQDN for the Object Storage endpoint¶
The fully qualified domain name (FQDN) for the Object Storage service
(Ceph Object gateway) public endpoint is now configurable through just
a single parameter in the KaaSCephCluster custom resource, which is
spec.cephClusterSpec.ingress.publicDomain. Previously, you had to perform
a set of manual steps to define a custom name. If the parameter is not set,
the FQDN settings from the OpenStackDeployment custom resource apply
by default.
The new parameter simplifies configuration of Transport Layer Security of
user-facing endpoints of the Object Storage service.
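For example, the parameter mentioned above can be set as follows; the domain
value is illustrative:
spec:
  cephClusterSpec:
    ingress:
      publicDomain: public.mosk.example.com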
Implemented the following enhancements for etcd monitoring:
Introduced etcd monitoring for OpenStack by implementing the Etcd
Grafana dashboard and by adding OpenStack to the set of existing alerts for
etcd that were used for MKE clusters only in previous releases.
Improved etcd monitoring for MKE on MOSK clusters by
implementing the Etcd dashboard and etcdDbSizeCritical and
etcdDbSizeMajor alerts that inform about the size of the etcd database.
Setting of a custom value for a node label using web UI¶
Implemented the ability to set a custom value for a predefined node label using
the Container Cloud web UI. The list of available node labels is
obtained from allowedNodeLabels of your current Cluster release.
If the value field is not defined in allowedNodeLabels, select the
check box of the required label and define an appropriate custom value for
this label to be set to the node.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
One of the most common symptoms of the high CPU load of MariaDB is slow API
responses. To troubleshoot the issue, verify the CPU consumption of MariaDB
using the General > Kubernetes Pods Grafana dashboard or through
the CLI as follows:
Obtain the resource consumption details for the MariaDB server:
If you are observing a huge difference between the filtered and
r_filtered columns for the query, as in the example of system
response above, analyze the performance of tables by running the
ANALYZE TABLE <TABLE_NAME>; and
ANALYZE TABLE <TABLE_NAME> PERSISTENT FOR ALL; commands:
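For illustration, with a hypothetical table such as nova.instances, the
commands look as follows:
ANALYZE TABLE nova.instances;
ANALYZE TABLE nova.instances PERSISTENT FOR ALL;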
Due to an issue in the database auto-cleanup job for the Block Storage service
(OpenStack Cinder), the state of volumes that are attached to instances gets
reset every time the job runs. The instances can still write and read block
storage data, however, volume objects appear in the OpenStack API as
not attached, causing confusion.
The workaround is to temporarily disable the job until the issue is
fixed and execute the script below to restore the affected instances.
To disable the job, update the OpenStackDeployment custom resource as
follows:
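A hedged sketch of what disabling the Cinder database cleanup job may look
like; the field path under features:database:cleanup is an assumption and
must be verified against the OpenStackDeployment reference:
spec:
  features:
    database:
      cleanup:
        cinder:
          enabled: false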
The provided script does not fix the Cinder database clean-up job
and is only intended to restore the functionality of the affected
instances. Therefore, leave the job disabled.
[25124] MPLSoGRE encapsulation has limited throughput¶
Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE)
provides limited throughput when sending data between VMs, up to 38 Mbps
according to Mirantis tests.
As a workaround, switch the encapsulation type to VXLAN in the
OpenStackDeployment custom resource:
It is not possible to create an instance that uses a security group shared
through role-based access control (RBAC) by specifying only the network ID
when calling Nova. In such a case, before creating a port in the given network,
Nova verifies whether the given security group exists in Neutron. However, Nova asks
only for the security groups filtered by project_id. Therefore, it will not
get the shared security group back from the Neutron API. For details, see the
OpenStack known issue
#1942615.
Note
The bug affects only OpenStack Victoria and is fixed for OpenStack
Yoga in MOSK 22.5.
This section lists the Tungsten Fabric known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
22.5. For Tungsten Fabric limitations, see Tungsten Fabric known limitations.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
kubectl -n tf delete pod tf-control-<hash>
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
If the tungstenfabric-operator-metrics service was present on the cluster
in MOSK 22.4, the update to 22.5 can get stuck due to the absence
of correct labels on this service. As a workaround, delete the service
manually:
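For example, assuming the service resides in the tf namespace:
kubectl -n tf delete service tungstenfabric-operator-metrics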
[27797] Cluster ‘kubeconfig’ stops working during MKE minor version update¶
During update of a Container Cloud management cluster, if the MKE minor
version is updated from 3.4.x to 3.5.x, access to the cluster using the
existing kubeconfig fails with the You must be logged in to the server
(Unauthorized) error due to OIDC settings being reconfigured.
As a workaround, during the Container Cloud cluster update, use the
admin kubeconfig instead of the existing one. Once the update
completes, you can use the existing cluster kubeconfig again.
If a cluster does not currently have any ongoing operations that comprise
OpenStack notifications, the fluentd containers in the
fluentd-notifications Pods are frequently restarted due to false-positive
failures of liveness probe and trigger alerts.
Ignore such failures and alerts if the Pods are in the Running state.
To verify the fluentd-notifications Pods:
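One way to check the Pods, assuming they run in the stacklight namespace:
kubectl -n stacklight get pods | grep fluentd-notifications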
The following issues have been addressed in the MOSK
22.5 release:
[26773] Fixed the issue with VM autoscaling failure when using the
CPU-related metrics in Telemetry.
[26534] Fixed the issue with the ironic-conductor Pod getting stuck in
the CrashLoopBackOff state after the Container Cloud management cluster
upgrade from 2.19.0 to 2.20.0. The issue occurred due to the race condition
between the ironic-conductor and ironic-conductor-http containers
of the ironic-conductor Pod that tried to use ca-bundle.pem
simultaneously but from different users.
[25594][Yoga] Fixed the issue with security groups shared through RBAC not
being filtered and used by Nova to create instances due to the OpenStack
known issue #1942615.
Note
The bug still affects OpenStack Victoria and is fixed for
OpenStack Yoga.
[24435] Fixed the issue with MetalLB speaker failing to announce the LB IP
for the Ingress service after the MOSK cluster update.
For existing clusters, you can set externalTrafficPolicy back from
Cluster to Local after updating to 22.5. For details, see
Post-upgrade actions.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to the
version 22.5.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
Additionally, read through the Cluster update known issues for the problems
that are known to occur during update with recommended workarounds.
The update to MOSK 22.5 does not include any
version-specific impact on the cluster. To start planning a maintenance window,
use the Operations Guide: Update a MOSK cluster standard procedure.
Before you proceed with updating the cluster, make sure that you perform the
following pre-update actions if applicable:
Due to the [29438] Cluster update gets stuck during the Tungsten Fabric operator update known issue, the MOSK cluster
update from 22.4 to 22.5 can get stuck. Your cluster is affected if it has
been updated from MOSK 22.3 to 22.4, regardless of the
SDN backend in use (Open vSwitch or Tungsten Fabric). The newly deployed
MOSK 22.4 clusters are not affected.
To avoid the issue, manually delete the tungstenfabric-operator-metrics
service from the cluster before update:
Due to the known issue in the database auto-cleanup job for the Block Storage
service (OpenStack Cinder), the state of volumes that are attached to
instances gets reset every time the job runs. The workaround is to
temporarily disable the job until the issue is fixed. For details,
refer to [29501] Cinder periodic database cleanup resets the state of volumes.
Post-update actions¶
Explicitly define the OIDCClaimDelimiter parameter¶
The OIDCClaimDelimiter parameter defines the delimiter to use when setting
multi-valued claims in the HTTP headers. See the MOSK 22.5 OpenStack API
Reference
for details.
The current default value of the OIDCClaimDelimiter parameter is ",".
This value misaligns with the behavior expected by Keystone. As a result, when
creating federation mappings for Keystone, the cloud operator may be forced
to write more complex rules. Therefore, in early 2023, Mirantis will change
the default value for the OIDCClaimDelimiter parameter.
Affected deployments
Proceed with the instruction below only if the following conditions are
true:
Keystone is set to use federation through the OpenID Connect protocol,
with Mirantis Container Cloud Keycloak in particular. The following
configuration is present in your OpenStackDeployment custom resource:
The new default value for the OIDCClaimDelimiter parameter
will be ";". To find out whether your Keystone mappings will need
adjustment after changing the default value, set the parameter to
";" on your staging environment and verify the rules.
Optional. Set externalTrafficPolicy=Local for the OpenStack Ingress service¶
In MOSK 22.4 and older versions, the OpenStack Ingress
service was not accessible through its LB IP address on the environments
having the external network restricted to a few nodes in the
MOSK cluster. For such use cases, Mirantis recommended
setting the externalTrafficPolicy parameter to Cluster as a workaround.
The issue #24435 has been fixed in
MOSK 22.5. Therefore, if the monitoring of source IPs of
the requests to OpenStack services is required, you can set the
externalTrafficPolicy parameter back to Local.
Affected deployments
You are affected if your deployment configuration matches the following
conditions:
The external network is restricted to a few nodes in the
MOSK cluster. In this case, only a limited set of
nodes have IPs in the external network where MetalLB announces LB IPs.
The workaround was applied by setting externalTrafficPolicy=Cluster
for the Ingress service.
To set externalTrafficPolicy back from Cluster to Local:
On the MOSK cluster, add the node selector to the
L2Advertisement MetalLB object so that it matches the nodes in the
MOSK cluster having IPs in the external network,
or a subset of those nodes.
The openstack-control-plane:enabled label selector defines nodes in
the MOSK cluster having IPs in the external network.
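A possible L2Advertisement configuration matching such nodes, assuming the
object resides in the metallb-system namespace; the object name is
illustrative:
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  nodeSelectors:
  - matchLabels:
      openstack-control-plane: enabled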
In the MOSK Cluster object located on the management
cluster, remove or edit node selectors and affinity for MetalLB speaker
in the MetalLB chart values, if required.
Example of the helmReleases section in Cluster.spec after editing
the nodeSelector parameter:
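A hedged example of what the edited values may look like; the
speaker.nodeSelector key layout is an assumption about the MetalLB chart used
by Container Cloud:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          speaker:
            nodeSelector:
              openstack-control-plane: enabled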
The OpenStack Panko service was removed from the product in
MOSK 22.2 for OpenStack Victoria without any user
involvement because the service is no longer maintained in the upstream
OpenStack. See the project repository page for details.
However, in MOSK 22.5, before upgrading to OpenStack Yoga,
verify that the Panko service is removed from the cloud by deleting the
event entry from the spec:features:services structure in the
OpenStackDeployment resource as described in Operations Guide:
Remove an OpenStack service.
Provided the technical preview support for OpenStack Yoga with Neutron OVS and
Tungsten Fabric 21.4.
To start experimenting with the new functionality, set openstack_version to
yoga in the OpenStackDeployment custom resource during the cloud
deployment.
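For example, in the OpenStackDeployment custom resource:
spec:
  openstack_version: yoga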
Provided the technical preview support for Tungsten Fabric 21.4.
The new version of the Tungsten Fabric networking enables support for the EVPN
type 2 routes for graceful restart and long-lived graceful restart features in
MOSK.
Note
Implementation of the Red Hat Universal Base Image 8 (UBI 8) support
for the Tungsten Fabric container images is under development and will
be released in one of the upcoming product versions.
To start experimenting with the new functionality, set tfVersion to
21.4 in the TFOperator custom resource during the cloud deployment.
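For example, in the TFOperator custom resource; the placement directly under
spec is assumed:
spec:
  tfVersion: "21.4"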
Enabled the application credentials mechanism in the Identity service
for application automation tools to securely authenticate against the
cloud’s API.
Enabled the capability of the OpenStack services to emit notifications in
the Cloud Auditing Data Federation (CADF) format. The CADF notifications
configuration is available through the features:logging:cadf section of
the OpenStackDeployment custom resource.
Implemented the capability to store the OpenStack database backup data
externally. Instead of the default Ceph volume, the cloud operator can
now easily configure the NFS storage backend through the
OpenStackDeployment CR.
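A hedged sketch of an NFS backup backend definition; the pv_nfs key names are
assumptions and the server and path values are placeholders:
spec:
  features:
    database:
      backup:
        backend: pv_nfs
        pv_nfs:
          server: <nfs-server-ip>
          path: <exported-path>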
Implemented the post-update restart of the TF vRouter pods. Previously,
the cloud operator had to manually restart the vRouter pods after updating
the deployment to a newer MOSK version. The update
procedure has been amended accordingly.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
One of the most common symptoms of the high CPU load of MariaDB is slow API
responses. To troubleshoot the issue, verify the CPU consumption of MariaDB
using the General > Kubernetes Pods Grafana dashboard or through
the CLI as follows:
Obtain the resource consumption details for the MariaDB server:
If you are observing a huge difference between the filtered and
r_filtered columns for the query, as in the example of system
response above, analyze the performance of tables by running the
ANALYZE TABLE <TABLE_NAME>; and
ANALYZE TABLE <TABLE_NAME> PERSISTENT FOR ALL; commands:
It is not possible to create an instance that uses a security group shared
through role-based access control (RBAC) by specifying only the network ID
when calling Nova. In such a case, before creating a port in the given network,
Nova verifies whether the given security group exists in Neutron. However, Nova asks
only for the security groups filtered by project_id. Therefore, it will not
get the shared security group back from the Neutron API. For details, see the
OpenStack known issue
#1942615.
If security groups shared through RBAC are used, apply them to ports
only, not to instances directly.
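For illustration, a port can be created with the shared security group and
then passed to the instance; the names and IDs are placeholders:
openstack port create --network <network-id> --security-group <shared-sg-id> shared-sg-port
openstack server create --flavor <flavor> --image <image> --port shared-sg-port <server-name>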
[25124] MPLSoGRE encapsulation has limited throughput¶
Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE)
provides limited throughput when sending data between VMs, up to 38 Mbps
according to Mirantis tests.
As a workaround, switch the encapsulation type to VXLAN in the
OpenStackDeployment custom resource:
This section lists the Tungsten Fabric known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
22.4. For Tungsten Fabric limitations, see Tungsten Fabric known limitations.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
kubectl -n tf delete pod tf-control-<hash>
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
After the Container Cloud management cluster upgrade from 2.19.0 to 2.20.0,
the ironic-conductor Pod gets stuck in the CrashLoopBackOff state. The
issue occurs due to the race condition between the ironic-conductor and
ironic-conductor-http containers of the ironic-conductor Pod that try
to use ca-bundle.pem simultaneously but from different users.
After updating the MOSK cluster, MetalLB speaker may
fail to announce the Load Balancer (LB) IP address for the OpenStack Ingress
service. As a result, the OpenStack Ingress service is not accessible using
its LB IP address.
The issue may occur if the MetalLB speaker nodeSelector does not select all
of the nodes selected by the nodeSelector of the OpenStack Ingress service.
The issue may arise and disappear when a new MetalLB speaker is being selected
by the MetalLB Controller to announce the LB IP address.
The issue occurs since MOSK 22.2 after
externalTrafficPolicy was set to local for the OpenStack Ingress
service.
Workaround:
Select from the following options:
Set externalTrafficPolicy to cluster for the OpenStack Ingress
service.
This option is preferable in the following cases:
If not all cluster nodes have connection to the external network
If the connection to the external network cannot be established
If network configuration changes are not desired
If network configuration is allowed and if you require the
externalTrafficPolicy:local option:
Wire the external network to all cluster nodes where the OpenStack Ingress
service Pods are running.
Configure IP addresses in the external network on the nodes and change the
default routes on the nodes.
Change nodeSelector of MetalLB speaker to match nodeSelector
of the OpenStack Ingress service.
[28372] False-positive liveness probe failures for ‘fluentd-notifications’¶
If a cluster does not currently have any ongoing operations that comprise
OpenStack notifications, the fluentd containers in the
fluentd-notifications Pods are frequently restarted due to false-positive
failures of liveness probe and trigger alerts.
Ignore such failures and alerts if the Pods are in the Running state.
To verify the fluentd-notifications Pods:
The following issues have been addressed in the MOSK
22.4 release:
[25349][Update] Fixed the issue causing MOSK cluster
update failure after an OpenStack controller node replacement.
[26278][OpenStack] Fixed the issue with l3-agent being stuck in the
Notready state and routers not being initialized properly during
Neutron restart.
[25447][OpenStack] Fixed the issue that caused a Masakari instance evacuation
to fail if an encrypted volume was attached to a node.
[25448][OpenStack] Fixed the issue that caused some Masakari instances to get
stuck in the Rebuild or Error state when being migrated to a new
OpenStack compute node during host evacuation. The issue occurred on
OpenStack compute nodes with a large number of instances.
[22930][OpenStack] Fixed the issue wherein the provisioning status of
Octavia load balancers, and, occasionally, of the listeners or pools
associated with these load balancers, got stuck in the ERROR,
PENDING_UPDATE, PENDING_CREATE, or PENDING_DELETE state.
[25450][OpenStack] Implemented the capability to enable trusted mode for
SR-IOV ports.
[25316][StackLight] Introduced projects filtering by a domain name for the
default domain to fix the issue wherein a wrong project was chosen by
name in case of multiple projects with the same names.
[24376][Ceph] Implemented the capability to parametrize the RADOS Block
Device (RBD) device map to avoid Ceph volumes being unresponsive due to a
disabled cyclic redundancy check (CRC) mode. Now you can use the
rbdDeviceMapOptions field in the Ceph pool parameters of the
KaaSCephCluster CR to specify custom RBD map options to use with
StorageClass of a corresponding Ceph pool. For details, see
Pool parameters.
[28783] [Ceph] Fixed the issue causing the Ceph condition to get stuck
in the absence of the Ceph cluster secrets information. If you applied the
workaround to your MOSK 22.3 cluster
before the update, remove the version parameter definition from
KaaSCephCluster after the managed cluster update because the Ceph cluster
version in MOSK 22.4 updates to 15.2.17.
This section describes the specific actions you as a Cloud Operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to version 22.4.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
Additionally, read through the Cluster update known issues for the problems
that are known to occur during update with recommended workarounds.
When updating to MOSK 22.4, the Cloud Operator can easily
determine if a node needs to be rebooted by checking for the
restartRequired flag in the machine status. For details,
see Determine if the node needs to be rebooted.
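A simple way to inspect the flag, assuming access to the management cluster
and the project namespace of the MOSK cluster:
kubectl -n <project-name> get machines -o yaml | grep restartRequired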
Post-upgrade actions¶
Explicitly define the OIDCClaimDelimiter parameter¶
The OIDCClaimDelimiter parameter defines the delimiter to use when setting
multi-valued claims in the HTTP headers. See the MOSK 22.4 OpenStack API
Reference
for details.
The current default value of the OIDCClaimDelimiter parameter is ",".
This value misaligns with the behavior expected by Keystone. As a result, when
creating federation mappings for Keystone, the cloud operator may be forced
to write more complex rules. Therefore, in early 2023, Mirantis will change
the default value for the OIDCClaimDelimiter parameter.
Affected deployments
Proceed with the instruction below only if the following conditions are
true:
Keystone is set to use federation through the OpenID Connect protocol,
with Mirantis Container Cloud Keycloak in particular. The following
configuration is present in your OpenStackDeployment custom resource:
The new default value for the OIDCClaimDelimiter parameter
will be ";". To find out whether your Keystone mappings will need
adjustment after changing the default value, set the parameter to
";" on your staging environment and verify the rules.
Ubuntu 20.04 on OpenStack with OVS and Tungsten Fabric greenfield deployments¶
Implemented full support for Ubuntu 20.04 LTS (Focal Fossa) as the default host
operating system on OpenStack with OVS and OpenStack with Tungsten Fabric
greenfield deployments.
MOSK is now confirmed to be able to run up to 10,000 virtual
machines under a single control plane.
Depending on the cloud workload profile and the number of OpenStack objects in
use, the control plane needs to be extended with additional hardware.
Specifically, for the MOSK clouds that use Open vSwitch as
a backend for the Networking service (OpenStack Neutron) and run more than
12,000 network ports, Mirantis recommends deploying extra tenant gateways.
The maximum size of a MOSK cluster is limited to 500 nodes
in total, regardless of their roles.
Introduced the OpenStackDeploymentSecret custom resource to aggregate
the cloud’s confidential settings such as SSL/TLS certificates, access
credentials for external systems, and other secrets. Previously, the secrets
were stored together with the rest of configuration in the
OpenStackDeployment custom resource.
The following fields have been moved out of the OpenStackDeployment
custom resource:
Switched all OpenStack services to use the built-in policies, aka in-code
policies, to control user access to cloud functions. MOSK
keeps the built-in policies up-to-date with the OpenStack development ensuring
safe by default behavior as well as allowing you to override only those access
rules that you actually need through the features:policies structure in
the OpenStackDeployment custom resource.
Sticking to the default policy set as much as possible simplifies the future
enablement of advanced authentication and access control functionality, such
as scoped tokens and scoped access policies.
Added capability to precache containers’ images on Kubernetes nodes
to minimize possible downtime on the components update. The feature is
enabled by default and can be disabled through the TFOperator custom
resource if required.
Implemented support for custom Docker registries configuration. Using the
ContainerRegistry custom resource, you can configure CA certificates on
machines to access private Docker registries.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
During l3-agent restart, routers may not be initialized properly due to
erroneous logic in Neutron code causing l3-agent to get stuck in the
Notready state. The readiness probe reports that one of the routers is
not ready because the keepalived process has not started.
Example output of the
kubectl -n openstack describe pod <neutron-l3 agent pod name>
command:
It is not possible to create an instance that uses a security group shared
through role-based access control (RBAC) by specifying only the network ID
when calling Nova. In such a case, before creating a port in the given network,
Nova verifies whether the given security group exists in Neutron. However, Nova asks
only for the security groups filtered by project_id. Therefore, it will not
get the shared security group back from the Neutron API. For details, see the
OpenStack known issue
#1942615.
Octavia load balancers provisioning_status may get stuck in the
ERROR, PENDING_UPDATE, PENDING_CREATE, or PENDING_DELETE state.
Occasionally, the listeners or pools associated with these load balancers may
also get stuck in the same state.
Workaround:
For administrative users that have access to the
keystone-client pod:
For non-administrative users, access the Octavia API directly and delete the
affected load balancer using the "force":true argument in the delete
request:
This section lists the Tungsten Fabric known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
22.3. For Tungsten Fabric limitations, see Tungsten Fabric known limitations.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
kubectl -n tf delete pod tf-control-<hash>
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
The Ceph condition gets stuck in the absence of the Ceph cluster secrets
information. This behavior is observed on the MOSK clusters
that have automatically updated their management cluster to Container Cloud
2.21 but are still running the MOSK 22.3 version.
The list of the symptoms includes:
The Cluster object contains the following condition:
Substitute <managedClusterProject> with the corresponding
managed cluster namespace.
Define the version parameter in the KaaSCephCluster spec:
spec:
  cephClusterSpec:
    version: 15.2.13
Note
Starting from MOSK 22.4, the Ceph cluster
version updates to 15.2.17. Therefore, remove the version parameter
definition from KaaSCephCluster after the managed cluster update.
Save the updated KaaSCephCluster spec.
Find the MiraCeph Custom Resource on a managed cluster and copy all
annotations starting with meta.helm.sh:
Substitute <managedClusterKubeconfig> with a corresponding managed
cluster kubeconfig.
Example of a system output:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.6.0
    # save all annotations with "meta.helm.sh" somewhere
    meta.helm.sh/release-name: ceph-controller
    meta.helm.sh/release-namespace: ceph
...
Create the miracephsecretscrd.yaml file and fill it with the following
template:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.6.0
    <insert all "meta.helm.sh" annotations here>
  labels:
    app.kubernetes.io/managed-by: Helm
  name: miracephsecrets.lcm.mirantis.com
spec:
  conversion:
    strategy: None
  group: lcm.mirantis.com
  names:
    kind: MiraCephSecret
    listKind: MiraCephSecretList
    plural: miracephsecrets
    singular: miracephsecret
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        description: MiraCephSecret aggregates secrets created by Ceph
        properties:
          apiVersion:
            type: string
          kind:
            type: string
          metadata:
            type: object
          status:
            properties:
              lastSecretCheck:
                type: string
              lastSecretUpdate:
                type: string
              messages:
                items:
                  type: string
                type: array
              state:
                type: string
            type: object
        type: object
    served: true
    storage: true
Insert the copied meta.helm.sh annotations to the
metadata.annotations section of the template.
Apply miracephsecretscrd.yaml on the managed cluster:
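For example:
kubectl --kubeconfig <managedClusterKubeconfig> apply -f miracephsecretscrd.yaml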
After the Container Cloud management cluster upgrade from 2.19.0 to 2.20.0,
the ironic-conductor Pod gets stuck in the CrashLoopBackOff state. The
issue occurs due to the race condition between the ironic-conductor and
ironic-conductor-http containers of the ironic-conductor Pod that try
to use ca-bundle.pem simultaneously but from different users.
After an OpenStack controller node replacement, the
octavia-create-resources job does not restart and the Octavia Health
Manager Pod on the new node cannot find its port in the Kubernetes secret. As a
result, MOSK cluster update may fail.
Workaround:
After adding the new OpenStack controller node but before the update process
starts, manually restart the octavia-create-resources job:
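A hedged example of restarting the job by deleting it so that it gets
recreated; the openstack namespace and the assumption that the
openstack-controller recreates the job should be verified:
kubectl -n openstack delete job octavia-create-resources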
After updating the MOSK cluster, MetalLB speaker may
fail to announce the Load Balancer (LB) IP address for the OpenStack Ingress
service. As a result, the OpenStack Ingress service is not accessible using
its LB IP address.
The issue may occur if the MetalLB speaker nodeSelector does not select all
of the nodes selected by the nodeSelector of the OpenStack Ingress service.
The issue may arise and disappear when a new MetalLB speaker is being selected
by the MetalLB Controller to announce the LB IP address.
The issue occurs since MOSK 22.2 after
externalTrafficPolicy was set to local for the OpenStack Ingress
service.
Workaround:
Select from the following options:
Set externalTrafficPolicy to cluster for the OpenStack Ingress
service.
This option is preferable in the following cases:
If not all cluster nodes have connection to the external network
If the connection to the external network cannot be established
If network configuration changes are not desired
If network configuration is allowed and if you require the
externalTrafficPolicy:local option:
Wire the external network to all cluster nodes where the OpenStack Ingress
service Pods are running.
Configure IP addresses in the external network on the nodes and change the
default routes on the nodes.
Change nodeSelector of MetalLB speaker to match nodeSelector
of the OpenStack Ingress service.
[23154] Ceph health is in ‘HEALTH_WARN’ state after managed cluster update¶
After updating the MOSK cluster, Ceph health is in
the HEALTH_WARN state with the SLOW_OPS health message.
The workaround is to restart the affected Ceph Monitors.
The following issues have been addressed in the MOSK
22.3 release:
[23771][Update] Fixed the issue that caused connectivity loss due to a wrong
update order of Neutron services.
[23131][Update] Fixed the issue that caused live migration to fail during an
update of a cluster with encrypted storage. Now, you can perform live
migrations using the following command:
[21790][Update] Fixed the issue wherein the Ceph cluster failed to update on
a managed cluster with the Daemonset csi-rbdplugin is not found error
message.
[23985][OpenStack] Fixed the issue that caused the federated authorization
failure on the Keycloak URL update.
[23484][OpenStack] Configured the default timeouts for ovsdb to prevent
ovsdb session disconnections from blocking port processing operations.
[23297][OpenStack] Fixed the issue that caused VMs to be inaccessible
through floating IPs due to missing iptables rules for floating IPs on the
OpenStack compute DVR router.
[23043][OpenStack] Updated Neutron Open vSwitch to version 2.13 to fix the
issue that caused broken communication between VMs in the same network.
[19065][OpenStack] Fixed the issue that caused Octavia load balancers to lose
Amphora VMs after failover.
[23338][Tungsten Fabric] Fixed the issue wherein the Tungsten Fabric (TF)
tools from the contrail-tools container did not work on DPDK nodes.
[22273][Tungsten Fabric] To avoid issues with CassandraCacheHitRateTooLow
StackLight alerts raising for the tf-cassandra-analytics Pods,
implemented the capability to configure file_cache_size_in_mb for the
tf-cassandra-analytics or tf-cassandra-config Cassandra deployment.
By default, this parameter is set to 512. For details, see
Cassandra configuration.
This section describes the specific actions you as a cloud operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to version 22.3.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
Additionally, read through the Cluster update known issues for the problems
that are known to occur during update with recommended workarounds.
Features¶
Migrating secrets from OpenStackDeployment to OpenStackDeploymentSecret CR¶
The OpenStackDeploymentSecret custom resource replaced the fields in
the OpenStackDeployment custom resource that used to keep the cloud’s
confidential settings. These include:
After the update, migrate the fields mentioned above from OpenStackDeployment
to OpenStackDeploymentSecret custom resource as follows:
Create an OpenStackDeploymentSecret object with the same name as
the OpenStackDeployment object.
Set the fields in the OpenStackDeploymentSecret custom resource as
required.
Remove the related fields from the OpenStackDeployment custom resource.
Switching to built-in policies for OpenStack services¶
Switched all OpenStack components to built-in policies by default. If you have
any custom policies defined through the features:policies structure in
the OpenStackDeployment custom resource, some API calls may not work as
usual. Therefore, after completing the update, revalidate all the custom
access rules configured for your cloud.
Post-update actions¶
Validation of custom OpenStack policies¶
Revalidate all the custom OpenStack access rules configured through the
features:policies structure in the OpenStackDeployment custom
resource.
To complete the update of a cluster with Tungsten Fabric as a backend for
networking, manually restart Tungsten Fabric vRouter agent Pods on all
compute nodes.
Restart of a vRouter agent on a compute node will cause up to 30-60
seconds of networking downtime per instance hosted there. If downtime is
unacceptable for some workloads, we recommend that you migrate them before
restarting the vRouter Pods.
Warning
Under certain rare circumstances, the reload of the vRouter kernel
module triggered by the restart of a vRouter agent can hang due to
the inability to complete the drop_caches operation. Watch the status
and logs of the vRouter agent being restarted and trigger the reboot of
the node, if necessary.
To restart the vRouter Pods:
Remove the vRouter pods one by one manually.
Note
Manual removal is required because vRouter pods use the
OnDelete update strategy. vRouter pod restart causes networking
downtime for workloads on the affected node. If it is not applicable for
some workloads, migrate them before restarting the vRouter pods.
kubectl -n tf delete pod <VROUTER-POD-NAME>
Verify that all tf-vrouter-* pods have been updated:
kubectl -n tf get ds | grep tf-vrouter
The UP-TO-DATE and CURRENT fields must have the same values.
Exposed the IP addresses of the cloud users that consume API of a cloud to all
user-facing cloud services, such as OpenStack, Ceph, and others. Now, the IP
addresses get recorded in the corresponding logs allowing for easy
troubleshooting and security auditing of the cloud.
Implemented the capability to configure CPU isolation using the cpusets
mechanism in Linux kernel. Configuring CPU isolation using the isolcpus
configuration parameter for Linux kernel is considered deprecated.
Validated MOSK against the upstream
OpenStack Security Checklist.
The default configuration of MOSK services that include
Identity, Dashboard, Compute, Block Storage, and Networking services is
compliant with the security recommendations from the OpenStack community.
Encryption of all the internal communications for MOSK
services will become available in one of the nearest product releases.
Implemented the capability to configure the LoadBalancer type for PowerDNS
through the spec:features:designate definition in the
OpenStackDeployment CR, for example, to expose the TCP protocol instead of
the default UDP, or both.
Added the tf-control-dns-external service to the list of the Tungsten
Fabric configuration options. The service is created by default to expose
the TF control DNS. You can disable the creation of this service using the
enableDNSExternal parameter in the TFOperator CR.
Implemented the initial Technology Preview support for MOSK
deployment on local software-based mdadm Redundant Array of Independent Disks
(RAID) devices of level 10 (raid10) to withstand failure of one device
at a time.
The raid10 RAID type requires at least four storage devices on your
servers, and the total number of devices must be even.
To create and configure RAID, use the softRaidDevices field in
BaremetalHostProfile.
Also, added the capability to create LVM volume groups on top of mdadm-based
RAID devices.
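A hypothetical BaremetalHostProfile fragment illustrating the softRaidDevices
field; the nested device layout is an assumption and the partition references
are placeholders:
spec:
  softRaidDevices:
  - name: /dev/md0
    level: raid10
    devices:
    - partition: <partition-on-disk-1>
    - partition: <partition-on-disk-2>
    - partition: <partition-on-disk-3>
    - partition: <partition-on-disk-4>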
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
This section lists the Tungsten Fabric known issues with
workarounds for the Mirantis OpenStack for Kubernetes release
22.2. For Tungsten Fabric limitations, see Tungsten Fabric known limitations.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
kubectl -n tf delete pod tf-control-<hash>
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
During l3-agent restart, routers may not be initialized properly due to
erroneous logic in Neutron code causing l3-agent to get stuck in the
Notready state. The readiness probe reports that one of the routers is
not ready because the keepalived process has not started.
Example output of the
kubectl -n openstack describe pod <neutron-l3 agent pod name>
command:
It is not possible to create an instance that uses a security group shared
through role-based access control (RBAC) by specifying only the network ID
when calling Nova. In such a case, before creating a port in the given network,
Nova verifies whether the given security group exists in Neutron. However, Nova asks
only for the security groups filtered by project_id. Therefore, it will not
get the shared security group back from the Neutron API. For details, see the
OpenStack known issue
#1942615.
After updating the Keycloak URL in the OpenStackDeployment resource
through the spec.features.keystone.keycloak.url or
spec.features.keystone.keycloak.oidc.OIDCProviderMetadataURL fields,
authentication to Keystone through federated OpenID Connect through Keycloak
stops working returning HTTP 403 Unauthorized on authentication attempt.
The failure occurs because such change is not automatically propagated to the
corresponding Keycloak identity provider, which was automatically created in
Keystone during the initial deployment.
The workaround is to manually update the identity provider’s remote_ids
attribute:
Compare the Keycloak URL set in the OpenStackDeployment resource
with the one set in Keystone identity provider:
kubectl -n openstack get osdpl -o jsonpath='{.items[].spec.features.keystone.keycloak}'
# vs
openstack identity provider show keycloak -f value -c remote_ids
If the URLs do not coincide, update the identity provider in OpenStack with
the correct URL keeping the auth/realms/iam part as shown below.
Otherwise, the problem is caused by something else, and you need to proceed
with further debugging.
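For example, assuming the identity provider is named keycloak and
<keycloak-address> is the address set in the OpenStackDeployment resource:
openstack identity provider set keycloak --remote-id https://<keycloak-address>/auth/realms/iam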
Octavia load balancers provisioning_status may get stuck in the
ERROR, PENDING_UPDATE, PENDING_CREATE, or PENDING_DELETE state.
Occasionally, the listeners or pools associated with these load balancers may
also get stuck in the same state.
Workaround:
For administrative users that have access to the
keystone-client pod:
For non-administrative users, access the Octavia API directly and delete the
affected load balancer using the "force":true argument in the delete
request:
If an Amphora VM does not respond or responds too long to heartbeat requests,
the Octavia load balancer automatically initiates a failover process after 60
seconds of unsuccessful attempts. Long responses of an Amphora VM may be caused
by various events, such as a high load on the OpenStack compute node that hosts
the Amphora VM, network issues, system service updates, and so on. After a
failover, the Amphora VMs may be missing in the completed Octavia load
balancer.
Workaround:
If your deployment is already affected, manually restore the work of the load
balancer by recreating the Amphora VM:
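One possible way to recreate the Amphora VM is to trigger a manual failover
of the affected load balancer; the ID is a placeholder:
openstack loadbalancer failover <loadbalancer-id>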
To avoid an automatic failover start that may cause the issue, set the
heartbeat_timeout parameter in the OpenStackDeployment CR to a large
value in seconds. The default is 60 seconds. For example:
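A hedged sketch of such an override; the nesting under spec:services for the
Octavia health manager options is an assumption and the value is illustrative:
spec:
  services:
    load-balancer:
      octavia:
        values:
          conf:
            octavia:
              health_manager:
                heartbeat_timeout: 7200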
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not
work correctly. The symptoms include both failure to properly balance
traffic and failure to perform an amphora failover. For details, see
DVR incompatibility with ARP announcements and VRRP.
[23154] Ceph health is in ‘HEALTH_WARN’ state after managed cluster update¶
After updating the MOSK cluster, Ceph health is in
the HEALTH_WARN state with the SLOW_OPS health message.
The workaround is to restart the affected Ceph Monitors.
[23771] Connectivity loss due to wrong update order of Neutron services¶
After updating the cluster, simultaneous unordered restart of Neutron L2
and L3, DHCP, and Metadata services leads to the state when ports on
br-int are tagged with valid VLAN tags but with trunks:[4095].
After updating the MOSK cluster, MetalLB speaker may
fail to announce the Load Balancer (LB) IP address for the OpenStack Ingress
service. As a result, the OpenStack Ingress service is not accessible using
its LB IP address.
The issue may occur if the MetalLB speaker nodeSelector does not select all
of the nodes selected by the nodeSelector of the OpenStack Ingress service.
The issue may arise and disappear when a new MetalLB speaker is being selected
by the MetalLB Controller to announce the LB IP address.
The issue occurs since MOSK 22.2 after
externalTrafficPolicy was set to local for the OpenStack Ingress
service.
Workaround:
Select from the following options:
Set externalTrafficPolicy to cluster for the OpenStack Ingress
service.
This option is preferable in the following cases:
If not all cluster nodes have connection to the external network
If the connection to the external network cannot be established
If network configuration changes are not desired
If network configuration is allowed and if you require the
externalTrafficPolicy:local option:
Wire the external network to all cluster nodes where the OpenStack Ingress
service Pods are running.
Configure IP addresses in the external network on the nodes and change the
default routes on the nodes.
Change nodeSelector of MetalLB speaker to match nodeSelector
of the OpenStack Ingress service.
The following issues have been addressed in the MOSK 22.2
release:
[22725][Update] Fixed the issue raising upon managed cluster update and
causing live migration to fail for instances with deleted images.
[18871][Update] Fixed the issue causing MySQL to crash during a managed
cluster update or instances live migration.
[16987][Update] Fixed the issue that caused update of a
MOSK cluster to fail with the ceph csi-driver is
not evacuated yet, waiting… error during the Ceph CSI pod eviction.
[22321][OpenStack] Fixed the issue wherein the Neutron backend option in
OsDpl inadvertently changed from ml2 to tungstenfabric upon
managed cluster update.
[21998][OpenStack] Fixed the issue wherein the OpenStack Controller got
stuck during the managed cluster update.
[21838][OpenStack] Fixed the issue wherein the Designate API failed to log
requests.
[21376][OpenStack] Fixed the issue that caused inability to create
non-encrypted volumes from a large image.
[15354][OpenStack] Implemented coordination between instances of the Masakari
API service to prevent creation of multiple evacuation flows for instances.
[1659][OpenStack] Fixed the issue wherein the Neutron Open vSwitch agent did
not clean tunnels upon changes of the tunnel_ip option.
[20192][StackLight] To avoid issues causing false-positive
CassandraTombstonesTooManyMajor StackLight alerts, adjusted the
thresholds for the CassandraTombstonesTooManyMajor and
CassandraTombstonesTooManyWarning alerts and added a new
CassandraTombstonesTooManyCritical alert.
[21064][Ceph] Fixed the issue causing the MOSK managed
cluster to fail with the Error updating release ceph/ceph-controller error
and enabled Helm v3 for Ceph Controller.
This section describes the specific actions you as a cloud operator need to
complete to accurately plan and successfully perform your
Mirantis OpenStack for Kubernetes (MOSK) cluster update to version 22.2.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
Additionally, read through the Cluster update known issues for the problems
that are known to occur during update with recommended workarounds.
Your MOSK cluster will obtain the newly implemented
capabilities automatically with no significant impact on the update procedure.
Update impact and maintenance windows planning¶
Up to 1 minute of downtime for TF data plane¶
During the Kubernetes master nodes update, there is a downtime on Kubernetes
cluster’s internal DNS service. Thus, Tungsten Fabric vRouter pods lose
connection with the control plane resulting in up to 1 minute of downtime
for the Tungsten Fabric data plane nodes and impact on the tenant networking.
Post-update actions¶
Manual restart of TF vRouter Agent pods¶
To complete the update of the cluster with Tungsten Fabric, manually restart
Tungsten Fabric vRouter Agent pods on all compute nodes. The restart of a
vRouter Agent on a compute node will cause up to 30-60 seconds of networking
downtime per instance hosted there. If downtime is unacceptable for some
workloads, we recommend that you migrate them before restarting the vRouter
pods.
Warning
Under certain rare circumstances, the reload of the vRouter kernel
module triggered by the restart of a vRouter Agent can hang due to
the inability to complete the drop_caches operation. Watch the status
and logs of the vRouter Agent being restarted and trigger the reboot of
the node, if necessary.
The list of the major changes in the component versions includes:
Host OS kernel v5.4
RabbitMQ 3.9
Mirantis Kubernetes Engine (MKE) 3.4.6 with Kubernetes 1.20
All the relevant changes are applied to the MOSK cluster
automatically during the cluster update procedure. The host machine’s kernel
update implies node reboot. See the links below for details.
Implemented the capability to configure the CPU model through the
spec:features:nova:vcpu_type definition of the OpenStackDeployment CR.
The default CPU model is now host-model, which replaces the previous
default kvm64 CPU model.
For deployments with CPU model customized through spec:services, remove
this customization after upgrading your managed cluster.
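For reference, a minimal illustration of the spec:features:nova:vcpu_type path described above, set to the new default value:
spec:
  features:
    nova:
      vcpu_type: host-model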
Automatic backup and restoration of Tungsten Fabric data¶
Implemented the capability to automatically back up and restore the Tungsten
Fabric data stored in Cassandra and ZooKeeper.
The user can perform the automatic data backup by enabling the
tf-dbBackup controller through the Tungsten Fabric Operator CR.
By default, the job is scheduled for weekly execution, allocating PVC
with 5Gi size for storing backups, and keeping 5 previous backups.
Also, MOSK allows for automatic data restoration with
the ability to restore from the exact backup if required.
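For illustration only, enabling the backup job through the Tungsten Fabric Operator custom resource may look similar to the following sketch; the exact field layout can differ between product versions:
spec:
  controllers:
    tf-dbBackup:
      enabled: true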
Implemented the Border Gateway Protocol (BGP) and encapsulation settings in the
Tungsten Fabric Operator custom resource. This feature provides persistency of
the BGP and encapsulation parameters.
Also, added technical preview of the VxLAN encapsulation feature.
Implemented object storage encryption integrated with
the OpenStack Key Manager service (Barbican). The feature is enabled by
default in MOSK deployments with Barbican.
Mirantis has tested MOSK against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOSK components with the exact versions against which
testing has been performed.
This section describes the MOSK known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
kubectl -n tf delete pod tf-control-<hash>
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
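A typical invocation runs nodetool inside one of the Cassandra pods; the pod and container names below are placeholders that depend on your deployment:
kubectl -n tf exec -it <tf-cassandra-config-or-analytics-pod> -c cassandra -- nodetool status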
[15684] Pods fail when rolling Tungsten Fabric 2011 back to 5.1¶
Some tf-control and tf-analytics pods may fail during the Tungsten
Fabric rollback from version 2011 to 5.1. In this case, the control
container from the tf-control pod and/or the collector container from
the tf-analytics pod contain SYS_WARN messages such as
… AMQP_QUEUE_DELETE_METHOD caused: PRECONDITION_FAILED - queue
‘<contrail-control/contrail-collector>.<nodename>’ in vhost ‘/’ not empty
….
The workaround is to manually delete the queue that fails to be deleted by
AMQP_QUEUE_DELETE_METHOD:
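For illustration only, the queue can be deleted from inside the Tungsten Fabric RabbitMQ pod; the pod name is a placeholder, and the availability of the delete_queue command depends on the RabbitMQ version in use:
kubectl -n tf exec -it <tf-rabbitmq-pod> -- rabbitmqctl delete_queue '<contrail-control/contrail-collector>.<nodename>'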
It is not possible to create an instance that uses a security group shared
through role-based access control (RBAC) by specifying only the network ID
when calling Nova. In this case, before creating a port in the given network,
Nova verifies whether the given security group exists in Neutron. However, Nova
requests only the security groups filtered by project_id and therefore does not
get the shared security group back from the Neutron API. For details, see the
OpenStack known issue
#1942615.
After updating the Keycloak URL in the OpenStackDeployment resource
through the spec.features.keystone.keycloak.url or
spec.features.keystone.keycloak.oidc.OIDCProviderMetadataURL fields,
authentication to Keystone through federated OpenID Connect with Keycloak
stops working and returns HTTP 403 Unauthorized on authentication attempts.
The failure occurs because such change is not automatically propagated to the
corresponding Keycloak identity provider, which was automatically created in
Keystone during the initial deployment.
The workaround is to manually update the identity provider’s remote_ids
attribute:
Compare the Keycloak URL set in the OpenStackDeployment resource
with the one set in Keystone identity provider:
kubectl -n openstack get osdpl -o jsonpath='{.items[].spec.features.keystone.keycloak}'

# vs

openstack identity provider show keycloak -f value -c remote_ids
If the URLs do not coincide, update the identity provider in OpenStack with
the correct URL keeping the auth/realms/iam part as shown below.
Otherwise, the problem is caused by something else, and you need to proceed
with the debugging.
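A hedged example of such an update using the OpenStack CLI, where <keycloak-url> stands for the new Keycloak URL:
openstack identity provider set keycloak --remote-id <keycloak-url>/auth/realms/iam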
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not
work correctly. The symptoms include both failure to properly balance
traffic and failure to perform an amphora failover. For details, see
DVR incompatibility with ARP announcements and VRRP.
[19065] Octavia load balancers lose Amphora VMs after failover¶
If an Amphora VM does not respond or responds too long to heartbeat requests,
the Octavia load balancer automatically initiates a failover process after 60
seconds of unsuccessful attempts. Long responses of an Amphora VM may be caused
by various events, such as a high load on the OpenStack compute node that hosts
the Amphora VM, network issues, system service updates, and so on. After a
failover, the Amphora VMs may be missing in the completed Octavia load
balancer.
Workaround:
If your deployment is already affected, manually restore the work of the load
balancer by recreating the Amphora VM:
To avoid an automatic failover start that may cause the issue, set the
heartbeat_timeout parameter in the OpenStackDeployment CR to a large
value in seconds. The default is 60 seconds. For example:
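A plausible sketch that passes the option through the Octavia service configuration; the exact nesting under spec:services is an assumption:
spec:
  services:
    load-balancer:
      octavia:
        values:
          conf:
            octavia:
              health_manager:
                heartbeat_timeout: 1000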
During the update of a MOSK cluster to 22.1, live
migration may fail for instances if their images were previously deleted. In
this case, the nova-compute pod contains an error message similar to the
following one:
2022-03-22 23:55:24.468 11816 ERROR nova.compute.manager [instance: 128cf508-f7f7-4a40-b742-392c8c80fc7d] Command: scp -C -r kaas-node-03ab613d-cf79-4830-ac70-ed735453481a:/var/lib/nova/instances/_base/e2b6c1622d45071ec8a88a41d07ef785e4dfdfe8 /var/lib/nova/instances/_base/e2b6c1622d45071ec8a88a41d07ef785e4dfdfe8
2022-03-22 23:55:24.468 11816 ERROR nova.compute.manager [instance: 128cf508-f7f7-4a40-b742-392c8c80fc7d] Exit code: 1
2022-03-22 23:55:24.468 11816 ERROR nova.compute.manager [instance: 128cf508-f7f7-4a40-b742-392c8c80fc7d] Stdout: ''
2022-03-22 23:55:24.468 11816 ERROR nova.compute.manager [instance: 128cf508-f7f7-4a40-b742-392c8c80fc7d] Stderr: 'ssh: Could not resolve hostname kaas-node-03ab613d-cf79-4830-ac70-ed735453481a: Name or service not known\r\n'
Workaround:
If you have not yet started the managed cluster update, change the
nova-compute image by setting the following metadata in the
OpenStackDeployment CR:
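A hypothetical sketch that pins the nova-compute image through the service-level overrides, using the image referenced in the next step; the exact nesting is an assumption:
spec:
  services:
    compute:
      nova:
        values:
          images:
            tags:
              nova_compute: mirantis.azurecr.io/openstack/nova:victoria-bionic-20220324125700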
If you have already started the managed cluster update, manually update the
nova-compute container image in the nova-compute DaemonSet to
mirantis.azurecr.io/openstack/nova:victoria-bionic-20220324125700.
[16987] Cluster update fails at Ceph CSI pod eviction¶
An update of a MOSK cluster may fail with the
ceph csi-driver is not evacuated yet, waiting… error during the Ceph
CSI pod eviction.
Workaround:
Scale the affected StatefulSet of the pod that fails to init down to
0 replicas. If it is a DaemonSet, such as nova-compute, it must
not be scheduled on the affected node.
On every csi-rbdplugin pod, search for stuck csi-vol:
Scale the affected StatefulSet back to the original number of replicas
or until its state is Running. If it is a DaemonSet, run the pod on
the affected node again.
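For illustration, the scaling steps above map to standard kubectl commands; the namespace and resource names are placeholders:
kubectl -n <namespace> scale statefulset <affected-statefulset> --replicas=0
kubectl -n <namespace> scale statefulset <affected-statefulset> --replicas=<original-number>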
[18871] MySQL crashes during managed cluster update or instances live migration¶
MySQL may crash when performing instances live migration or during a managed
cluster update. After the crash, MariaDB cannot connect to the cluster and gets
stuck in the CrashLoopBackOff state.
Workaround:
Verify that other MariaDB replicas are up and running and have joined the
cluster:
Verify that at least 2 pods are running and operational
(2/2 and Running):
kubectl -n openstack get pods | grep maria
Example of system response where the pods mariadb-server-0 and
mariadb-server-2 are operational:
In the openstack-controller logs, retries are exhausted:
2022-02-17 18:41:43,317 [ERROR] kopf._core.engines.peering: Request attempt #8 failed; will retry: PATCH https://10.232.0.1:443/apis/zalando.org/v1/namespaces/openstack/kopfpeerings/openstack-controller.nodemaintenancerequest -> APIServerError('Internal error occurred: unable to unmarshal response in forceLegacy: json: cannot unmarshal number into Go value of type bool', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'Internal error occurred: unable to unmarshal response in forceLegacy: json: cannot unmarshal number into Go value of type bool', 'reason': 'InternalError', 'details': {'causes': [{'message': 'unable to unmarshal response in forceLegacy: json: cannot unmarshal number into Go value of type bool'}]}, 'code': 500})
2022-02-17 18:42:50,834 [INFO] kopf.objects: Timer 'heartbeat' succeeded.
2022-02-17 18:47:50,848 [INFO] kopf.objects: Timer 'heartbeat' succeeded.
2022-02-17 18:52:50,853 [INFO] kopf.objects: Timer 'heartbeat' succeeded.
2022-02-17 18:57:50,858 [INFO] kopf.objects: Timer 'heartbeat' succeeded.
2022-02-17 19:02:50,862 [INFO] kopf.objects: Timer 'heartbeat' succeeded.
Notification about a successful finish does not exist:
[22321] Neutron backend may change from TF to ML2¶
An update of the MOSK cluster with Tungsten Fabric
may hang due to the changed Neutron backend with the following symptoms:
The libvirt and nova-compute pods fail to start:
Entrypoint WARNING: 2022/03/03 08:49:45 entrypoint.go:72: Resolving dependency Pod on same host with labels map[application:neutron component:neutron-ovs-agent] in namespace openstack failed: Found no pods matching labels: map[application:neutron component:neutron-ovs-agent].
In the OSDPL network section, the ml2 backend is specified instead
of tungstenfabric:
spec:
  features:
    neutron:
      backend: ml2
As a workaround, change the backend option from ml2 to
tungstenfabric:
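For example, edit the OpenStackDeployment object so that the network backend reads as follows:
spec:
  features:
    neutron:
      backend: tungstenfabric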
The following issues have been addressed in the MOSK 22.1
release:
[16495][OpenStack] Fixed the issue with Kubernetes not rescheduling OpenStack
deployment pods after a node recovery.
[18713][OpenStack] Fixed the issue causing inability to remove a Glance image
after an unsuccessful instance spawn attempt when image signature
verification was enabled but the signature on the image was incorrect.
[19274][OpenStack] Fixed the issue causing inability to create a Heat stack
by specifying the Heat template as a URL link due to Horizon container
missing proxy variables.
[19791][OpenStack] Fixed the issue with the DEFAULT volume type being
automatically created, as well as listed as a volume type in Horizon, for
database migration even if default_volume_type was set in
cinder.conf.
[18829][StackLight] Fixed the issue causing the Prometheus exporter for the
Tungsten Fabric Controller pod to fail to start upon the StackLight log level
change.
[18879][Ceph] Fixed the issue with the RADOS Gateway (RGW) pod overriding the
global CA bundle located at /etc/pki/tls/certs with an incorrect
self-signed CA bundle during deployment of a Ceph cluster.
[19195][Tungsten Fabric] Fixed the issue causing the managed cluster status
to flap between the Ready/NotReady states in the Container Cloud
web UI.
This section describes the specific actions you as a cloud operator need to
complete to accurately plan and successfully perform the update of your
Mirantis OpenStack for Kubernetes (MOSK) cluster to version 22.1.
Consider this information as a supplement to the generic update procedure
published in Operations Guide: Update a MOSK cluster.
Additionally, read through the Cluster update known issues for the problems
that are known to occur during update with recommended workarounds.
Starting from MOSK 22.1, the virtual CPU mode is set to
host-model by default, which replaces the previous default kvm64
CPU model.
The new default option provides performance and workload portability, namely
reliable live and cold migration of instances between hosts, and ability
to run modern guest operating systems, such as Windows Server.
For deployments with the virtual CPU mode settings customized through
spec:services, remove this customization in favor of the defaults
after the update.
Update impact and maintenance windows planning¶
Host OS kernel version upgrade to v5.4¶
MOSK 22.1 includes the updated version of the host
machine’s kernel that is v5.4. All nodes in the cluster will get restarted
to apply the relevant changes.
Node group                Sequential restart   Impact on end users and workloads
Kubernetes master nodes   Yes                  No
Control plane nodes       Yes                  No
Storage nodes             Yes                  No
Compute nodes             Yes                  15-20 minutes of downtime for workloads hosted on a compute node, depending on the hardware specifications of the node
During the Kubernetes master nodes update, there is a downtime on Kubernetes
cluster’s internal DNS service. Thus, Tungsten Fabric vRouter pods lose
connection with the control plane resulting in up to 1 minute of downtime
for the Tungsten Fabric data plane nodes and impact on the tenant networking.
Post-update actions¶
Manual restart of TF vRouter Agent pods¶
To complete the update of the cluster with Tungsten Fabric, manually restart
Tungsten Fabric vRouter Agent pods on all compute nodes. The restart of a
vRouter Agent on a compute node will cause up to 30-60 seconds of networking
downtime per instance hosted there. If downtime is unacceptable for some
workloads, we recommend that you migrate them before restarting the vRouter
pods.
Warning
Under certain rare circumstances, the reload of the vRouter kernel
module triggered by the restart of a vRouter Agent is known to hang due to
the inability to complete the drop_caches operation. Watch the status
and logs of the vRouter Agent being restarted and trigger the reboot of
the node if necessary.
Periodic automatic cleanup of OpenStack databases¶
Implemented an automatic cleanup of deleted entries from the databases of
OpenStack services. By default, the deleted rows older than 30 days are sanely
and safely purged from the Barbican, Cinder, Glance, Heat, Masakari, and Nova
databases for all relevant tables.
Implemented the capability to perform image signature verification when
booting an OpenStack instance, uploading a Glance image with signature
metadata fields set, and creating a volume from an image.
Implemented the ability to set the kv_mountpoint and namespace
in spec:features:barbican to specify the mountpoint of a Key-Value
store and the Vault namespace to be used for all requests to Vault
respectively.
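As an illustration only, such settings could look similar to the following sketch; the exact nesting under spec:features:barbican, for example whether the keys sit under a dedicated Vault backend subsection, may differ:
spec:
  features:
    barbican:
      kv_mountpoint: secret
      namespace: <vault-namespace>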
Implemented the capability for the Tungsten Fabric Operator to use
ValidatingAdmissionWebhook to validate environment variables set to
Tungsten Fabric components upon the TFOperator object creation or update.
Tungsten Fabric 2011 is now set as the default version for deployment.
Tungsten Fabric 5.1 is considered deprecated and will be declared unsupported
in one of the upcoming releases. Therefore, Mirantis highly recommends
upgrading from Tungsten Fabric 5.1 to 2011.
Implemented the capability to deploy a MOS cluster with Tungsten Fabric
using a multi-rack architecture to allow for native integration with
Layer 3-centric networking topologies.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
This section describes the MOS known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
Modification of custom vRouter DaemonSets based on the SR-IOV definition in
the OsDpl CR.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[15684] Pods fail when rolling Tungsten Fabric 2011 back to 5.1¶
Some tf-control and tf-analytics pods may fail during the Tungsten
Fabric rollback from version 2011 to 5.1. In this case, the control
container from the tf-control pod and/or the collector container from
the tf-analytics pod contain SYS_WARN messages such as
… AMQP_QUEUE_DELETE_METHOD caused: PRECONDITION_FAILED - queue
‘<contrail-control/contrail-collector>.<nodename>’ in vhost ‘/’ not empty
….
The workaround is to manually delete the queue that fails to be deleted by
AMQP_QUEUE_DELETE_METHOD:
During LCM operations such as Tungsten Fabric update or upgrade, the following
parameters defined by the cluster administrator are reset to the following
defaults upon the tf-config pod restart:
BGP_ASN to 64512
ENCAP_PRIORITY to MPLSoUDP,MPLSoGRE,VXLAN
VXLAN_VN_ID_MODE to automatic
As a workaround, manually set up values for the required parameters if they
differ from the defaults:
The status of a managed cluster may be flapping between the Ready and
NotReady states in the Container Cloud web UI. In this case, if the
cluster Status field includes a message about not ready
tf/tf-tool-status-aggregator and/or tf-tool-status-party deployments
with 1/1 replicas, the status flapping may be caused by frequent updates of
these deployments by the Tungsten Fabric Operator.
Workaround:
Verify whether the tf/tf-tool-status-aggregator and
tf-tool-status-party deployments are up and running:
kubectl -n tf get deployments
Safely disable the tf/tf-tool-status-aggregator and
tf-tool-status-party deployments through the TFOperator CR:
It is not possible to create an instance that uses a security group shared
through role-based access control (RBAC) by specifying only the network ID
when calling Nova. In this case, before creating a port in the given network,
Nova verifies whether the given security group exists in Neutron. However, Nova
requests only the security groups filtered by project_id and therefore does not
get the shared security group back from the Neutron API. For details, see the
OpenStack known issue
#1942615.
After updating the Keycloak URL in the OpenStackDeployment resource
through the spec.features.keystone.keycloak.url or
spec.features.keystone.keycloak.oidc.OIDCProviderMetadataURL fields,
authentication to Keystone through federated OpenID Connect with Keycloak
stops working and returns HTTP 403 Unauthorized on authentication attempts.
The failure occurs because such change is not automatically propagated to the
corresponding Keycloak identity provider, which was automatically created in
Keystone during the initial deployment.
The workaround is to manually update the identity provider’s remote_ids
attribute:
Compare the Keycloak URL set in the OpenStackDeployment resource
with the one set in Keystone identity provider:
kubectl -n openstack get osdpl -o jsonpath='{.items[].spec.features.keystone.keycloak}'

# vs

openstack identity provider show keycloak -f value -c remote_ids
If the URLs do not coincide, update the identity provider in OpenStack with
the correct URL keeping the auth/realms/iam part as shown below.
Otherwise, the problem is caused by something else, and you need to proceed
with the debugging.
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
[16495] Failure to reschedule OpenStack deployment pods after a node recovery¶
Kubernetes does not reschedule OpenStack deployment pods after a node recovery.
As a workaround, delete all pods of the deployment.
[18713] Inability to remove a Glance image after an unsuccessful instance spawn¶
When image signature verification is enabled and the signature is incorrect on
the image, it is impossible to remove a Glance image right after an
unsuccessful instance spawn attempt. As a workaround, wait for at least one
minute before trying to remove the image.
[19274] Inability to create a Heat stack by specifying the Heat template as a URL¶
Horizon container is missing proxy variables. As a result, it is not possible
to create a Heat stack by specifying the Heat template as a URL link. As a
workaround, use a different upload method and specify the file from a local
folder.
[19065] Octavia load balancers lose Amphora VMs after failover¶
If an Amphora VM does not respond or responds too long to heartbeat requests,
the Octavia load balancer automatically initiates a failover process after 60
seconds of unsuccessful attempts. Long responses of an Amphora VM may be caused
by various events, such as a high load on the OpenStack compute node that hosts
the Amphora VM, network issues, system service updates, and so on. After a
failover, the Amphora VMs may be missing in the completed Octavia load
balancer.
Workaround:
If your deployment is already affected, manually restore the work of the load
balancer by recreating the Amphora VM:
To avoid an automatic failover start that may cause the issue, set the
heartbeat_timeout parameter in the OpenStackDeployment CR to a large
value in seconds. The default is 60 seconds. For example:
An update of a MOS cluster may fail with the
ceph csi-driver is not evacuated yet, waiting… error during the Ceph
CSI pod eviction.
Workaround:
Scale the affected StatefulSet of the pod that fails to init down to
0 replicas. If it is a DaemonSet, such as nova-compute, it must
not be scheduled on the affected node.
On every csi-rbdplugin pod, search for stuck csi-vol:
Scale the affected StatefulSet back to the original number of replicas
or until its state is Running. If it is a DaemonSet, run the pod on
the affected node again.
[18871] MySQL crashes during managed cluster update or instances live migration¶
MySQL may crash when performing instances live migration or during an update of
a managed cluster running MOS from version 6.19.0 to 6.20.0. After the crash,
MariaDB cannot connect to the cluster and gets stuck in the
CrashLoopBackOff state.
Workaround:
Verify that other MariaDB replicas are up and running and have joined the
cluster:
Verify that at least 2 pods are running and operational
(2/2 and Running):
kubectl -n openstack get pods | grep maria
Example of system response where the pods mariadb-server-0 and
mariadb-server-2 are operational:
During deployment of a Ceph cluster, the RADOS Gateway (RGW) pod overrides
the global CA bundle located at /etc/pki/tls/certs with an incorrect
self-signed CA bundle. The issue affects only clusters with public
certificates.
Workaround:
Open the KaasCephCluster CR of a managed cluster for editing:
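The command is typically similar to the following:
kubectl -n <managedClusterProjectName> edit kaascephcluster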
Substitute <managedClusterProjectName> with a corresponding value.
Note
If the public CA certificates also apply to the OsDpl CR,
edit this resource as well.
Select from the following options:
If you are using the GoDaddy certificates, in the
cephClusterSpec.objectStorage.rgw section, replace the
cacert parameters with your public CA certificate that already
contains both the root CA certificate and intermediate CA certificate:
The following issues have been addressed in the Mirantis OpenStack for
Kubernetes 21.6 release:
[16452][OpenStack] Fixed the issue causing failure to update the Octavia
policy after policies removal from the OsDpl CR. The issue affected OpenStack
Victoria.
[16103][OpenStack] Fixed the issue causing Glance client to return the
HTTPInternalServerError while operating with volume when Glance was
configured with the Cinder backend TechPreview.
[17321][OpenStack] Fixed the issue causing RPC errors in the logs of the
designate-central pods during liveness probes.
[17927][OpenStack] Fixed the issue causing inability to delete volume backups
created from encrypted volumes.
[18029][OpenStack] Fixed the issue with live migration of instances with
SR-IOV macvtap ports occasionally requiring the same virtual functions
(VF) number to be free on the destination compute nodes.
[18744][OpenStack] Fixed the issue with the application_credential
authentication method being disabled in Keystone in case of an enabled
Keycloak integration.
[17246][StackLight] Deprecated the redundant openstack.externalFQDN
(string) parameter and added the new externalFQDNs.enabled (bool)
parameter.
Implemented a machine-readable status for OpenStack deployments. Now, you can
use the OpenStackDeploymentStatus (OsDplSt) custom resource as a single
data structure that describes the OpenStackDeployment (OsDpl) status at a
particular moment.
Implemented the capability to enable Cinder volume encryption through the
OpenStackDeployment CR using Barbican that will store the encryption keys
and Linux Unified Key Setup (LUKS) that will create encrypted Cinder volumes
including bootable ones. If an encrypted volume is bootable, Nova will get a
symmetric encryption key from Barbican.
Implemented full support for Tungsten Fabric 2011. Though Tungsten Fabric 5.1
is deployed by default in MOS 21.5, you can use the
tfVersion parameter to define the 2011 version for deployment.
Implemented support for multiple workers of the contrail-api in Tungsten
Fabric. Starting from the MOS 21.5 release, six workers
of the contrail-api
service are used by default. In the previous MOS releases,
only one worker of this service was used.
Short names for Kubernetes nodes in Grafana dashboards¶
Enhanced the Grafana dashboards to display user-friendly short names for
Kubernetes nodes, for example, master-0, instead of long name labels
such as kaas-node-f736fc1c-3baa-11eb-8262-0242ac110002.
This feature provides for consistency with Kubernetes nodes naming in the
Mirantis Container Cloud web UI.
All Grafana dashboards that present node data now have an additional
Node identifier drop-down menu. By default, it is set to
machine to display short names for Kubernetes nodes. To display
Kubernetes node name labels as previously, change this option to
node.
Implemented the following improvements to StackLight alerting:
Added the OpenstackServiceInternalApiOutage and
OpenstackServicePublicApiOutage alerts that raise in case of an OpenStack
service internal or public API outage.
Enhanced the alert inhibition rules.
Reworked a number of alerts to improve alerting efficiency and reduce alert
flooding.
Removed the inefficient OpenstackServiceApiDown and
OpenstackServiceApiOutage alerts.
Published MOS Release Compatibility Matrix that describes
the cloud configurations that have been supported by the product over the
course of its lifetime and the path a MOS cloud can
take to move from an older configuration to a newer one.
Published the OpenStack Ussuri to Victoria upgrade procedure and Tungsten
Fabric 5.1 to 2011 upgrade procedure that instruct cloud operators on how
to prepare for the upgrade, use the MOS life cycle
management API to perform the upgrade, and verify the cloud operability
after the upgrade.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only in
the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
This section describes the MOS known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
Modification of custom vRouter DaemonSets based on the SR-IOV definition in
the OsDpl CR.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[15684] Pods fail when rolling Tungsten Fabric 2011 back to 5.1¶
Some tf-control and tf-analytics pods may fail during the Tungsten
Fabric rollback from version 2011 to 5.1. In this case, the control
container from the tf-control pod and/or the collector container from
the tf-analytics pod contain SYS_WARN messages such as
… AMQP_QUEUE_DELETE_METHOD caused: PRECONDITION_FAILED - queue
‘<contrail-control/contrail-collector>.<nodename>’ in vhost ‘/’ not empty ….
The workaround is to manually delete the queue that fails to be deleted by
AMQP_QUEUE_DELETE_METHOD:
During LCM operations such as Tungsten Fabric update or upgrade, the following
parameters defined by the cluster administrator are reset to the following
defaults upon the tf-config pod restart:
BGP_ASN to 64512
ENCAP_PRIORITY to MPLSoUDP,MPLSoGRE,VXLAN
VXLAN_VN_ID_MODE to automatic
As a workaround, manually set up values for the required parameters if they
differ from the defaults:
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
[16495] Failure to reschedule OpenStack deployment pods after a node recovery¶
Kubernetes does not reschedule OpenStack deployment pods after a node
recovery.
As a workaround, delete all pods of the deployment:
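A hypothetical example that deletes the pods of one affected deployment by label; the label selector is illustrative:
kubectl -n openstack delete pod -l application=<application-name>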
When Glance is configured with the Cinder backend TechPreview, the
Glance client may return the HTTPInternalServerError error while operating
with volume. In this case, repeat the action again until it succeeds.
[19065] Octavia load balancers lose Amphora VMs after failover¶
If an Amphora VM does not respond or responds too long to heartbeat requests,
the Octavia load balancer automatically initiates a failover process after 60
seconds of unsuccessful attempts. Long responses of an Amphora VM may be caused
by various events, such as a high load on the OpenStack compute node that hosts
the Amphora VM, network issues, system service updates, and so on. After a
failover, the Amphora VMs may be missing in the completed Octavia load
balancer.
Workaround:
If your deployment is already affected, manually restore the work of the load
balancer by recreating the Amphora VM:
To avoid an automatic failover start that may cause the issue, set the
heartbeat_timeout parameter in the OpenStackDeployment CR to a large
value in seconds. The default is 60 seconds. For example:
An update of a MOS cluster may fail with
the ceph csi-driver is not evacuated yet, waiting… error during the
Ceph CSI pod eviction.
Workaround:
Scale the affected StatefulSet of the pod that fails to init down to
0 replicas. If it is a DaemonSet, such as nova-compute, it must
not be scheduled on the affected node.
On every csi-rbdplugin pod, search for stuck csi-vol:
Scale the affected StatefulSet back to the original number of replicas
or until its state is Running. If it is a DaemonSet, run the pod on
the affected node again.
[18879] The RGW pod overrides the global CA bundle with an incorrect mount¶
During deployment of a Ceph cluster, the RADOS Gateway (RGW) pod overrides
the global CA bundle located at /etc/pki/tls/certs with an incorrect
self-signed CA bundle. The issue affects only clusters with public
certificates.
Workaround:
Open the KaasCephCluster CR of a managed cluster for editing:
Substitute <managedClusterProjectName> with a corresponding value.
Note
If the public CA certificates also apply to the OsDpl CR,
edit this resource as well.
Select from the following options:
If you are using the GoDaddy certificates, in the
cephClusterSpec.objectStorage.rgw section, replace the
cacert parameters with your public CA certificate that already
contains both the root CA certificate and intermediate CA certificate:
The following issues have been addressed in the Mirantis OpenStack for
Kubernetes 21.5 release:
[17115][Update] Fixed the issue with the
status.providerStatus.releaseRefs.previous.name field in the Cluster
object for Ceph not changing during the MOS
cluster update. If you have previously applied the workaround as described in
[17115] Cluster update does not change releaseRefs in Cluster object for Ceph, manually add the subresources section back to the
clusterworkloadlock CRD:
kubectl edit crd clusterworkloadlocks.lcm.mirantis.com

# add here the 'subresources' section:
spec:
  versions:
  - name: v1alpha1
    subresources:
      status: {}
[17477][Update] Fixed the issue with StackLight in HA mode placed on
controller nodes being not deployed or cluster update being blocked. Once you
update your MOS cluster from the Cluster release
6.18.0 to 6.19.0, roll back the workaround applied as described in
[17477] StackLight in HA mode is not deployed or cluster update is blocked:
Remove stacklight labels from worker nodes. Wait for the labels to be
removed.
Remove the custom nodeSelector section from the cluster spec.
[16103][OpenStack] Fixed the issue with the Glance client returning the
HTTPInternalServerError error while operating with a volume if Glance
was configured with the Cinder backend TechPreview.
[14678][OpenStack] Fixed the issue with instance being inaccessible through
floating IP upon floating IP quick reuse when using a small floating network.
[16963][OpenStack] Fixed the issue with Ironic failing to provide nodes on
deployments with OpenStack Victoria.
[16241][Tungsten Fabric] Fixed the issue causing failure to update a port, or
security group assigned to the port, through the Horizon web UI.
[17045][StackLight] Fixed the issue causing the fluentd-notifications pod
failing to track the RabbitMQ credentials updates in the Secret object.
[17573][StackLight] Fixed the issue with OpenStack notifications missing in
Elasticsearch and the Kibana notification-* index being empty.
Implemented full support for OpenStack Victoria with OVS or Tungsten Fabric
5.1. However, for new OpenStack Victoria with Tungsten Fabric deployments,
Mirantis recommends that you install Tungsten Fabric 2011, which is shipped as
TechPreview in this release.
Verified the upgrade path from Ussuri with OVS or Tungsten Fabric 5.1 to
Victoria with OVS or Tungsten Fabric 5.1.
OpenStack Ussuri is considered deprecated and will be declared unsupported
in one of the upcoming releases. Therefore, start planning your Ussuri to
Victoria cloud upgrade.
Default policies override for core OpenStack services¶
Implemented the mechanism to define additional policy rules for the core
OpenStack services through the OpenStackDeployment Custom Resource.
Implemented support for Masakari instance evacuation. Now, Masakari host
monitor is deployed by default with Instances High Availability Service for
OpenStack to provide automatic instance evacuation from failed instances.
Implemented the usage of direct Helm 3 communication by the OpenStack operator.
The usage of HelmBundles is dropped and automatic transition from Helm 2 to
Helm 3 is performed during the MOS 21.3 to
MOS 21.4 release update.
Implemented the capability to configure Cinder backend for images through
the OpenStackDeployment Custom Resource. The usage of Cinder backend
for Glance enables the OpenStack clouds relying on third-party appliances
for block storage to have images in one place.
Compact control plane for small Open vSwitch-based clouds¶
TechPreview
Added the capability to collocate the OpenStack control plane with the managed
cluster master nodes through the OpenStackDeployment Custom Resource.
Note
If the StackLight cluster is configured to run in the HA mode
on the same nodes with the control plane services, additional manual
steps are required for an upgrade to MOS 21.4 or for a greenfield
deployment. For details, see known issue 17477.
Added support for large-scale deployments that number up to 200 nodes
out of the box. The use case has been verified for core OpenStack services
with OVS and non-DVR Neutron configuration on a dedicated hardware scale lab.
For a successful deployment, we recommend sticking to the optimal limit of
1500 ports per gateway node. This recommendation was confirmed during testing
and should be taken into account when planning large environments.
Implemented the capability to enable the BGP VPN service to allow for
connection of OpenStack Virtual Private Networks with external VPN
sites through either BGP/MPLS IP VPNs or E-VPN.
Published MOS API Reference to provide cloud operators
with an up-to-date and comprehensive definition of the language they need
to use to communicate with MOS OpenStack and Tungsten
Fabric.
Enhanced StackLight to send all OpenStack notifications to the notification
index. Now, to view the previously called audit notifications, see the
cadfLogger in the Kibana Notifications
dashboard.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
This section describes the MOS known issues with available
workarounds. For the known issues in the related version of
Mirantis Container Cloud, refer to Mirantis Container Cloud: Release Notes.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
Modification of custom vRouter DaemonSets based on the SR-IOV definition in
the OsDpl CR.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
[14678] Instance inaccessible through floating IP upon floating IP quick reuse¶
Fixed in MOS 21.5
When using a small floating network and the floating IP that was previously
allocated to an instance and re-associated with another instance in a short
period of time, the instance may be inaccessible. The Address Resolution
Protocol (ARP) cache timeout on the infrastructure layer is typically set to 5
minutes.
As a workaround, set a shorter ARP cache timeout on the infrastructure
side.
When Glance is configured with the Cinder backend TechPreview, the
Glance client may return the HTTPInternalServerError error while operating
with volume. In this case, repeat the action again until it succeeds.
[17045] fluentd-notifications does not track RabbitMQ credentials updates¶
Fixed in MOS 21.5
The fluentd-notifications pod fails to track the RabbitMQ credentials
updates in the Secret object. In this case, the fluentd-notifications pods
in the StackLight namespace are being restarted too often with the following
error message present in logs:
Authentication with RabbitMQ failed. Please check your connection settings.
Workaround:
Delete the affected fluentd-notifications pod. For example:
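For example, with a placeholder pod name:
kubectl -n stacklight delete pod <fluentd-notifications-pod-name>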
[17573] OpenStack notifications missing in Elasticsearch and Kibana¶
Fixed in MOS 21.5
OpenStack notifications may be missing in Elasticsearch and the Kibana
notification-* index may be empty. In this case, error messages
similar to the following one may be present in the
fluentd-notifications logs:
On the affected managed cluster, obtain the proper user name and password
from the rabbitmq-creds Secret in the openstack-lma-shared namespace
(strip b' prefix and ' suffix):
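A sketch of how the credentials could be read; the data key names (username, password) are assumptions:
kubectl -n openstack-lma-shared get secret rabbitmq-creds -o jsonpath='{.data.username}' | base64 -d
kubectl -n openstack-lma-shared get secret rabbitmq-creds -o jsonpath='{.data.password}' | base64 -d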
[17477] StackLight in HA mode is not deployed or cluster update is blocked¶
Fixed in MOS 21.5
The deployment of new managed clusters using the Cluster release 6.18.0
with StackLight enabled in the HA mode on control plane nodes does not have
StackLight deployed. The update of existing clusters with such StackLight
configuration that were created using the Cluster release 6.16.0 is blocked
with the following error message:
If you faced the issue during a managed cluster deployment, skip
this step.
If you faced the issue during a managed cluster update, wait until all
StackLight component resources are recreated on the target nodes
with updated node selectors.
In the Container Cloud web UI, add a fake StackLight label to any 3 worker
nodes to satisfy the deployment requirement as described in
Mirantis Container Cloud Operations Guide: Create a machine using web UI.
Eventually, StackLight will still be placed on the
target nodes with the forcedRole:stacklight label.
Once done, the StackLight deployment or update proceeds.
[17305] Cluster update fails with the ‘Not ready releases: descheduler’ error¶
Affects only MOS 21.4
An update of a MOS cluster from the Cluster release
6.16.0 to 6.18.0 may fail with the following exemplary error message:
Cluster data status:
  conditions:
  - message: 'Helm charts are not installed(upgraded) yet. Not ready releases: descheduler.'
    ready: false
    type: Helm
The issue may affect the descheduler and metrics-server Helm releases.
As a workaround, run helm uninstall descheduler or
helm uninstall metrics-server and wait for Helm Controller
to recreate the affected release.
[16987] Cluster update fails at Ceph CSI pod eviction¶
Fixed in MOS 22.2
An update of a MOS cluster may fail with the
ceph csi-driver is not evacuated yet, waiting… error during the Ceph
CSI pod eviction.
Workaround:
Scale the affected StatefulSet of the pod that fails to init down to
0 replicas. If it is a DaemonSet, such as nova-compute, it must
not be scheduled on the affected node.
On every csi-rbdplugin pod, search for stuck csi-vol:
Scale the affected StatefulSet back to the original number of replicas
or until its state is Running. If it is a DaemonSet, run the pod on
the affected node again.
[17115] Cluster update does not change releaseRefs in Cluster object for Ceph¶
Fixed in MOS 21.5
During an update of a MOS cluster from the Cluster release 6.16.0 to 6.18.0,
the status.providerStatus.releaseRefs.previous.name field in the Cluster
object does not change.
Workaround:
In the clusterworkloadlock CRD, remove the subresources section:
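The CRD can be opened for editing with the same command that is used elsewhere in these notes to add the section back:
kubectl edit crd clusterworkloadlocks.lcm.mirantis.com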
Create a ceph-cwl.yaml file with Ceph ClusterWorkloadLock:
apiVersion: lcm.mirantis.com/v1alpha1
kind: ClusterWorkloadLock
metadata:
  name: ceph-clusterworkloadlock
spec:
  controllerName: ceph
status:
  state: inactive
  release: <clusterRelease> # from the previous step
Substitute <clusterRelease> with clusterRelease obtained in the
previous step.
[17038] Cluster update may fail with TimeoutError¶
Affects only MOS 21.4
A MOS cluster update from the Cluster release
6.16.0 to 6.18.0 may fail with the Timeout waiting for pods statuses
timeout error. The error means that pod containers are not ready
and often restart with OOMKilled as the restart reason. For example:
The following issues have been addressed in the Mirantis OpenStack for
Kubernetes 21.4 release:
[13273][OpenStack] Fixed the issue with Octavia amphora getting stuck after
the MOS cluster update.
[16849][OpenStack] Fixed the issue causing inability to delete a load
balancer with a number higher than the maximum limit in API.
[16180][Tungsten Fabric] Fixed the issue with inability to schedule
vRouter DPDK on a node with DPDK and 1 GB huge pages enabled. Enhanced
Tungsten Fabric Operator to support 1 GB huge pages for a DPDK-based vRouter.
[16033][Ceph] Fixed the issue with inability to access RADOS Gateway
using S3 authentication. Added rgw_s3_auth_use_keystone=true to the
default RADOS Gateway options.
[16604][StackLight] To avoid issues with defunct processes on the
OpenStack controller nodes, temporarily disabled instances downtime
monitoring and removed the KPI - Downtime Grafana dashboard.
Update for the MOS GA release introducing support
for Hyperconverged
OpenStack compute nodes, SR-IOV and control interface specification for
Tungsten Fabric, and the following Technology Preview features:
Implemented full support for colocation of cluster services on the same host,
for example, Ceph OSD and OpenStack compute.
To avoid nodes overloading, limit the hardware resources consumption by the
OpenStack compute services as described in Deployment Guide: Limit HW
resources for hyperconverged OpenStack compute nodes.
Implemented the capability to encrypt the east-west tenant traffic between the
OpenStack compute nodes and gateways using strongSwan Internet Protocol
Security (IPsec) solution.
Implemented full support for SR-IOV in Tungsten Fabric.
After the OpenStackDeployment CR modification, the TF Operator
now generates a separate vRouter DaemonSet with specified settings.
After the SR-IOV enablement, the tf-vrouter-agent pods
are automatically restarted on the corresponding nodes causing
the network services interruption on virtual machines running on these hosts.
Therefore, plan this procedure accordingly.
Implemented the targetSizeRatio parameter for the replicated
MOS Ceph pools. The targetSizeRatio value specifies
the default ratio for each Ceph pool type to define the expected consumption
of the Ceph cluster capacity.
Added the customIngress parameter to implement the capability to specify
a custom Ingress Controller when configuring the Ceph RGW TLS.
Caution
Starting from MOS 21.3, external Ceph RGW
service is not supported and will be deleted during update. If your
system already uses endpoints of an external RGW service, reconfigure
them to the ingress endpoints.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product
only in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
Modification of custom vRouter DaemonSets based on the SR-IOV definition in
the OsDpl CR.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all Cassandra cluster TFConfig or TFAnalytics nodes, maintenance, or
other circumstances that cause the Cassandra pods to start simultaneously may
cause a broken Cassandra TFConfig and/or TFAnalytics cluster.
In this case, Cassandra nodes do not join the ring and do not update the IPs of
the neighbor nodes. As a result, the TF services cannot operate Cassandra
cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
[15525] HelmBundle Controller gets stuck during cluster update¶
Affects only MOS 21.3
The HelmBundle Controller that handles OpenStack releases gets stuck during
cluster update and does not apply HelmBundle changes. The issue is caused
by an unlimited release history that increases the amount of RAM consumed by
Tiller. The workaround is to manually limit the release history to 3 entries.
[13273] Octavia amphora may get stuck after cluster update¶
Fixed in MOS 21.4
After the MOS cluster update, Octavia amphora
may get stuck with the
WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not
connect to instance. Retrying. error message present in the Octavia worker
logs. The workaround is to manually switch the Octavia amphorae driver from
V2 to V1.
Workaround:
In the OsDpl CR, specify the following configuration:
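A hedged sketch that switches the default Octavia provider driver back to amphorav1 through the service-level overrides; the exact nesting and option placement are assumptions:
spec:
  services:
    load-balancer:
      octavia:
        values:
          conf:
            octavia:
              api_settings:
                default_provider_driver: amphorav1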
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
Copy the last certificate in the chain and save it to a temporary file, for
example, tmp-cacert.crt.
Encode the certificate from tmp-cacert.crt with base64 encoding in one
line:
cat tmp-cacert.crt | base64 -w 0
Create a new cacert key in the rook-ceph/rgw-ssl-certificate secret
and copy the base64-encoded cacert to its value. The following is an
example of the resulting secret data:
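A schematic example only; the other keys shown are placeholders for whatever the secret already contains:
data:
  cert: <existing base64-encoded TLS certificate and key>
  cacert: <base64-encoded cacert from the previous step>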
The following issues have been addressed in the Mirantis OpenStack for
Kubernetes 21.3 release:
[13422][OpenStack] Fixed the issue with some Redis pods remaining in
Pending state and causing failure to update the Cluster release.
[12511][OpenStack] Fixed the issue with Kubernetes nodes getting stuck in the
Prepare state during the MOS cluster update.
[13233][StackLight] Fixed the issue with low memory limits for StackLight
Helm Controller causing update failure.
[12917][StackLight] Fixed the issue with prometheus-tf-vrouter-exporter
pods failing to start on Tungsten Fabric nodes with DPDK. To remove the
nodeSelector definition specified when applying the
workaround:
Remove the nodeSelector.component.tfVrouterExporter definition from
the stacklight Helm release values
(.spec.providerSpec.value.helmReleases) of the Cluster
resource.
Remove the label using the following command. Do not remove the last dash
sign.
kubectl label node <node_name> tfvrouter-fix-
[11961][Tungsten Fabric] Fixed the issue with members failing to join the
RabbitMQ cluster after the tf-control nodes reboot.
Implemented the cache and proxy support for MOS managed
clusters.
By default, during a MOS cluster deployment and
update, the Mirantis artifacts are downloaded through a cache running
on a management or regional cluster. If you have an external application
that requires Internet access, you can now use a proxy with the required
parameters specified for that application.
Implemented the capability to enable Masakari, the OpenStack service that
ensures high availability of instances running on a host. The feature is
disabled by default.
Implemented the capability to specify custom settings for the Tungsten Fabric
vRouter nodes using the customSpecs parameter, such as to change the name
of the tunnel network interface or enable debug level logging.
Improved user experience by moving the rgw.ingress parameters of the
KaasCephCluster CR to a common cephClusterSpec.ingress section. The
rgw section is deprecated. However, if you continue using rgw.ingress,
it will be automatically translated into cephClusterSpec.ingress during
the MOS cluster release update.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
Modification of custom vRouter DaemonSets based on the SR-IOV definition in
the OsDpl CR.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all nodes of the Cassandra TFConfig or TFAnalytics cluster,
maintenance, or other circumstances that cause the Cassandra pods to
start simultaneously may break the Cassandra TFConfig and/or
TFAnalytics cluster. In this case, Cassandra nodes do not join the
ring and do not update the IPs of the neighbor nodes. As a result, the
TF services cannot operate the Cassandra cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or
analytics cluster and the replica number:
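For example, assuming the default Cassandra pod naming in the tf namespace (the pod name pattern is an assumption and may differ in your deployment):
  kubectl -n tf exec tf-cassandra-<config-or-analytics>-dc1-rack1-<replica_num> -c cassandra -- nodetool status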
During the MOS cluster update to Cluster release
6.14.0, Kubernetes nodes may get stuck in the Prepare state. At the same
time, the LCM Controller logs may contain the following errors:
After the MOS cluster update, Octavia amphora may
get stuck with the
WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect
to instance. Retrying. error message present in the Octavia worker logs. The
workaround is to manually switch the Octavia amphorae driver from V2 to V1.
Workaround:
In the OsDpl CR, specify the following configuration:
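As with the identical issue described for the previous release, a minimal illustrative OsDpl override follows; the exact path may differ between releases:
  spec:
    services:
      load-balancer:
        octavia:
          values:
            conf:
              octavia:
                api_settings:
                  default_provider_driver: amphorav1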
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
During the MOS cluster update to Cluster release
6.14.0, StackLight Helm Controller containers (controller and/or
tiller) may get OOMKilled and cause failure to update.
As a workaround, manually increase the default resource requests and limits
for stacklightHelmControllerController and
stacklightHelmControllerTiller in the StackLight Helm chart values of the
Cluster release resource:
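An illustrative sketch of the corresponding StackLight values, using the parameter names mentioned above and placeholder memory values that you should adjust to your environment:
  resources:
    stacklightHelmControllerController:
      requests:
        memory: <new-request>
      limits:
        memory: <new-limit>
    stacklightHelmControllerTiller:
      requests:
        memory: <new-request>
      limits:
        memory: <new-limit>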
StackLight deploys the prometheus-tf-vrouter-exporter exporter based on the
node selector matching the tfvrouter:enabled node label. The Tungsten
Fabric nodes with DPDK have the tfvrouter-dpdk:enabled label set instead.
Therefore, the prometheus-tf-vrouter-exporter exporter fails to start on
these nodes.
Workaround:
Add the tfvrouter-fix:enabled label to every node that contains either
the tfvrouter:enabled or the tfvrouter-dpdk:enabled node label.
kubectl label node <node_name> tfvrouter-fix=enabled
In the Cluster release resource, specify the following nodeSelector
definition in the StackLight Helm chart values:
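For example, a sketch of a nodeSelector definition that targets the newly added label, based on the nodeSelector.component.tfVrouterExporter path mentioned in the fixed-issues list above:
  nodeSelector:
    component:
      tfVrouterExporter:
        tfvrouter-fix: enabled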
The following issues have been addressed in the Mirantis OpenStack for
Kubernetes 21.2 release:
[7725][Tungsten Fabric] Fixed the issue with the Neutron service failing to
create a network through Horizon and the OpenStack CLI throwing the
An unknown exception occurred error.
Update for the MOS GA release introducing support
for the PCI passthrough feature and Tungsten Fabric monitoring,
as well as the following Technology Preview features:
OpenStack Victoria support with OVS and Tungsten Fabric 5.1
SR-IOV for OpenStack
Components collocation (OpenStack compute and Ceph nodes)
Added support for the Peripheral Component Interconnect (PCI) passthrough
feature in OpenStack to use, mainly, as a part of the SR-IOV network traffic
acceleration technique. Now, MOS enables the user to
configure Nova on a per-node basis to allow PCI devices to be passed through
from hosts to virtual machines.
Enhanced StackLight to monitor Tungsten Fabric and its components, including
Cassandra, Kafka, Redis, and ZooKeeper. Implemented the Tungsten Fabric alerts
and Grafana dashboards. The feature is disabled by default. You can enable it
manually during or after the Tungsten Fabric deployment.
Implemented alert inhibition rules to provide a clearer view on the cloud
status and simplify troubleshooting. Using alert inhibition rules, Alertmanager
decreases alert noise by suppressing dependent alerts notifications. The
feature is enabled by default. For details, see
Operations Guide: Alert dependencies.
Implemented integration between Grafana and Kibana by adding a
View logs in Kibana link to most Grafana dashboards, which allows
you to immediately view contextually relevant logs through the Kibana web UI.
Added the capability to configure the Transport Layer Security (TLS) protocol
for a Ceph RGW public endpoint using MOS TLS if enabled,
or using a custom ingress specified in the KaaSCephCluster custom
resource.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
The Neutron service fails to create a network through Horizon, and the
OpenStack CLI throws the An unknown exception occurred error.
The workaround is to restart the tf-config pods:
Obtain the list of the tf-config pods:
kubectl -n tf get pod -l app=tf-config
Delete the tf-config-* pods. For example:
kubectl -n tf delete pod tf-config-2whbb
Verify that the pods have been recreated:
kubectl -n tf get pod -l app=tf-config
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
The following issues have been addressed in the Mirantis OpenStack for
Kubernetes 21.1 release:
[9809] [Kubernetes] Fixed the issue with the pods getting stuck in
the Pending state during update of a MOSK
cluster by increasing the default kubelet_max_pods setting to 150.
[9589][StackLight] Fixed the issue with the Patroni pod crashing when
scheduled to an OpenStack compute node with huge pages.
[8573] [OpenStack] Fixed the issue with the external authentication
to Horizon failing to log in a different user.
The first update to MOS Ussuri release introducing
support for object storage and a Telco deployment profile, which
includes implementation of baseline Enhanced Platform Awareness
(NUMA awareness, huge pages, CPU pinning) capabilities, and a
technical preview of packet processing acceleration (Data Plane
Development Kit-enabled Tungsten Fabric).
Implemented the capability to easily perform the node-specific configuration
through the OpenStack Controller. More specifically, the node-specific
overrides allow you to:
Implemented the capability to customize the look and feel of Horizon through
the OpenStackDeployment custom resource. Cloud operator is now able
to specify the origin of the theme bundle to be applied to OpenStack Horizon
in features:horizon:themes.
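A hypothetical example of such a configuration, assuming each theme entry takes a name, a description, and a URL pointing to the theme bundle archive; verify the exact schema against the OpenStackDeployment reference:
  spec:
    features:
      horizon:
        themes:
          - name: example-theme
            description: Example corporate theme
            url: https://example.com/themes/example-theme.tar.gz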
Implemented the capability to disable HTTP probes for public endpoints from the
OpenStack service catalog. In this case, Telegraf performs HTTP checks only
for the admin and internal OpenStack endpoints. By default, Telegraf verifies
all endpoints from the OpenStack service catalog.
Implemented the capability to verify the status of Tungsten Fabric services,
including the third-party services such as Cassandra, ZooKeeper, Kafka, Redis,
and RabbitMQ using the Tungsten Fabric Operator tf-status tool.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
Due to limitations in the Octavia and MOS integration,
the clusters where Neutron is deployed in the Distributed Virtual Router
(DVR) mode are not stable. Therefore, Mirantis does not recommend such
configuration for production deployments.
[9809] The default max_pods setting does not allow upgrading a cluster¶
Fixed in MOS 21.1
During update of a MOS cluster, the pods may get
stuck in the Pending state with the following example warning:
Warning FailedScheduling <unknown> default-scheduler 0/9 nodes are available: 1 node(s) were unschedulable, 2 Too many pods, 6 node(s) didn't match node selector.
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
[8573] External authentication to Horizon fails to log in a different user¶
Fixed in MOS 21.1
Horizon retains the user's credentials following their initial login using
External Authentication Service, and does not allow logging in with another
user's credentials.
Workaround:
Clear cookies in your browser.
Select External Authentication Service on the Horizon login
page.
Click Sign In. The Keycloak login page opens.
If the following error occurs, refresh the page and try again:
CSRF token missing or incorrect. Cookies may be turned off. Make sure cookies are enabled and try again.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
[10096] tf-control service does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
General availability of the product with OpenStack Ussuri and choice of
Neutron/OVS or Tungsten Fabric 5.1 for networking. Runs on top of a bare
metal Kubernetes cluster managed by Container Cloud.
Mirantis OpenStack for Kubernetes (MOS) represents a frictionless
cloud infrastructure on-premise. MOS Ussuri is integrated
with Container Cloud bare metal with Ceph and StackLight onboard and,
optionally, supports Tungsten Fabric 5.1 as a backend for the OpenStack
networking. In terms of updates, MOS Ussuri fully relies
on the Container Cloud update delivery mechanism.
Mirantis has tested MOS against a very specific
configuration and can guarantee a predictable behavior of the product only
in the exact same environments. The table below includes the major
MOS components with the exact versions against which
testing has been performed.
Due to limitations in the Octavia and MOS integration,
the clusters where Neutron is deployed in the Distributed Virtual Router
(DVR) mode are not stable. Therefore, Mirantis does not recommend such
configuration for production deployments.
[6912] Octavia load balancers may not work properly with DVR¶
Limitation
When Neutron is deployed in the DVR mode, Octavia load balancers may not work
correctly. The symptoms include both failure to properly balance traffic and
failure to perform an amphora failover. For details, see DVR incompatibility with ARP announcements and VRRP.
[8573] External authentication to Horizon fails to log in a different user¶
Target fix version: next MOS update
Horizon retains the user's credentials following their initial login using
External Authentication Service, and does not allow logging in with another
user's credentials.
Workaround:
Clear cookies in your browser.
Select External Authentication Service on the Horizon login
page.
Click Sign In. The Keycloak login page opens.
If the following error occurs, refresh the page and try again:
CSRF token missing or incorrect. Cookies may be turned off. Make sure cookies are enabled and try again.
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS
(Designate) as Neutron with Tungsten Fabric as a backend
is not integrated with DNSaaS. As a workaround, you can use
the Tungsten Fabric built-in DNS service that enables virtual
machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates
stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
The HAProxy service, which is used as a backend for load balancers in
Tungsten Fabric, uses non-existent socket files from the log collection
service. This configuration error causes error messages to be logged in
contrail-lbaas-haproxy-stdout.log on attempts to use the loggers.
The issue does not affect the service operability.
[10096] tf-control service does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup
and does not update them if Cassandra pods got new IP addresses, for example,
in case of a restart. As a workaround, to refresh the IP addresses of
Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control
pod that will be restarted.
Considering continuous reorganization and enhancement of Mirantis OpenStack
for Kubernetes (MOSK), certain components are deprecated
and eventually removed from the product. This section provides details about
the deprecated and removed functionality that may potentially impact existing
MOSK deployments.
Configuring CPU isolation through the isolcpus configuration
parameter for Linux kernel is considered deprecated.
MOSK 21.5 introduces the capability to configure CPU
isolation using the cpusets mechanism in Linux kernel. For details,
see CPU isolation using cpusets.
Instead of v1alpha1, MOSK introduces support for the
API v2 for the Tungsten Fabric Operator. The new version of the Tungsten
Fabric Operator API aligns with the OpenStack Controller API and
provides a better interface for advanced configurations.
In MOSK 24.1, the API v2 is available only for the
new product deployments with Tungsten Fabric.
Since MOSK 24.2, the API v2 becomes the default for new product
deployments and includes the ability to convert an existing v1alpha1
TFOperator to v2 during update. For details, refer to
Convert v1alpha1 TFOperator custom resource to v2.
During the update to the 24.3 series, the old Tungsten Fabric cluster
configuration API v1alpha1 is automatically converted and replaced
with the v2 version.
Tungsten Fabric analytics services, primarily designed for collecting
various metrics from the Tungsten Fabric services, are being deprecated.
Despite the initial implementation, user demand for this feature has
been minimal. As a result, Tungsten Fabric analytics services will
become unsupported in the product.
All greenfield deployments starting from MOSK 24.1
do not include Tungsten Fabric analytics services. The existing
deployments updated to 24.1 and newer versions will include Tungsten
Fabric analytics services as well as the ability to disable them as
described in Disable Tungsten Fabric analytics services.
Deprecated the SubnetPool resource along with automated subnet creation
using SubnetPool.
Existing configurations that use SubnetPool objects in L2Template
will be automatically migrated to Subnet objects during cluster update
to MOSK 24.2. As a result of migration, existing
Subnet objects will be referenced in L2Template objects
instead of SubnetPool.
Since MOSK 25.1 and Container Cloud 2.29.0, Admission
Controller blocks creation of new SubnetPool objects.
If you still require this feature, contact Mirantis support for further
information.
Disabled creation of the default L2 template for a namespace.
On existing clusters, clusterRef:default is removed during the
migration process. Subsequently, this parameter is not substituted with
the cluster.sigs.k8s.io/cluster-name label, ensuring the application
of the L2 template across the entire Kubernetes namespace. Therefore,
you can continue using existing default L2 templates for
namespaces.
Deprecated the clusterRef parameter located in the L2Template spec. Use the cluster.sigs.k8s.io/cluster-name label instead.
On existing clusters, this parameter is automatically migrated to the
cluster.sigs.k8s.io/cluster-name label since MOSK 23.3
and Container Cloud 2.25.0.
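For example, a minimal sketch of an L2Template that uses the label instead of the deprecated clusterRef parameter; the object and cluster names are placeholders:
  metadata:
    name: example-l2template
    labels:
      cluster.sigs.k8s.io/cluster-name: <cluster-name>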
Deprecated support for the Focal Fossa Ubuntu distribution in favor of
Jammy Jellyfish.
Warning
During the course of the MOSK 24.3 and Container Cloud 2.28.x
series, Mirantis highly recommends upgrading the operating system on your
cluster machines to Ubuntu 22.04 before the following major release becomes
available.
It is not mandatory to upgrade all machines at once. You can upgrade them
one by one or in small batches, for example, if the maintenance window is
limited in time.
The Cluster release update of the Ubuntu 20.04-based MOSK clusters will
become impossible as of Container Cloud 2.29.0, where Ubuntu 22.04 is the
only supported version.
Management cluster update to Container Cloud 2.29.1 will be blocked if
at least one node of any related MOSK cluster is running Ubuntu 20.04.
Deprecated the byName field in the BareMetalHostProfile object.
As a replacement, use a more specific selector, such as byPath,
serialNumber, or wwn. For details, see
Container Cloud API Reference: BareMetalHostProfile.
minSizeGiB and maxSizeGiB in BareMetalHostProfile¶
Deprecated: MOSK 24.1 and Container Cloud 2.26.0
Unsupported: To be decided
Details:
Deprecated the minSizeGiB and maxSizeGiB fields in the
BareMetalHostProfile object.
Instead of floats that define sizes in GiB for *GiB fields, use
the <sizeNumber>Gi text notation such as Ki, Mi, and
so on.
All newly created profiles are automatically migrated to the Gi
syntax. In existing profiles, migrate the syntax manually.
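An illustrative before-and-after sketch, assuming the replacement fields drop the GiB suffix (for example, minSize); verify the exact field names against the BareMetalHostProfile reference:
  # Deprecated float notation
  minSizeGiB: 30
  maxSizeGiB: 120
  # Text notation
  minSize: 30Gi
  maxSize: 120Gi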
Deprecated the wipe field from the spec:devices section of the
BareMetalHostProfile object for the sake of wipeDevice.
For backward compatibility, any existing wipe:true option
is automatically converted to the following structure:
wipeDevice:
  eraseMetadata:
    enabled: True
For new machines, use the wipeDevice structure in the
BareMetalHostProfile object.
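A minimal sketch of a device entry that uses the wipeDevice structure together with the byPath selector mentioned above; the device path is a placeholder and the surrounding structure may vary between releases:
  spec:
    devices:
      - device:
          byPath: /dev/disk/by-path/<device-path>
          wipeDevice:
            eraseMetadata:
              enabled: true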
L2Template without the l3Layout parameters section¶
Deprecated: MOS 21.3 and Container Cloud 2.9.0
Unsupported: MOSK 23.2 and Container Cloud 2.24.0
Details:
Deprecated the use of the L2Template object without the l3Layout
section in spec. The use of the l3Layout section is mandatory
since Container Cloud 2.24.0 and MOSK 23.2.
On existing clusters, the l3Layout section is not added automatically.
Therefore, if you do not have the l3Layout section in L2 templates
of your existing clusters, manually add it and define all subnets
that are used in the npTemplate section of the L2 template.
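A minimal sketch of the l3Layout section, assuming each entry references a subnet by name and defines its scope; the subnet names are placeholders:
  spec:
    l3Layout:
      - subnetName: <lcm-subnet-name>
        scope: namespace
      - subnetName: <storage-subnet-name>
        scope: namespace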
Deprecated the dnsmasq.dhcp_range parameter of the
baremetal-operator Helm chart values in the Cluster spec.
Use the Subnet object configuration for this purpose instead.
Since Container Cloud 2.24.0, admission-controller does not accept
any changes to dnsmasq.dhcp_range except removal. Therefore, manually
remove this parameter from the baremetal-operator release spec section
of the Cluster object as described in Configure multiple DHCP address ranges.
Deprecated the configInline parameter in the metallb Helm chart
values of the Cluster spec. Use the MetalLBConfig,
MetalLBConfigTemplate, and Subnet objects instead of this parameter.
The L2Template and IPaddr parameter status fields¶
Deprecated: MOSK 23.1 and Container Cloud 2.23.0
Unsupported: MOSK 23.3 and Container Cloud 2.25.0
Details:
Deprecated the following status fields for the L2Template and IPaddr objects:
Renamed the following fields of the IpamHost status:
netconfigV2 to netconfigCandidate
netconfigV2state to netconfigCandidateState
netconfigFilesState to netconfigFilesStates (per file)
The format of netconfigFilesState is changed after renaming. The
netconfigFilesStates field contains a dictionary of statuses of network
configuration files stored in netconfigFiles. The dictionary keys are
file paths, and the values have the same meaning per file as the former
netconfigFilesState value:
For a successfully rendered configuration file:
OK: <timestamp> <sha256-hash-of-rendered-file>, where the timestamp
is in the RFC 3339 format.
For a failed rendering: ERR: <error-message>.
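An illustrative fragment of the IpamHost status with the new field; the file path, timestamp, and hash are placeholders:
  netconfigFilesStates:
    <network-config-file-path>: 'OK: <timestamp> <sha256-hash-of-rendered-file>'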
The status.l2RenderResult field of the IpamHost object¶
Deprecated: MOSK 22.4 and Container Cloud 2.19.0
Unsupported: To be decided
Details:
Deprecated the status.l2RenderResult field of the IpamHost
object for the sake of status.netconfigCandidateState.
The status.nicMACmap field of the IpamHost object¶
Deprecated: MOSK 21.6 and Container Cloud 2.14.0
Unsupported: MOSK 22.1 and Container Cloud 2.15.0
Details:
Removed nicMACmap from the IpamHost status. Instead, use
the serviceMap field that contains the actual information about
services, IP addresses, and interfaces.
The ipam/DefaultSubnet label of the Subnet object¶
Deprecated: MOSK 21.6 and Container Cloud 2.14.0
Unsupported: To be decided
Details:
Deprecated the ipam/DefaultSubnet label of the metadata field of
the Subnet object.
Deprecated the following IPAM API resources: created, lastUpdated,
and versionIpam.
These resources will be eventually replaced with objCreated, objUpdated,
and objStatusUpdated.
Suspended: Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0)
Unsupported: Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0)
Details:
Suspended support for regional clusters and several regions on a single
management cluster. Simultaneously, ceased performing functional
integration testing of the feature and removed the related code in
Container Cloud 2.26.0. If you still require this feature,
contact Mirantis support for further information.
Deprecated: Container Cloud 2.15.0 (Cluster releases 7.5.0 and 5.22.0)
Unsupported: Container Cloud 2.18.0 (Cluster releases 11.2.0 and 7.8.0)
Details:
Deprecated the iam-api service and IAM CLI (the iamctl command).
The logic of the iam-api service required for Container Cloud
is moved to scope-controller.
Container Cloud 2.9.0 (Cluster releases 6.16.0 and 5.16.0)
Details
Replaced all existing SSH user names, such as ubuntu, with the
universal mcc-user user name. Since Container Cloud 2.9.0,
SSH keys are managed only for mcc-user.
Deprecated: Container Cloud 2.13.0 (Cluster releases 7.3.0 and 5.20.0)
Unsupported: Container Cloud 2.14.0 (Cluster releases 7.4.0 and 5.21.0)
Details:
Removed the DISABLE_OIDC flag required to be set for custom TLS
Keycloak and web UI certificates during a management cluster deployment.
Do not set this parameter anymore in bootstrap.env. To use your
own TLS certificates for Keycloak, refer to Configure TLS certificates for cluster applications.
Deprecated the performance metric exporter that is integrated into the
Ceph Manager daemon for the sake of the dedicated Ceph Exporter daemon.
Names of metrics will not be changed, no metrics will be removed.
All Ceph metrics to be collected by the Ceph Exporter daemon will change
their labels job and instance due to scraping metrics from
new Ceph Exporter daemon instead of the performance metric exporter of
Ceph Manager:
Values of the job labels will be changed from rook-ceph-mgr to
prometheus-rook-exporter for all Ceph metrics moved to Ceph
Exporter. The full list of moved metrics is presented below.
Values of the instance labels will be changed from the metric endpoint
of Ceph Manager with port 9283 to the metric endpoint of Ceph Exporter
with port 9926 for all Ceph metrics moved to Ceph Exporter. The full
list of moved metrics is presented below.
Values of the instance_id labels of Ceph metrics from the RADOS
Gateway (RGW) daemons will be changed from the daemon GID to the daemon
subname. For example, instead of instance_id="<RGW_PROCESS_GID>",
the instance_id="a" (ceph_rgw_qlen{instance_id="a"}) will be
used. The list of moved Ceph RGW metrics is presented below.
Therefore, if Ceph metrics to be collected by the Ceph Exporter daemon
are used in any customizations, for example, custom alerts, Grafana
dashboards, or queries in custom tools, update your customizations
to use new labels since Container Cloud 2.28.0 (Cluster releases 16.3.0
and 17.3.0).
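For example, a custom alert rule expression that uses one of the affected metrics would need an update similar to the following sketch; the threshold is arbitrary and shown for illustration only:
  # Before Container Cloud 2.28.0
  expr: ceph_rgw_qlen{job="rook-ceph-mgr"} > 10
  # Since Container Cloud 2.28.0
  expr: ceph_rgw_qlen{job="prometheus-rook-exporter"} > 10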
List of affected Ceph RGW metrics
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
List of all metrics to be collected by Ceph Exporter instead of
Ceph Manager
Removed cephDeviceMapping from the status.fullClusterInfo.cephDetails
section of the KaaSCephCluster object because its large size can
potentially exceed the Kubernetes 1.5 MB quota.
Deprecated: Container Cloud 2.19.0 (Cluster releases 11.3.0 and 7.9.0)
Unsupported: Container Cloud 2.20.0 (Cluster releases 11.4.0 and 7.10.0)
Details:
Removed Ceph cluster deployment from the management and regional
clusters to reduce resource consumption. Ceph is automatically removed
during the Cluster release update to 11.4.0 or 7.10.0.
Deprecated the Alertmanager API v1 in favor of v2. The current Alertmanager
version supports both API versions. However, in one of the upcoming
MOSK and Container Cloud releases, Alertmanager will be
upgraded to the version that supports only the API v2. Therefore, if you use
API v1, update your integrations and configurations to use the API v2 to
ensure compatibility with the upgraded Alertmanager.
Following the upstream deprecation in Prometheus, deprecated the
prometheus-rabbitmq-exporter job in favor of the
rabbitmq-prometheus-plugin one, which is based on the native
RabbitMQ Prometheus plugin ensuring reliable and direct metric collection.
As a result, deprecated and renamed the RabbitMQ Grafana dashboard
to the RabbitMQ [Deprecated] one. As a replacement, use the
RabbitMQ Overview and RabbitMQ Erlang Grafana dashboards.
Warning
If you use deprecated RabbitMQ metrics in customizations such as
alerts and dashboards, switch to the new metrics and dashboards within the
course of the MOSK 25.1 series to prevent issues once the
deprecated metrics and dashboard are removed.
Following the upstream deprecation in Grafana, deprecated the Angular-based
plugins in favor of the React-based ones. In Container Cloud 2.29.0 and
MOSK 25.1, where Grafana is updated from version 10 to 11,
the following Angular plugins are automatically migrated to the React ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
All Grafana dashboards provided by StackLight are also migrated to React
automatically. For the list of default dashboards, see
View Grafana dashboards.
Warning
This migration may corrupt custom Grafana dashboards that have
Angular-based panels. Therefore, if you have such dashboards, back them up
and manually upgrade Angular-based panels during the course of
Container Cloud 2.28.x and MOSK 24.3.x to prevent
custom appearance issues after plugin migration.
The StackLight telegraf-openstack plugin is going to be replaced
by osdpl-exporter. As a result, all valuable Telegraf metrics that
are used by StackLight components will be reimplemented in
osdpl-exporter and all dependent StackLight alerts and dashboards
will start using new metrics.
Therefore, if you use any telegraf-openstack metrics in any cluster
customizations, consider reimplementing them with new metrics.
To obtain the list of metrics that are removed and replaced with new
ones, contact Mirantis support.
Deprecated logging.syslog in favor of logging.externalOutputs
that contains a wider range of configuration options.
Services and parameters related to OpenSearch and Kibana¶
Deprecated: MOSK 22.3 and Container Cloud 2.18.0
Removed: To be decided
Details:
Deprecated elasticsearch-master in favor of opensearch-master.
In future releases, the following parameters of the
stacklight.values section will be deprecated and eventually replaced
as follows (see the example after this list):
elasticsearch in favor of logging
elasticsearch.retentionTime in favor of logging.retentionTime
resourcesPerClusterSize.elasticsearch in favor of
resourcesPerClusterSize.opensearch
resourcesPerClusterSize.fluentdElasticsearch in favor of
resourcesPerClusterSize.fluentdLogs
resources.fluentdElasticsearch in favor of
resources.fluentdLogs
resources.elasticsearch in favor of resources.opensearch
resources.iamProxyKibana in favor of
resources.iamProxyOpenSearchDashboards
resources.kibana in favor of resources.opensearchDashboards
nodeSelector.component.elasticsearch in favor of
nodeSelector.component.opensearch
nodeSelector.component.fluentdElasticsearch in favor of
nodeSelector.component.fluentdLogs
nodeSelector.component.kibana in favor of
nodeSelector.component.opensearchDashboards
tolerations.component.elasticsearch in favor of
tolerations.component.opensearch
tolerations.component.fluentdElasticsearch in favor of
tolerations.component.fluentdLogs
tolerations.component.kibana in favor of
tolerations.component.opensearchDashboards
stacklightLogLevels.component.fluentdElasticsearch in favor of
stacklightLogLevels.component.fluentdLogs
stacklightLogLevels.component.elasticsearch in favor of
stacklightLogLevels.component.opensearch
stacklightLogLevels.component.kibana in favor of
stacklightLogLevels.component.opensearchDashboards
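For example, the retention configuration would move as follows; the retention value is a placeholder:
  # Deprecated
  elasticsearch:
    retentionTime: <retention>
  # Replacement
  logging:
    retentionTime: <retention>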
Replaced Elasticsearch with OpenSearch, and Kibana with OpenSearch Dashboards
due to licensing changes for Elasticsearch. OpenSearch is a fork of Elasticsearch
under the open-source Apache License with development led by Amazon Web Services.
For new deployments with the logging stack enabled, OpenSearch is now deployed
by default. For existing deployments, migration to OpenSearch is performed
automatically during the cluster update. However, the entire Elasticsearch
cluster may go down for up to 15 minutes.
Retention Time parameter in the Container Cloud web UI¶
Deprecated: MOSK 22.2 and Container Cloud 2.16.0
Removed: MOSK 22.3 and Container Cloud 2.17.0
Details:
Replaced the Retention Time parameter with the
Logstash Retention Time, Events Retention Time,
and Notifications Retention Time parameters.
logstashRetentionTime parameter for Elasticsearch¶
Deprecated: MOSK 22.2 and Container Cloud 2.16.0
Removed: MOSK 24.1 and Container Cloud 2.26.0
Details:
Deprecated the elasticsearch.logstashRetentionTime parameter in
favor of the elasticsearch.retentionTime.logstash,
elasticsearch.retentionTime.events, and
elasticsearch.retentionTime.notifications parameters.
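For example, using placeholder retention values:
  # Deprecated
  elasticsearch:
    logstashRetentionTime: <days>
  # Replacement
  elasticsearch:
    retentionTime:
      logstash: <days>
      events: <days>
      notifications: <days>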
Mirantis aims to release Mirantis OpenStack for Kubernetes (MOSK)
software regularly and often.
MOSK software includes OpenStack, Tungsten Fabric,
life-cycle management tooling, other supporting software, and dependencies.
Mirantis's goal is to ensure that such updates are easy to install in
a zero-touch, zero-downtime fashion.
MOSK release cadence consists of major, for example,
MOSK 24.1, and patch, for example, MOSK
24.1.1 or 24.1.2, releases. The major release together with the patch
releases based on it is called a release series, for example, the
MOSK 24.1 series.
Both major and patch release versions incorporate solutions for security
vulnerabilities and known product issues. The primary distinction between
these two release types lies in the fact that major release versions
introduce new functionalities, whereas patch release versions predominantly
offer minor product enhancements.
Patch releases strive to considerably reduce the timeframe for delivering
CVE resolutions in images to your deployments, aiding in the mitigation
of cyber threats and data breaches.
Content
Major release
Patch release
Version update and upgrade of the major product components including
but not limited to OpenStack, Tungsten Fabric, Kubernetes, Ceph, and
StackLight
Container runtime changes including Mirantis Container Runtime and
containerd updates
Changes in public API
Changes in the Container Cloud and MOSK lifecycle management including
but not limited to machines, clusters, Ceph OSDs
Host machine changes including host operating system and kernel updates
Patch version bumps of MKE and Kubernetes
Fixes for Common Vulnerabilities and Exposures (CVE) in images
StackLight subcomponents may be updated during patch releases
Most patch release versions involve minor changes that only require restarting
containers on the cluster during updates. However, the product can also deliver
CVE fixes on Ubuntu, which includes updating the minor version of the Ubuntu
kernel. This kernel update is not mandatory, but if you prioritize getting
the latest CVE fixes for Ubuntu, you can manually reboot machines during
a convenient maintenance window to update the kernel.
Each subsequent major release includes patch release updates of the previous
major release.
You may decide to update between only major releases without updating to patch
releases. In this case, you will perform updates from an N to N+1 major
release. However, Mirantis recommends applying security fixes using patch
releases as soon as they become available.
Starting from MOSK 24.1.5, Mirantis introduces a new
update scheme allowing for update path flexibility.
Previously, the user could not update to an intermediate patch version in
the series if a newer patch version had been released.
With the new scheme, the user can update to any patch version in the series
even if a newer patch version has already been released.
If the cluster starts receiving patch releases, the user must apply the
latest patch version in the series to be able to update to the following
major release.
The user can always update to the newer major version from the latest
patch version of the previous series. Additionally, a major update is
possible during the course of the patch series from the patch version
released immediately before the target major version. Refer to
Update path for 24.1, 24.2, 24.3, and 25.1 series for an illustration
of the possible update paths.
Mirantis provides Long Term Support (LTS) for specific versions of OpenStack.
LTS includes scheduled updates with new functionality as well as bug and
security fixes. Mirantis intends to introduce support for a new OpenStack
version once a year.
The LTS duration of an OpenStack version is two years.
The diagram below illustrates the current LTS support cycle for OpenStack.
The upstream versions not mentioned in the diagram are not supported in the
product, nor are the upgrade paths from or to such versions.
Important
MOSK supports the OpenStack Victoria version until
September 2023. MOSK 23.2 is the last release version
where OpenStack Victoria packages are updated.
If you have not already upgraded your OpenStack version to Yoga, Mirantis
highly recommends doing this during the course of the MOSK
23.2 series.
[Diagram: OpenStack LTS support cycle]
Versions of Tungsten Fabric, underlying Kubernetes, Ceph, StackLight, and other
supporting software and dependencies may change at Mirantis's discretion. Follow
Release Compatibility Matrix and product Release Notes for any changes in
product component versions.
A Technology Preview feature provides early access to upcoming product
innovations, allowing customers to experiment with the functionality and
provide feedback.
Technology Preview features may be privately or publicly available but
neither are intended for production use. While Mirantis will provide
assistance with such features through official channels, normal Service
Level Agreements do not apply.
As Mirantis considers making future iterations of Technology Preview features
generally available, we will do our best to resolve any issues that customers
experience when using these features.
During the development of a Technology Preview feature, additional components
may become available to the public for evaluation. Mirantis cannot guarantee
the stability of such features. As a result, if you are using Technology
Preview features, you may not be able to seamlessly update to subsequent
product releases, as well as upgrade or migrate to the functionality that
has not been announced as full support yet.
Mirantis makes no guarantees that Technology Preview features will graduate
to generally available features.
The Release Compatibility Matrix describes the cloud configurations that have
been supported by the product over the course of its lifetime and the path a
MOSK cloud can take to move from an older configuration
to a newer one.
For each MOSK release, the document outlines the versions
of the product major components, the valid combinations of these versions,
and the way every component must be updated or upgraded.
For a more comprehensive list of the product subcomponents and their
respective versions included in each MOSK release, refer to
Release Notes, or use the Releases section in the Container
Cloud UI or API.
The following table outlines the compatibility matrix of the most recent
MOSK releases and their major components in conjunction
with Container Cloud and Cluster releases.
The product support status reflects the freshness of
a MOSK cluster and should be considered when planning
the cluster update path:
Supported
Latest supported product release version to use for a greenfield
cluster deployment and to update to.
Deprecated
Product release version that you should update to the latest supported
product release. You cannot update between two deprecated release
versions.
The deprecated product release version becomes unsupported when newer
product versions are released. Therefore, when planning the update path
for the cluster, consider the dates of the upcoming product releases.
Greenfield deployments based on a deprecated product release are not
supported. Use the latest supported release version for initial
deployments instead.
Unsupported
Product release that blocks automatic upgrade of a management cluster
and must be updated immediately to resume receiving newest product
features and enhancements.
Mirantis Container Cloud will update itself automatically as long as
the release of each managed cluster has either supported or
deprecated status in the new version of Container Cloud.
A deprecated cluster release becomes unsupported in one of
the following Container Cloud releases. Therefore, we strongly
recommend that you update your deprecated MOSK
clusters to the latest supported version as described in
Cluster components update paths.
The kernel version of the host operating system validated by Mirantis
and confirmed to be working for the supported use cases. If you use
custom kernel versions or third-party vendor-provided kernels, such
as FIPS-enabled ones, you assume full responsibility for validating the
compatibility of components in such environments.
Management cluster update is performed automatically as long as
the release of each managed cluster has either supported or
deprecated status in the new version of Container Cloud.
If any of the clusters managed by Container Cloud is about
to obtain the unsupported status as a result of an update, Container
Cloud updates are blocked until that cluster is updated to a later release.
Major cluster update is initiated by a cloud operator through the Container
Cloud UI. The update procedure is automated and covers all the life cycle
management modules of the cluster that include OpenStack, Tungsten Fabric,
Ceph, and StackLight. See Cluster update for details.
Version-specific considerations
Before the MOSK 24.1 series, if between the major
releases you apply at least one patch release belonging to the N series,
you must apply the last patch release in that series to be able
to update to the N+1 major release version.
Patch cluster update is initiated by a cloud operator through the Container
Cloud UI. The update procedure is automated and covers all the life cycle
management modules of the cluster that include OpenStack, Tungsten Fabric,
Ceph, and StackLight. See Update to a patch version for details.