Enhancements

This section outlines new features implemented in the Cluster release 14.0.0, which is introduced in the Container Cloud release 2.24.0.

Support for MKE 3.6.5 and MCR 20.10.17

Introduced support for Mirantis Container Runtime (MCR) 20.10.17 and Mirantis Kubernetes Engine (MKE) 3.6.5 that supports Kubernetes 1.24 for the Container Cloud management, regional, and managed clusters. On existing clusters, MKE and MCR are updated to the latest supported version when you update your managed cluster to the Cluster release 14.0.0.

Caution

Support for MKE 3.5.x is dropped. Therefore, new deployments on MKE 3.5.x are not supported.

Note

For MOSK-based deployments, the feature support is available since MOSK 23.2.

Automatic upgrade of Ceph from Pacific to Quincy

Upgraded Ceph major version from Pacific 16.2.11 to Quincy 17.2.6 with an automatic upgrade of Ceph components on existing managed clusters during the Cluster version update.

Note

For MOSK-based deployments, the feature support is available since MOSK 23.2.

Ceph non-admin client for a shared Ceph cluster

Implemented a Ceph non-admin client to share the producer cluster resources with the consumer cluster in the shared Ceph cluster configuration. Using the non-admin client instead of the admin client avoids the risk of destructive actions from the consumer cluster.

Caution

For MKE clusters that are part of MOSK infrastructure, the feature is not supported yet.

Dropping of redundant Ceph components from management and regional clusters

To reduce resource consumption, completed the removal of Ceph from Container Cloud management clusters by removing the following Ceph components that were kept on clusters for backward compatibility:

  • Helm chart of the Ceph Controller (ceph-operator)

  • Ceph deployments

  • Ceph namespaces ceph-lcm-mirantis and rook-ceph

Monitoring of network connectivity between Ceph nodes

Introduced healthcheck metrics and the following Ceph alerts to monitor network connectivity between Ceph nodes:

  • CephDaemonSlowOps

  • CephMonClockSkew

  • CephOSDFlapping

  • CephOSDSlowClusterNetwork

  • CephOSDSlowPublicNetwork

Note

For MOSK-based deployments, the feature support is available since MOSK 23.2.

Improvements to StackLight alerting

Implemented the following improvements to StackLight alerting:

  • Changed severity for multiple alerts to increase visibility of potentially workload-impacting alerts and decrease noise of non-workload-impacting alerts

  • Renamed MCCLicenseExpirationCritical to MCCLicenseExpirationHigh and MCCLicenseExpirationMajor to MCCLicenseExpirationMedium

  • For Ironic:

    • Removed IronicBmMetricsMissing in favor of IronicBmApiOutage

    • Removed inhibition rules for IronicBmTargetDown and IronicBmApiOutage

    • Improved expression for IronicBmApiOutage

  • For Kubernetes applications:

    • Reworked troubleshooting steps for KubeStatefulSetUpdateNotRolledOut, KubeDeploymentOutage, and KubeDeploymentReplicasMismatch

    • Updated descriptions for KubeStatefulSetOutage and KubeDeploymentOutage

    • Changed expressions for KubeDeploymentOutage, KubeDeploymentReplicasMismatch, CephOSDDiskNotResponding, and CephOSDDown
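
If your custom notification routes, silences, or inhibition rules reference alerts by name, the renamed and removed alerts listed above may need attention after the update. The following Python snippet is a minimal sketch of such an audit, not a Container Cloud tool. The function name find_stale_alert_names and the way the alert names are collected are assumptions made for illustration; only the alert names in the mapping come from the list above.

```python
# Alert names changed in this release, taken from the list above.
# Maps each old name to the name to use instead.
CHANGED_ALERTS = {
    "MCCLicenseExpirationCritical": "MCCLicenseExpirationHigh",    # renamed
    "MCCLicenseExpirationMajor": "MCCLicenseExpirationMedium",     # renamed
    "IronicBmMetricsMissing": "IronicBmApiOutage",                 # removed in favor of
}


def find_stale_alert_names(referenced_names):
    """Return {old_name: replacement} for every outdated alert name.

    referenced_names is any iterable of alert names gathered from your own
    custom configuration (hypothetical input, gathered however you prefer).
    """
    return {
        name: CHANGED_ALERTS[name]
        for name in referenced_names
        if name in CHANGED_ALERTS
    }


if __name__ == "__main__":
    names_in_custom_config = [
        "MCCLicenseExpirationCritical",
        "KubeDeploymentOutage",
        "IronicBmMetricsMissing",
    ]
    for old, new in find_stale_alert_names(names_in_custom_config).items():
        print(f"{old}: use {new} instead")
```

Alerts whose severity, expression, or description changed keep their names, so the sketch does not flag them. Review those manually if your routing depends on severity.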

Major version update of OpenSearch and OpenSearch Dashboards

Updated OpenSearch and OpenSearch Dashboards from major version 1.3.7 to 2.7.0. The latest version includes a number of enhancements along with bug and security fixes.

Note

For MOSK-based deployments, the feature support is available since MOSK 23.2.

Caution

The version update process can take up to 20 minutes, during which both OpenSearch and OpenSearch Dashboards may become temporarily unavailable. Additionally, the KubeStatefulSetUpdateNotRolledOut alert for the opensearch-master StatefulSet may fire for a short period of time.

Note

OpenSearch major version 1.x reaches end of life on December 31, 2023.

Performance tuning of Grafana dashboards

Refactored and optimized multiple Grafana dashboards for faster loading and a better user experience.

This enhancement includes extraction of the OpenSearch Indices dashboard out of the OpenSearch dashboard to provide detailed information about the state of indices, including their size and the size of document values and segments.

Dropped and white-listed metrics

To improve Prometheus performance and provide better resource utilization with faster query response, dropped metrics that are unused by StackLight. Also created the default white list of metrics that you can expand.

The feature is enabled by default through the prometheusServer.metricsFiltering.enabled: true parameter. As a result, if you have created custom alerts, recording rules, or dashboards, or if you actively use some metrics for other purposes, some of those metrics may be dropped. Therefore, verify the white list of Prometheus scrape jobs to ensure that the required metrics are not dropped.

If the name of the job that relates to the required metric is not present in this list, the target metrics of that job are not dropped and Prometheus collects them by default. If the job is present in the list but the required metric is not, you can whitelist the metric using the prometheusServer.metricsFiltering.extraMetricsInclude parameter.
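
The decision rule described above can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the StackLight implementation. The function name is_collected, the job and metric names, and the assumption that extraMetricsInclude is keyed by scrape job name are all hypothetical.

```python
def is_collected(job, metric, white_list, extra_include=None):
    """Model whether Prometheus keeps a metric when filtering is enabled.

    white_list:    {job_name: [metric, ...]}, the default white list.
    extra_include: {job_name: [metric, ...]}, a stand-in for
                   prometheusServer.metricsFiltering.extraMetricsInclude
                   (per-job layout is an assumption for this sketch).
    """
    # A job that is absent from the white list is not filtered at all.
    if job not in white_list:
        return True
    # Otherwise only white-listed or explicitly included metrics survive.
    allowed = set(white_list[job]) | set((extra_include or {}).get(job, ()))
    return metric in allowed


if __name__ == "__main__":
    # Hypothetical job and metric names, for illustration only.
    default_list = {"example_job": ["example_metric_a"]}
    extra = {"example_job": ["example_metric_b"]}

    print(is_collected("example_job", "example_metric_a", default_list))         # white-listed
    print(is_collected("example_job", "example_metric_b", default_list))         # dropped
    print(is_collected("example_job", "example_metric_b", default_list, extra))  # re-included
    print(is_collected("other_job", "any_metric", default_list))                 # job not filtered
```

Running such a check against the metrics used by your custom alerts, recording rules, and dashboards shows which of them would need an extraMetricsInclude entry.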