Enhancements

This section outlines new features implemented in the Cluster release 17.0.0 that is introduced in the Container Cloud release 2.25.0.

Support for MKE 3.7.1 and MCR 23.0.7

Introduced support for Mirantis Container Runtime (MCR) 23.0.7 and Mirantis Kubernetes Engine (MKE) 3.7.1 that supports Kubernetes 1.27 for the Container Cloud management and managed clusters. On existing clusters, MKE and MCR are updated to the latest supported version when you update your managed cluster to the Cluster release 17.0.0.

Caution

Support for MKE 3.6.x is dropped. Therefore, new deployments on MKE 3.6.x are not supported.

Detailed view of a Ceph cluster summary in web UI

Implemented the Ceph Cluster details page in the Container Cloud web UI. The page contains the Machines and OSDs tabs with detailed descriptions and statuses of the Ceph machines and Ceph OSDs that comprise a Ceph cluster deployment.

Addressing storage devices using by-id identifiers

Implemented the capability to address Ceph storage devices using the by-id identifiers.

The by-id identifier is the only persistent device identifier for a Ceph cluster that remains stable after the cluster upgrade or any other maintenance. Therefore, Mirantis recommends using device by-id symlinks rather than device names or by-path symlinks.
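To see which by-id symlinks exist for a given device on a node, something like the following minimal Python sketch can help. It assumes a Linux host where udev populates /dev/disk/by-id. The function name by_id_links and its by_id_dir parameter are hypothetical and are not part of Container Cloud or Ceph.

```python
import os
from pathlib import Path


def by_id_links(device, by_id_dir="/dev/disk/by-id"):
    """Return the sorted by-id symlinks that resolve to `device`.

    `device` is a device path such as /dev/sdb. The function name and the
    `by_id_dir` parameter are illustrative assumptions, not a product API.
    """
    target = os.path.realpath(device)
    root = Path(by_id_dir)
    if not root.is_dir():
        # No by-id directory on this host, so nothing to report.
        return []
    return sorted(
        str(link)
        for link in root.iterdir()
        if link.is_symlink() and os.path.realpath(link) == target
    )
```

For example, calling by_id_links("/dev/sdb") on a storage node would return the persistent paths that can be used in place of the device name, provided that such symlinks exist for that device.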

Verbose Ceph cluster status

Added the kaasCephState field to the KaaSCephCluster.status specification to display the current state of KaaSCephCluster and any errors that occur during object reconciliation, including specification generation, object creation on a managed cluster, and status retrieval.

Fluentd log forwarding to Splunk

TechPreview

Added initial Technology Preview support for forwarding Container Cloud service logs, which are sent to OpenSearch by default, to Splunk using the syslog external output configuration.

Ceph monitoring improvements

Implemented the following monitoring improvements for Ceph:

  • Optimized the following Ceph dashboards in Grafana: Ceph Cluster, Ceph Pools, Ceph OSDs.

  • Removed the redundant Ceph Nodes Grafana dashboard. You can view its content using the following dashboards:

    • Ceph stats through the Ceph Cluster dashboard.

    • Resource utilization through the System dashboard, which now includes filtering by Ceph node labels, such as ceph_role_osd, ceph_role_mon, and ceph_role_mgr.

  • Removed the rook_cluster alert label.

  • Removed the redundant CephOSDDown alert.

  • Renamed the CephNodeDown alert to CephOSDNodeDown.

Optimization of StackLight ‘NodeDown’ alerts

Optimized StackLight NodeDown alerts for better notification handling after cluster recovery from an accident:

  • Reworked the NodeDown-related alert inhibition rules.

  • Reworked the logic of all NodeDown-related alerts for all supported groups of nodes, which includes renaming of the <alertName>TargetsOutage alerts to <alertName>TargetDown.

  • Added the TungstenFabricOperatorTargetDown alert for Tungsten Fabric deployments of MOSK clusters.

  • Removed the redundant KubeDNSTargetsOutage and KubePodsNotReady alerts.
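If you maintain custom notification routes or silences that reference alert names, the renaming and removal described above can be sketched as a small mapping helper. The following Python snippet is an illustration only: the function name migrate_alert_name is hypothetical, and the sketch assumes that every alert with the TargetsOutage suffix follows the renaming pattern, which you should verify against the alert list of your release.

```python
import re

# Alerts that the list above reports as removed.
REMOVED_ALERTS = {"KubeDNSTargetsOutage", "KubePodsNotReady"}

# Matches <alertName>TargetsOutage and captures the <alertName> part.
_TARGETS_OUTAGE = re.compile(r"^(?P<name>\w+)TargetsOutage$")


def migrate_alert_name(alert):
    """Return the new name of an alert, or None if the alert was removed.

    Hypothetical helper: names that match neither rule are returned as is.
    """
    if alert in REMOVED_ALERTS:
        return None
    match = _TARGETS_OUTAGE.match(alert)
    if match:
        return f"{match.group('name')}TargetDown"
    return alert
```

For instance, a placeholder name such as ExampleTargetsOutage maps to ExampleTargetDown, while KubePodsNotReady maps to None, which signals that any rule referencing it can be dropped.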

OpenSearch performance optimization

Optimized the OpenSearch configuration and the StackLight data model to provide better resource utilization and faster query response. Added the following enhancements:

  • Limited the default namespaces for log collection with the ability to add custom namespaces to the monitoring list using the following parameters:

    • logging.namespaceFiltering.logs - limits the number of namespaces for Pod log collection. Enabled by default.

    • logging.namespaceFiltering.events - limits the number of namespaces for Kubernetes events collection. Disabled by default.

    • logging.namespaceFiltering.logs.extraNamespaces and logging.namespaceFiltering.events.extraNamespaces - add extra namespaces, which are not in the default list, to collect specific Kubernetes Pod logs or Kubernetes events, respectively. Empty by default.

  • Added the logging.enforceOopsCompression parameter that enforces 32 GB of heap size, unless the defined memory limit allows using 50 GB of heap. Enabled by default.

  • Added the NO_SEVERITY severity label that is automatically added to a log with no severity label in the message. This provides more control over which logs Fluentd actually processes and which it skips by mistake.

  • Added documentation on how to tune OpenSearch performance using hardware and software settings for baremetal-based Container Cloud clusters.
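The dotted parameter names above describe paths in a nested configuration structure. The following Python sketch shows how such paths could expand into nested values. It is not the authoritative StackLight schema: the set_dotted helper, the enabled sub-key used to toggle each filter, and the my-app namespace are assumptions made for illustration.

```python
def set_dotted(values, dotted_key, value):
    """Set `value` in a nested dict at the path given by `dotted_key`."""
    node = values
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        # Create intermediate dictionaries on demand.
        node = node.setdefault(part, {})
    node[leaf] = value
    return values


values = {}
# Assumption: each filter is toggled through an `enabled` sub-key.
set_dotted(values, "logging.namespaceFiltering.logs.enabled", True)
set_dotted(values, "logging.namespaceFiltering.events.enabled", True)
# Hypothetical namespace name, used only for illustration.
set_dotted(values, "logging.namespaceFiltering.logs.extraNamespaces", ["my-app"])
set_dotted(values, "logging.namespaceFiltering.events.extraNamespaces", ["my-app"])
set_dotted(values, "logging.enforceOopsCompression", True)
```

The resulting values dictionary groups both filters under logging.namespaceFiltering, which makes it easier to see that log and event filtering are configured independently of each other.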

Documentation enhancements

On top of continuous improvements delivered to the existing Container Cloud guides, added the documentation on how to export data from the Table panels of Grafana dashboards to CSV.