Enhancements¶
This section outlines new features implemented in the Cluster release 14.1.0 that is introduced in the Container Cloud release 2.25.0.
Support for MCR 23.0.7¶
Introduced support for Mirantis Container Runtime (MCR) 23.0.7 for the Container Cloud management and managed clusters. On existing clusters, MCR is updated to the latest supported version when you update your managed cluster to the Cluster release 14.1.0.
Addressing storage devices using by-id identifiers¶
Implemented the capability to address Ceph storage devices using the by-id identifiers.
The by-id identifier is the only persistent device identifier for a Ceph cluster that remains stable after the cluster upgrade or any other maintenance. Therefore, Mirantis recommends using device by-id symlinks rather than device names or by-path symlinks.
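To find the by-id symlinks for a disk before referencing it in a Ceph configuration, a small helper along the following lines can be used. This is a minimal sketch, not part of Container Cloud: the function name by_id_links is hypothetical, /dev/sdb is a placeholder device, and the sketch assumes a Linux node where udev populates /dev/disk/by-id.

```python
import os
from pathlib import Path


def by_id_links(device: str, by_id_dir: str = "/dev/disk/by-id") -> list[str]:
    """Return the by-id symlinks that resolve to the given device node.

    Hypothetical helper for illustration only.
    """
    target = os.path.realpath(device)
    links = []
    for entry in sorted(Path(by_id_dir).iterdir()):
        # Each entry in the by-id directory is a symlink to a device node.
        if entry.is_symlink() and os.path.realpath(entry) == target:
            links.append(str(entry))
    return links


if __name__ == "__main__":
    # /dev/sdb is a placeholder; substitute the device to inspect.
    for link in by_id_links("/dev/sdb"):
        print(link)
```

A single disk usually has more than one by-id symlink, so the helper returns all of them and leaves the choice to the operator.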
Verbose Ceph cluster status¶
Added the kaasCephState field in the KaaSCephCluster.status specification to display the current state of KaaSCephCluster and any errors during object reconciliation, including specification generation, object creation on a managed cluster, and status retrieval.
Fluentd log forwarding to Splunk¶
TechPreview
Added initial Technology Preview support for forwarding Container Cloud service logs, which are sent to OpenSearch by default, to Splunk using the syslog external output configuration.
Ceph monitoring improvements¶
Implemented the following monitoring improvements for Ceph:
Optimized the following Ceph dashboards in Grafana: Ceph Cluster, Ceph Pools, Ceph OSDs.
Removed the redundant Ceph Nodes Grafana dashboard. You can view its content using the following dashboards:
Ceph stats through the Ceph Cluster dashboard.
Resource utilization through the System dashboard, which now includes filtering by Ceph node labels, such as ceph_role_osd, ceph_role_mon, and ceph_role_mgr.
Removed the rook_cluster alert label.
Removed the redundant CephOSDDown alert.
Renamed the CephNodeDown alert to CephOSDNodeDown.
Optimization of StackLight ‘NodeDown’ alerts¶
Optimized StackLight NodeDown alerts for better notification handling after cluster recovery from an accident:
Reworked the NodeDown-related alert inhibition rules.
Reworked the logic of all NodeDown-related alerts for all supported groups of nodes, which includes renaming of the <alertName>TargetsOutage alerts to <alertName>TargetDown.
Added the TungstenFabricOperatorTargetDown alert for Tungsten Fabric deployments of MOSK clusters.
Removed the redundant KubeDNSTargetsOutage and KubePodsNotReady alerts.
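Custom notification routes or silences that reference the old alert names may need updating after these changes. The following minimal sketch shows one way to map old names to new ones. The function name migrate_alert_name and the sample alert name in the usage note are hypothetical, and the sketch assumes that the suffix rename applies uniformly to every TargetsOutage alert that was not removed.

```python
import re
from typing import Optional

# Alerts listed above as removed; they have no replacement name.
REMOVED_ALERTS = {"KubeDNSTargetsOutage", "KubePodsNotReady"}

# Matches the old suffix only at the end of the alert name.
_OLD_SUFFIX = re.compile(r"TargetsOutage$")


def migrate_alert_name(name: str) -> Optional[str]:
    """Map an old alert name to its new name, or None if it was removed.

    Hypothetical helper for illustration only.
    """
    if name in REMOVED_ALERTS:
        return None
    # Names without the old suffix are returned unchanged.
    return _OLD_SUFFIX.sub("TargetDown", name)
```

For example, a placeholder name such as ExampleServiceTargetsOutage maps to ExampleServiceTargetDown, while KubeDNSTargetsOutage maps to None and should be dropped from any custom configuration.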
OpenSearch performance optimization¶
Optimized the OpenSearch configuration and the StackLight data model to provide better resource utilization and faster query response. Added the following enhancements:
Limited the default namespaces for log collection with the ability to add custom namespaces to the monitoring list using the following parameters:
logging.namespaceFiltering.logs - limits the number of namespaces for Pods log collection. Enabled by default.
logging.namespaceFiltering.events - limits the number of namespaces for Kubernetes events collection. Disabled by default.
logging.namespaceFiltering.events/logs.extraNamespaces - adds extra namespaces, which are not in the default list, to collect specific Kubernetes Pod logs or Kubernetes events. Empty by default.
Added the logging.enforceOopsCompression parameter that enforces 32 GB of heap size, unless the defined memory limit allows using 50 GB of heap. Enabled by default.
Added the NO_SEVERITY severity label that is automatically added to a log with no severity label in the message. This allows having more control over which logs are actually being processed by Fluentd and which are skipped by mistake.
Added documentation on how to tune OpenSearch performance using hardware and software settings for baremetal-based Container Cloud clusters.
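The combined effect of namespace filtering and the NO_SEVERITY label on a single log record can be illustrated as follows. This is a conceptual sketch only and not the actual Fluentd or StackLight implementation: the record field names (namespace, severity_label), the function name process_record, and the contents of DEFAULT_NAMESPACES are all assumptions made for the example.

```python
from typing import Iterable, Optional

# Hypothetical default list; the real default namespaces are not stated here.
DEFAULT_NAMESPACES = frozenset({"kube-system", "stacklight"})


def process_record(
    record: dict, extra_namespaces: Iterable[str] = ()
) -> Optional[dict]:
    """Drop records from unmonitored namespaces and label missing severity.

    Hypothetical illustration of the behavior described above.
    """
    allowed = DEFAULT_NAMESPACES | set(extra_namespaces)
    if record.get("namespace") not in allowed:
        # Namespace is neither in the default list nor in the extra list.
        return None
    labelled = dict(record)  # Copy so the caller's record is not mutated.
    if not labelled.get("severity_label"):
        labelled["severity_label"] = "NO_SEVERITY"
    return labelled
```

In this sketch, a record from a namespace outside the default list is collected only when that namespace is passed through extra_namespaces, which mirrors the role of the extraNamespaces parameter.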
Documentation enhancements¶
On top of continuous improvements delivered to the existing Container Cloud guides, added the documentation on how to export data from the Table panels of Grafana dashboards to CSV.