Update notes

This section describes the specific actions you as a cloud operator need to complete before or after your Container Cloud cluster update to the Cluster releases 17.3.0 or 16.3.0.

Consider this information as a supplement to the generic update procedures published in Operations Guide: Automatic upgrade of a management cluster and Update a managed cluster.

Pre-update actions

Change label values in Ceph metrics used in customizations

Note

If you do not use Ceph metrics in any customizations, for example, custom alerts, Grafana dashboards, or queries in custom workloads, skip this section.

In Container Cloud 2.27.0, the performance metric exporter integrated into the Ceph Manager daemon was deprecated in favor of the dedicated Ceph Exporter daemon. As a result, you may need to update the values of several labels in Ceph metrics if you use them in any customizations such as custom alerts, Grafana dashboards, or queries in custom tools. These label values change in Container Cloud 2.28.0 (Cluster releases 16.3.0 and 17.3.0).

Note

Names of metrics are not changed, and no metrics are removed. Only the label values described below change.

All Ceph metrics collected by the Ceph Exporter daemon have new values of the job and instance labels because they are now scraped from the Ceph Exporter daemon instead of the performance metric exporter of Ceph Manager:

  • Values of the job labels are changed from rook-ceph-mgr to prometheus-rook-exporter for all Ceph metrics moved to Ceph Exporter. The full list of moved metrics is presented below.

  • Values of the instance labels are changed from the metric endpoint of Ceph Manager with port 9283 to the metric endpoint of Ceph Exporter with port 9926 for all Ceph metrics moved to Ceph Exporter. The full list of moved metrics is presented below.

  • Values of the instance_id labels of Ceph metrics from the RADOS Gateway (RGW) daemons are changed from the daemon GID to the daemon subname. For example, instance_id="a" is now used instead of instance_id="<RGW_PROCESS_GID>", as in ceph_rgw_qlen{instance_id="a"}. The list of affected Ceph RGW metrics is presented below.
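
As a minimal sketch of what such a change can look like, the following hypothetical Prometheus alerting rule uses the new label values. The group name, alert name, threshold, and duration are illustrative assumptions and are not shipped with Container Cloud. Only the metric name and the job and instance_id values come from the description above.

```yaml
# Hypothetical custom alerting rule. The group name, alert name, threshold,
# and duration are assumptions made for illustration.
groups:
  - name: custom-ceph-rgw
    rules:
      - alert: CustomCephRgwQueueLength
        # Selector before the update:
        #   ceph_rgw_qlen{job="rook-ceph-mgr", instance_id="<RGW_PROCESS_GID>"}
        # Selector after the update, with the new job value and the RGW
        # daemon subname as instance_id:
        expr: 'ceph_rgw_qlen{job="prometheus-rook-exporter", instance_id="a"} > 100'
        for: 5m
        labels:
          severity: warning
```

If a customization also filters by the instance label, the port in the label value changes from 9283 to 9926 in the same way.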

List of affected Ceph RGW metrics
  • ceph_rgw_cache_.*

  • ceph_rgw_failed_req

  • ceph_rgw_gc_retire_object

  • ceph_rgw_get.*

  • ceph_rgw_keystone_.*

  • ceph_rgw_lc_.*

  • ceph_rgw_lua_.*

  • ceph_rgw_pubsub_.*

  • ceph_rgw_put.*

  • ceph_rgw_qactive

  • ceph_rgw_qlen

  • ceph_rgw_req

List of all metrics to be collected by Ceph Exporter instead of Ceph Manager
  • ceph_bluefs_.*

  • ceph_bluestore_.*

  • ceph_mds_cache_.*

  • ceph_mds_caps

  • ceph_mds_ceph_.*

  • ceph_mds_dir_.*

  • ceph_mds_exported_inodes

  • ceph_mds_forward

  • ceph_mds_handle_.*

  • ceph_mds_imported_inodes

  • ceph_mds_inodes.*

  • ceph_mds_load_cent

  • ceph_mds_log_.*

  • ceph_mds_mem_.*

  • ceph_mds_openino_dir_fetch

  • ceph_mds_process_request_cap_release

  • ceph_mds_reply_.*

  • ceph_mds_request

  • ceph_mds_root_.*

  • ceph_mds_server_.*

  • ceph_mds_sessions_.*

  • ceph_mds_slow_reply

  • ceph_mds_subtrees

  • ceph_mon_election_.*

  • ceph_mon_num_.*

  • ceph_mon_session_.*

  • ceph_objecter_.*

  • ceph_osd_numpg.*

  • ceph_osd_op.*

  • ceph_osd_recovery_.*

  • ceph_osd_stat_.*

  • ceph_paxos.*

  • ceph_prioritycache.*

  • ceph_purge.*

  • ceph_rgw_cache_.*

  • ceph_rgw_failed_req

  • ceph_rgw_gc_retire_object

  • ceph_rgw_get.*

  • ceph_rgw_keystone_.*

  • ceph_rgw_lc_.*

  • ceph_rgw_lua_.*

  • ceph_rgw_pubsub_.*

  • ceph_rgw_put.*

  • ceph_rgw_qactive

  • ceph_rgw_qlen

  • ceph_rgw_req

  • ceph_rocksdb_.*
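
If a custom rule has to keep working both before and after the update, one possible approach is a regular-expression matcher that accepts both job values. The sketch below is an illustration under stated assumptions: the group and rule names are hypothetical, ceph_osd_op_r is used as one example of a metric matching ceph_osd_op.*, and the ceph_daemon label is assumed to be present on that metric in your environment.

```yaml
# Hypothetical recording rule. The group and record names are assumptions
# made for illustration.
groups:
  - name: custom-ceph-transition
    rules:
      - record: custom:ceph_osd_op_r:rate5m
        # The job matcher accepts series scraped from either exporter.
        # max by (ceph_daemon) keeps one value per daemon even if both
        # jobs expose the metric at the same time.
        expr: 'max by (ceph_daemon) (rate(ceph_osd_op_r{job=~"rook-ceph-mgr|prometheus-rook-exporter"}[5m]))'
```

After the update completes and the old label value is no longer needed, the matcher can be narrowed to job="prometheus-rook-exporter".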

Post-update actions

Manually disable collection of performance metrics by Ceph Manager (optional)

Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), Ceph cluster metrics are collected by the dedicated Ceph Exporter daemon. However, the same metrics are still available for collection from the Ceph Manager daemon. To improve the performance of the Ceph Manager daemon, you can manually disable its collection of the performance metrics that the Ceph Exporter daemon already collects.

To disable performance metrics for the Ceph Manager daemon, add the following parameter to the KaaSCephCluster spec in the rookConfig section:

spec:
  cephClusterSpec:
    rookConfig:
      "mgr|mgr/prometheus/exclude_perf_counters": "true"

Once you add this option, Ceph performance metrics are collected by the Ceph Exporter daemon only. For more details, see the official Ceph documentation.

Upgrade to Ubuntu 22.04 on baremetal-based clusters

The Cluster release series 17.3.x and 16.3.x are the last ones where Ubuntu 20.04 is supported on existing clusters. The Container Cloud release update to 2.29.0 will be blocked if any machine in any cluster runs on Ubuntu 20.04.

Therefore, if your existing cluster runs nodes on Ubuntu 20.04, upgrade all cluster nodes to Ubuntu 22.04 during the Container Cloud 2.28.x series to avoid blocking the cluster update. For the procedure, see Operations Guide: Upgrade an operating system distribution.

It is not mandatory to upgrade all machines at once. You can upgrade them one by one or in small batches, for example, if the maintenance window is limited in time.

Warning

Usage of third-party software that is not part of Mirantis-supported configurations, for example, custom DPDK modules, may block the upgrade of an operating system distribution. Users are fully responsible for ensuring the compatibility of such custom components with the latest supported Ubuntu version.