Update notes¶
This section describes the specific actions that you, as a cloud operator, need to complete before or after your Container Cloud cluster update to the Cluster releases 17.3.0 or 16.3.0.
Consider this information as a supplement to the generic update procedures published in Operations Guide: Automatic upgrade of a management cluster and Update a managed cluster.
Pre-update actions¶
Change label values in Ceph metrics used in customizations¶
Note
If you do not use Ceph metrics in any customizations, for example, custom alerts, Grafana dashboards, or queries in custom workloads, skip this section.
In Container Cloud 2.27.0, the performance metric exporter integrated into the Ceph Manager daemon was deprecated in favor of the dedicated Ceph Exporter daemon. Therefore, if you use Ceph metrics in any customizations such as custom alerts, Grafana dashboards, or queries in custom tools, you may need to update the values of several labels in these metrics. The labels change in Container Cloud 2.28.0 (Cluster releases 16.3.0 and 17.3.0).
Note
Names of metrics are not changed, and no metrics are removed.
All Ceph metrics collected by the Ceph Exporter daemon have changed their job and instance labels because the metrics are now scraped from the new Ceph Exporter daemon instead of the performance metric exporter of Ceph Manager:

Values of the job label are changed from rook-ceph-mgr to prometheus-rook-exporter for all Ceph metrics moved to Ceph Exporter. The full list of moved metrics is presented below.

Values of the instance label are changed from the metric endpoint of Ceph Manager with port 9283 to the metric endpoint of Ceph Exporter with port 9926 for all Ceph metrics moved to Ceph Exporter. The full list of moved metrics is presented below.

Values of the instance_id label of Ceph metrics from the RADOS Gateway (RGW) daemons are changed from the daemon GID to the daemon subname. For example, instead of instance_id="<RGW_PROCESS_GID>", instance_id="a" (as in ceph_rgw_qlen{instance_id="a"}) is now used. The list of moved Ceph RGW metrics is presented below.

For an example of updating a custom query to the new label values, see the sketch after the metric lists below.
List of affected Ceph RGW metrics
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
List of all metrics to be collected by Ceph Exporter instead of Ceph Manager
ceph_bluefs_.*
ceph_bluestore_.*
ceph_mds_cache_.*
ceph_mds_caps
ceph_mds_ceph_.*
ceph_mds_dir_.*
ceph_mds_exported_inodes
ceph_mds_forward
ceph_mds_handle_.*
ceph_mds_imported_inodes
ceph_mds_inodes.*
ceph_mds_load_cent
ceph_mds_log_.*
ceph_mds_mem_.*
ceph_mds_openino_dir_fetch
ceph_mds_process_request_cap_release
ceph_mds_reply_.*
ceph_mds_request
ceph_mds_root_.*
ceph_mds_server_.*
ceph_mds_sessions_.*
ceph_mds_slow_reply
ceph_mds_subtrees
ceph_mon_election_.*
ceph_mon_num_.*
ceph_mon_session_.*
ceph_objecter_.*
ceph_osd_numpg.*
ceph_osd_op.*
ceph_osd_recovery_.*
ceph_osd_stat_.*
ceph_paxos.*
ceph_prioritycache.*
ceph_purge.*
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
ceph_rocksdb_.*
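If your customizations select the listed metrics by the old label values, update the selectors accordingly. Below is a hypothetical alert expression on ceph_rgw_qlen before and after the change; the threshold and the subname "a" are illustrative only, and <RGW_PROCESS_GID> is a placeholder for the daemon GID:

# Before the update to 2.28.0: scraped from the Ceph Manager exporter
ceph_rgw_qlen{job="rook-ceph-mgr", instance_id="<RGW_PROCESS_GID>"} > 100

# After the update: scraped from the dedicated Ceph Exporter daemon
ceph_rgw_qlen{job="prometheus-rook-exporter", instance_id="a"} > 100

Queries that do not filter or group by the job, instance, or instance_id labels require no changes.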
Post-update actions¶
Manually disable collection of performance metrics by Ceph Manager (optional)¶
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), Ceph cluster metrics are collected by the dedicated Ceph Exporter daemon. At the same time, the same metrics are still available for collection by the Ceph Manager daemon. To improve performance of the Ceph Manager daemon, you can manually disable its collection of the performance metrics, as these metrics are already collected by the Ceph Exporter daemon.
To disable performance metrics for the Ceph Manager daemon, add the following parameter to the KaaSCephCluster spec in the rookConfig section:
spec:
  cephClusterSpec:
    rookConfig:
      "mgr|mgr/prometheus/exclude_perf_counters": "true"
Once you add this option, Ceph performance metrics are collected by the Ceph Exporter daemon only. For more details, see Official Ceph documentation.
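To verify that the option has been applied, you can read it back from the Ceph configuration, for example, through the Ceph tools pod. A minimal sketch, assuming the rook-ceph namespace and the rook-ceph-tools deployment names commonly used in Rook-based clusters; adjust them to your environment:

kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
    ceph config get mgr mgr/prometheus/exclude_perf_counters

Once the parameter is propagated, the command is expected to return true.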
Upgrade to Ubuntu 22.04 on baremetal-based clusters¶
The Cluster release series 17.3.x and 16.3.x are the last ones where Ubuntu 20.04 is supported on existing clusters. The Container Cloud release update to 2.29.0 will be blocked if any machine in any cluster runs on Ubuntu 20.04.
Therefore, if nodes of your existing cluster run Ubuntu 20.04, upgrade all cluster nodes to Ubuntu 22.04 during the Container Cloud 2.28.x release series to avoid blocking your cluster update. For the procedure, see Operations Guide: Upgrade an operating system distribution.
It is not mandatory to upgrade all machines at once. You can upgrade them one by one or in small batches, for example, if the maintenance window is limited in time.
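To identify the machines that still run Ubuntu 20.04, you can, for example, list the operating system image reported by each node of the cluster. A minimal sketch using standard Kubernetes node status fields; run it with the kubeconfig of the cluster in question:

kubectl get nodes -o custom-columns=NAME:.metadata.name,OS-IMAGE:.status.nodeInfo.osImage

Nodes that report Ubuntu 20.04.x LTS in the OS-IMAGE column must be upgraded before the update to Container Cloud 2.29.0.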
Warning
Usage of third-party software that is not part of Mirantis-supported configurations, for example, custom DPDK modules, may block the upgrade of an operating system distribution. Users are fully responsible for ensuring the compatibility of such custom components with the latest supported Ubuntu version.