Update known issues
This section lists the update known issues with workarounds for the MOSK release 24.2.
[42449] Rolling reboot failure on a Tungsten Fabric cluster
During cluster update, the rolling reboot fails on the Tungsten Fabric cluster. To work around the issue, restart the RabbitMQ pods in the Tungsten Fabric cluster.
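If you need a starting point, the following is a sketch of restarting the RabbitMQ pods, assuming they run in the tf namespace and follow the tf-rabbitmq-<n> naming; verify the actual pod names in your environment first:
# Confirm the RabbitMQ pod names in the Tungsten Fabric namespace
kubectl -n tf get pods | grep rabbitmq
# Restart the pods one by one; the controller recreates each deleted pod
kubectl -n tf delete pod tf-rabbitmq-0
kubectl -n tf delete pod tf-rabbitmq-1
kubectl -n tf delete pod tf-rabbitmq-2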
[42463] KubePodsCrashLooping is firing during cluster update
During a major or patch update of a MOSK cluster with StackLight enabled in
non-HA mode, the KubePodsCrashLooping alert may fire for the Grafana
ReplicaSet.
Grafana relies on PostgreSQL for persistent data. In a non-HA StackLight setup,
PostgreSQL becomes temporarily unavailable during updates. If Grafana loses its
database connection, or fails to establish one during startup, it exits
with an error. This may cause the Grafana pod to enter the CrashLoopBackOff
state. Such behavior is expected in non-HA StackLight setups, and the Grafana
pod resumes normal operation after PostgreSQL is restored.
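To verify the recovery, you can watch the Grafana pod in the stacklight namespace until it reports the Running status again, for example:
# Watch the Grafana pod status until it leaves CrashLoopBackOff
kubectl -n stacklight get pods -w | grep grafana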
To prevent the issue, deploy StackLight in HA mode.
[46671] Cluster update fails with the tf-config pods crashed
Fixed in MOSK 25.1 and MOSK 25.1.1
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten
Fabric namespace may enter the CrashLoopBackOff state. For example:
tf-config-cs8zr 2/5 CrashLoopBackOff 676 (19s ago) 15h
tf-config-db-6zxgg 1/1 Running 44 (25m ago) 15h
tf-config-db-7k5sz 1/1 Running 43 (23m ago) 15h
tf-config-db-dlwdv 1/1 Running 43 (25m ago) 15h
tf-config-nw4tr 3/5 CrashLoopBackOff 665 (43s ago) 15h
tf-config-wzf6c 1/5 CrashLoopBackOff 680 (10s ago) 15h
tf-control-c6bnn 3/4 Running 41 (23m ago) 13h
tf-control-gsnnp 3/4 Running 42 (23m ago) 13h
tf-control-sj6fd 3/4 Running 41 (23m ago) 13h
To troubleshoot the issue, check the logs inside the tf-config API
container and the tf-cassandra pods. The following example logs
indicate that Cassandra services failed to peer with each other and
are operating independently:
Logs from the tf-config API container:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:
INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
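The following is a sketch of how such logs can be retrieved; the pod names are taken from the listing above, and the api container name inside the tf-config pod is an assumption, so confirm it with kubectl describe first:
# Logs of the API container in an affected tf-config pod (container name assumed to be "api")
kubectl -n tf logs tf-config-cs8zr -c api
# Logs of a Cassandra configuration database pod
kubectl -n tf logs tf-cassandra-config-dc1-rack1-0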
To work around the issue, restart the Cassandra services in the Tungsten Fabric namespace by deleting the affected pods sequentially so that they re-establish the connection between them:
kubectl -n tf delete pod tf-cassandra-config-dc1-rack1-0
kubectl -n tf delete pod tf-cassandra-config-dc1-rack1-1
kubectl -n tf delete pod tf-cassandra-config-dc1-rack1-2
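Because the pods must be restarted sequentially, you can wait for each recreated pod to become Ready before deleting the next one; the timeout below is illustrative:
# Wait until the recreated Cassandra pod passes its readiness checks
kubectl -n tf wait --for=condition=Ready pod/tf-cassandra-config-dc1-rack1-0 --timeout=10m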
After the restart, all other services in the Tungsten Fabric namespace should
return to the Active state.
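A quick way to confirm the overall state is to list the pods in the Tungsten Fabric namespace and check that they report the Running status with all containers ready:
kubectl -n tf get pods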