Update known issues¶
This section lists the update known issues with workarounds for the MOSK release 24.2.
[42449] Rolling reboot failure on a Tungsten Fabric cluster¶
During cluster update, the rolling reboot may fail on a Tungsten Fabric cluster. To work around the issue, restart the RabbitMQ pods in the Tungsten Fabric cluster.
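For example, a minimal restart sequence, assuming the default tf namespace and that the RabbitMQ pod names contain rabbitmq (verify the actual pod names in your cluster first):

kubectl -n tf get pods | grep -i rabbitmq
kubectl -n tf delete pod <tf-rabbitmq-pod-name>

The deleted pods are recreated automatically by their controller.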
[46671] Cluster update fails with the tf-config pods crashed¶
When updating to the MOSK 24.3 series, tf-config pods from the Tungsten Fabric namespace may enter the CrashLoopBackOff state. For example:
tf-config-cs8zr 2/5 CrashLoopBackOff 676 (19s ago) 15h
tf-config-db-6zxgg 1/1 Running 44 (25m ago) 15h
tf-config-db-7k5sz 1/1 Running 43 (23m ago) 15h
tf-config-db-dlwdv 1/1 Running 43 (25m ago) 15h
tf-config-nw4tr 3/5 CrashLoopBackOff 665 (43s ago) 15h
tf-config-wzf6c 1/5 CrashLoopBackOff 680 (10s ago) 15h
tf-control-c6bnn 3/4 Running 41 (23m ago) 13h
tf-control-gsnnp 3/4 Running 42 (23m ago) 13h
tf-control-sj6fd 3/4 Running 41 (23m ago) 13h
To troubleshoot the issue, check the logs inside the tf-config API container and the tf-cassandra pods. The following example logs indicate that the Cassandra services failed to peer with each other and are operating independently:
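To collect such logs, you can run commands similar to the following, using the pod names from the example output above; the api container name is an assumption and may differ in your deployment:

kubectl -n tf logs tf-config-cs8zr -c api
kubectl -n tf logs tf-cassandra-config-dc1-rack1-0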
Logs from the tf-config API container:

NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.168.200.23:9042 dc1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1, \'consistency\': \'QUORUM\'}',)})
Logs from the tf-cassandra pods:

INFO [OptionalTasks:1] 2024-09-09 08:59:36,231 CassandraRoleManager.java:419 - Setup task failed with error, rescheduling
WARN [OptionalTasks:1] 2024-09-09 08:59:46,231 CassandraRoleManager.java:379 - CassandraRoleManager skipped default role setup: some nodes were not ready
To work around the issue, restart the Cassandra services in the Tungsten Fabric namespace by deleting the affected pods one by one so that they can re-establish connections with each other:
kubectl -n tf delete pod tf-cassandra-config-dc1-rack1-0
kubectl -n tf delete pod tf-cassandra-config-dc1-rack1-1
kubectl -n tf delete pod tf-cassandra-config-dc1-rack1-2
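Optionally, wait until each recreated pod becomes Ready before deleting the next one, for example:

kubectl -n tf wait --for=condition=Ready pod/tf-cassandra-config-dc1-rack1-0 --timeout=300s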
After the restart, all other services in the Tungsten Fabric namespace should be in the Active state.
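As a quick check, verify that the pods in the Tungsten Fabric namespace are back to the Running state:

kubectl -n tf get pods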