Ceph known issues¶
This section lists the Ceph known issues with workarounds for the Mirantis OpenStack for Kubernetes release 23.1.
[30857] Irrelevant error during Ceph OSD deployment on removable devices
[31630] Ceph cluster upgrade to Pacific is stuck with Rook connection failure
[31555] Ceph can find only 1 out of 2 ‘mgr’ after update to MOSK 23.1
[30857] Irrelevant error during Ceph OSD deployment on removable devices¶
The deployment of Ceph OSDs fails with the following messages in the status section of the KaaSCephCluster custom resource:
shortClusterInfo:
  messages:
  - Not all osds are deployed
  - Not all osds are in
  - Not all osds are up
To find out whether your cluster is affected, verify whether the devices used for the Ceph OSD deployment on the AMD hosts are removable.
For example, if the sdb device name is specified in spec.cephClusterSpec.nodes.storageDevices of the KaaSCephCluster custom resource for the affected host, run:
# cat /sys/block/sdb/removable
1
The output 1 shows that the above messages in status are caused by the hotplug functionality enabled on the AMD nodes, which marks all drives as removable. The hotplug functionality is not supported by Ceph in MOSK.
As a workaround, disable the hotplug functionality in the BIOS settings for disks that are configured to be used as Ceph OSD data devices.
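After changing the BIOS settings and rebooting the node, you can repeat the check above for each disk configured as a Ceph OSD data device to confirm that it is no longer reported as removable. A minimal sketch, assuming hypothetical device names sdb and sdc; replace them with the device names from spec.cephClusterSpec.nodes.storageDevices for the host:

for dev in sdb sdc; do   # hypothetical device names, adjust for your host
  echo "${dev}: removable=$(cat /sys/block/${dev}/removable)"   # 0 is expected after the BIOS change
done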
[31630] Ceph cluster upgrade to Pacific is stuck with Rook connection failure¶
During update to MOSK 23.1, the Ceph cluster gets stuck while upgrading to Ceph Pacific due to a Rook connection failure.
To verify whether your cluster is affected:
The cluster is affected if the following conditions are true:
The ceph-status-controller pod on the MOSK cluster contains the following log lines:

kubectl -n ceph-lcm-mirantis logs <ceph-status-controller-podname>
...
E0405 08:07:15.603247 1 cluster.go:222] Cluster health: "HEALTH_ERR"
W0405 08:07:15.603266 1 cluster.go:230] found issue error: {Urgent failed to get status. . timed out: exit status 1}
The KaaSCephCluster custom resource contains the following configuration option in the rookConfig section:

spec:
  cephClusterSpec:
    rookConfig:
      ms_crc_data: "false" # or 'ms crc data: "false"'
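To quickly check the second condition, you can query the KaaSCephCluster resource. A minimal sketch, assuming <project> and <cluster-name> are placeholders for your project namespace and KaaSCephCluster name, and that the lowercase kaascephcluster resource name resolves to the KaaSCephCluster custom resource:

kubectl -n <project> get kaascephcluster <cluster-name> -o yaml | grep -E 'ms_crc_data|ms crc data'
# any output means the option is set and the condition is met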
As a workaround, remove the ms_crc_data (or ms crc data) configuration key from the KaaSCephCluster custom resource and wait for the rook-ceph-mon pods to restart on the MOSK cluster:
kubectl -n rook-ceph get pod -l app=rook-ceph-mon -w
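The key can be removed, for example, by editing the resource on the management cluster, using the same <project> and <cluster-name> placeholders for your project namespace and KaaSCephCluster name:

kubectl -n <project> edit kaascephcluster <cluster-name>
# in the editor, delete the ms_crc_data (or 'ms crc data') line from
# spec.cephClusterSpec.rookConfig, keep any other rookConfig keys, then save and exit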
[31555] Ceph can find only 1 out of 2 ‘mgr’ after update to MOSK 23.1¶
Fixed in 23.2
After update to MOSK 23.1, the status section of the KaaSCephCluster custom resource can contain the following message:
shortClusterInfo:
  messages:
  - Not all mgrs are running: 1/2
To verify whether the cluster is affected:

If the KaaSCephCluster spec contains the external section, the cluster is affected:
spec:
  cephClusterSpec:
    external:
      enable: false
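To check for the external section without opening the resource in an editor, you can query it directly. A minimal sketch, assuming <project> and <cluster-name> are placeholders for your project namespace and KaaSCephCluster name, and that the lowercase kaascephcluster resource name resolves to the KaaSCephCluster custom resource:

kubectl -n <project> get kaascephcluster <cluster-name> \
  -o jsonpath='{.spec.cephClusterSpec.external}'
# non-empty output means the external section is present and the cluster is affected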
Workaround:

1. In spec.cephClusterSpec of the KaaSCephCluster custom resource, remove the external section.

2. Wait for the Not all mgrs are running: 1/2 message to disappear from the KaaSCephCluster status.

3. Verify that the nova Ceph client that is integrated with MOSK has the same keyring as in the Ceph cluster. A combined check for the nova, cinder, and glance clients is also sketched after this procedure.

   Keyring verification for the Ceph nova client:

   1. Compare the keyring used in the nova-compute and libvirt pods with the one from the Ceph cluster:

      kubectl -n openstack get pod | grep nova-compute
      kubectl -n openstack exec -it <nova-compute-pod-name> -- cat /etc/ceph/ceph.client.nova.keyring
      kubectl -n openstack get pod | grep libvirt
      kubectl -n openstack exec -it <libvirt-pod-name> -- cat /etc/ceph/ceph.client.nova.keyring
      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get client.nova
   2. If the keyring differs, replace the one stored in the Ceph cluster with the key obtained from the OpenStack pods:

      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
      ceph auth get client.nova -o /tmp/nova.key
      vi /tmp/nova.key
      # in the editor, change the "key" value to the key obtained from the OpenStack pods,
      # then save and exit editing
      ceph auth import -i /tmp/nova.key
   3. Verify that the client.nova keyring of the Ceph cluster matches the one obtained from the OpenStack pods:

      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get client.nova
   4. Verify that the nova-compute and libvirt pods have access to the Ceph cluster:

      kubectl -n openstack get pod | grep nova-compute
      kubectl -n openstack exec -it <nova-compute-pod-name> -- ceph -s -n client.nova
      kubectl -n openstack get pod | grep libvirt
      kubectl -n openstack exec -it <libvirt-pod-name> -- ceph -s -n client.nova
4. Verify that the cinder Ceph client that is integrated with MOSK has the same keyring as in the Ceph cluster.

   Keyring verification for the Ceph cinder client:

   1. Compare the keyring used in the cinder-volume pods with the one from the Ceph cluster:

      kubectl -n openstack get pod | grep cinder-volume
      kubectl -n openstack exec -it <cinder-volume-pod-name> -- cat /etc/ceph/ceph.client.cinder.keyring
      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get client.cinder
   2. If the keyring differs, replace the one stored in the Ceph cluster with the key obtained from the OpenStack pods:

      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
      ceph auth get client.cinder -o /tmp/cinder.key
      vi /tmp/cinder.key
      # in the editor, change the "key" value to the key obtained from the OpenStack pods,
      # then save and exit editing
      ceph auth import -i /tmp/cinder.key
   3. Verify that the client.cinder keyring of the Ceph cluster matches the one obtained from the OpenStack pods:

      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get client.cinder
   4. Verify that the cinder-volume pods have access to the Ceph cluster:

      kubectl -n openstack get pod | grep cinder-volume
      kubectl -n openstack exec -it <cinder-volume-pod-name> -- ceph -s -n client.cinder
5. Verify that the glance Ceph client that is integrated with MOSK has the same keyring as in the Ceph cluster.

   Keyring verification for the Ceph glance client:

   1. Compare the keyring used in the glance-api pods with the one from the Ceph cluster:

      kubectl -n openstack get pod | grep glance-api
      kubectl -n openstack exec -it <glance-api-pod-name> -- cat /etc/ceph/ceph.client.glance.keyring
      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get client.glance
   2. If the keyring differs, replace the one stored in the Ceph cluster with the key obtained from the OpenStack pods:

      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
      ceph auth get client.glance -o /tmp/glance.key
      vi /tmp/glance.key
      # in the editor, change the "key" value to the key obtained from the OpenStack pods,
      # then save and exit editing
      ceph auth import -i /tmp/glance.key
   3. Verify that the client.glance keyring of the Ceph cluster matches the one obtained from the OpenStack pods:

      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph auth get client.glance
   4. Verify that the glance-api pods have access to the Ceph cluster:

      kubectl -n openstack get pod | grep glance-api
      kubectl -n openstack exec -it <glance-api-pod-name> -- ceph -s -n client.glance
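The per-client keyring comparison above can also be scripted. The following is a minimal sketch, assuming bash 4 or later on the host where kubectl runs, the pod name prefixes used in the steps above (nova-compute, cinder-volume, glance-api), and the keyring paths shown above; the script only reports mismatches and does not modify any keyring:

#!/usr/bin/env bash
# Hypothetical helper: compare the keyrings used by the OpenStack pods with
# the keys stored in the Ceph cluster for the nova, cinder, and glance clients.
set -e

declare -A pods=( [nova]=nova-compute [cinder]=cinder-volume [glance]=glance-api )

for client in nova cinder glance; do
  # first pod whose name matches the prefix used in the procedure above
  pod=$(kubectl -n openstack get pod -o name | grep "${pods[$client]}" | head -n 1)
  # key as mounted into the OpenStack pod
  os_key=$(kubectl -n openstack exec "${pod}" -- \
    cat "/etc/ceph/ceph.client.${client}.keyring" | awk '/key =/ {print $3}')
  # key as stored in the Ceph cluster
  ceph_key=$(kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
    ceph auth get-key "client.${client}")
  if [ "${os_key}" = "${ceph_key}" ]; then
    echo "client.${client}: keyrings match"
  else
    echo "client.${client}: keyring MISMATCH, follow the import steps above"
  fi
done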