Ceph Monitors recovery
This section describes how to recover failed Ceph Monitors of an existing Ceph cluster in the following state:
The Ceph cluster contains failed Ceph Monitors that cannot start and hang in the Error or CrashLoopBackOff state.

The logs of the failed Ceph Monitor pods contain the following lines:

mon.g does not exist in monmap, will attempt to join an existing cluster
...
mon.g@-1(???) e11 not in monmap and have been in a quorum before; must have been removed
mon.g@-1(???) e11 commit suicide!
The Ceph cluster contains at least one Running Ceph Monitor, and the ceph -s command outputs one healthy mon and one healthy mgr instance.
Unless stated otherwise, perform the following steps for all failed Ceph Monitors at once.
To recover failed Ceph Monitors:
Obtain and export the kubeconfig of the affected cluster.
Scale the rook-ceph/rook-ceph-operator deployment down to 0 replicas:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0
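Optionally, verify that the operator has stopped before proceeding; the READY column must show 0/0:

kubectl -n rook-ceph get deploy rook-ceph-operator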
Delete all failed Ceph Monitor deployments:
Identify the Ceph Monitor pods in the Error or CrashLoopBackOff state:

kubectl -n rook-ceph get pod -l 'app in (rook-ceph-mon,rook-ceph-mon-canary)'
Verify that the affected pods contain the failure logs described above:
kubectl -n rook-ceph logs <failedMonPodName>
Substitute <failedMonPodName> with the Ceph Monitor pod name. For example, rook-ceph-mon-g-845d44b9c6-fjc5d.
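If the logs are long, you can filter for the failure signature directly; the pod name below is the example one from above:

kubectl -n rook-ceph logs rook-ceph-mon-g-845d44b9c6-fjc5d | grep -E 'not in monmap|commit suicide'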
Save the identifying letters of the failed Ceph Monitors for further usage. For example, f, e, and so on.
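The letter is embedded in the pod name (rook-ceph-mon-<letter>-...). Printing the pod labels is another way to see it, assuming your Rook version exposes a label such as ceph_daemon_id:

kubectl -n rook-ceph get pod -l app=rook-ceph-mon --show-labels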
Delete all corresponding deployments of these pods:
Identify the affected Ceph Monitor pod deployments:
kubectl -n rook-ceph get deploy -l 'app in (rook-ceph-mon,rook-ceph-mon-canary)'
Delete the affected Ceph Monitor pod deployments. For example, if the Ceph cluster has the rook-ceph-mon-c-845d44b9c6-fjc5d pod in the CrashLoopBackOff state, remove the corresponding rook-ceph-mon-c deployment:

kubectl -n rook-ceph delete deploy rook-ceph-mon-c

Canary mon deployments have the -canary suffix.
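If several Ceph Monitors failed, a small shell loop can remove their deployments in one pass; the letters e and f below are only the example letters used later in this section:

for letter in e f; do
  kubectl -n rook-ceph delete deploy rook-ceph-mon-${letter}
done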
Remove all corresponding entries of Ceph Monitors from the MON map:
Enter the ceph-tools pod:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l \
    app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') bash
Inspect the current MON map and save the IP addresses of the failed Ceph Monitors for further usage:
ceph mon dump
Remove all entries of the failed Ceph Monitors using the previously saved letters:

ceph mon rm <monLetter>

Substitute <monLetter> with the corresponding letter of a failed Ceph Monitor.
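For example, if e and f are the failed Ceph Monitors:

ceph mon rm e
ceph mon rm f
ceph mon dump   # only the healthy Ceph Monitors must remain in the output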
Exit the ceph-tools pod.
Remove all entries of the failed Ceph Monitors from the Rook mon endpoints ConfigMap:

Open the rook-ceph/rook-ceph-mon-endpoints ConfigMap for editing:

kubectl -n rook-ceph edit cm rook-ceph-mon-endpoints
Remove all entries of the failed Ceph Monitors from the ConfigMap data and update the maxMonId value with the current number of Running Ceph Monitors. For example, rook-ceph-mon-endpoints has the following data:

data:
  csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["172.0.0.222:6789","172.0.0.223:6789","172.0.0.224:6789","172.16.52.217:6789","172.16.52.216:6789"]}]'
  data: a=172.0.0.222:6789,b=172.0.0.223:6789,c=172.0.0.224:6789,f=172.0.0.217:6789,e=172.0.0.216:6789
  mapping: '{"node":{
    "a":{"Name":"kaas-node-21465871-42d0-4d56-911f-7b5b95cb4d34","Hostname":"kaas-node-21465871-42d0-4d56-911f-7b5b95cb4d34","Address":"172.16.52.222"},
    "b":{"Name":"kaas-node-43991b09-6dad-40cd-93e7-1f02ed821b9f","Hostname":"kaas-node-43991b09-6dad-40cd-93e7-1f02ed821b9f","Address":"172.16.52.223"},
    "c":{"Name":"kaas-node-15225c81-3f7a-4eba-b3e4-a23fd86331bd","Hostname":"kaas-node-15225c81-3f7a-4eba-b3e4-a23fd86331bd","Address":"172.16.52.224"},
    "e":{"Name":"kaas-node-ba3bfa17-77d2-467c-91eb-6291fb219a80","Hostname":"kaas-node-ba3bfa17-77d2-467c-91eb-6291fb219a80","Address":"172.16.52.216"},
    "f":{"Name":"kaas-node-6f669490-f0c7-4d19-bf73-e51fbd6c7672","Hostname":"kaas-node-6f669490-f0c7-4d19-bf73-e51fbd6c7672","Address":"172.16.52.217"}}
    }'
  maxMonId: "5"
If e and f are the letters of the failed Ceph Monitors, the resulting ConfigMap data must be as follows:

data:
  csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["172.0.0.222:6789","172.0.0.223:6789","172.0.0.224:6789"]}]'
  data: a=172.0.0.222:6789,b=172.0.0.223:6789,c=172.0.0.224:6789
  mapping: '{"node":{
    "a":{"Name":"kaas-node-21465871-42d0-4d56-911f-7b5b95cb4d34","Hostname":"kaas-node-21465871-42d0-4d56-911f-7b5b95cb4d34","Address":"172.16.52.222"},
    "b":{"Name":"kaas-node-43991b09-6dad-40cd-93e7-1f02ed821b9f","Hostname":"kaas-node-43991b09-6dad-40cd-93e7-1f02ed821b9f","Address":"172.16.52.223"},
    "c":{"Name":"kaas-node-15225c81-3f7a-4eba-b3e4-a23fd86331bd","Hostname":"kaas-node-15225c81-3f7a-4eba-b3e4-a23fd86331bd","Address":"172.16.52.224"}}
    }'
  maxMonId: "3"
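To review the result without reopening the editor, print the ConfigMap again:

kubectl -n rook-ceph get cm rook-ceph-mon-endpoints -o yaml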
Back up the data of the failed Ceph Monitors one by one:
SSH to the node of a failed Ceph Monitor using the previously saved IP address.
Move the Ceph Monitor data directory to another place:
mv /var/lib/rook/mon-<letter> /var/lib/rook/mon-<letter>.backup
Close the SSH connection.
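Put together, backing up one failed Ceph Monitor may look as follows. The letter e and the node IP address are taken from the example above; <user> is a placeholder for your SSH user, and root privileges may be required for the move:

ssh <user>@172.16.52.216
mv /var/lib/rook/mon-e /var/lib/rook/mon-e.backup
exit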
Scale the rook-ceph/rook-ceph-operator deployment up to 1 replica:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1
Wait until all Ceph Monitors are in the Running state:

kubectl -n rook-ceph get pod -l app=rook-ceph-mon -w
Restore the data from the backup for each recovered Ceph Monitor one by one:
Enter a recovered Ceph Monitor pod:
kubectl -n rook-ceph exec -it <monPodName> bash
Substitute <monPodName> with the recovered Ceph Monitor pod name. For example, rook-ceph-mon-g-845d44b9c6-fjc5d.

Recover the mon data backup for the current Ceph Monitor:

ceph-monstore-tool /var/lib/rook/mon-<letter>.backup/data store-copy /var/lib/rook/mon-<letter>/data/
Substitute <letter> with the current Ceph Monitor pod letter, for example, e.
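For instance, with the backup created earlier for the Ceph Monitor e, the command becomes:

ceph-monstore-tool /var/lib/rook/mon-e.backup/data store-copy /var/lib/rook/mon-e/data/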
Verify the Ceph state. The output must indicate the desired number of Ceph Monitors, and all of them must be in quorum:
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
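As an additional check beyond this procedure, you can inspect the quorum details from the same ceph-tools pod:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph quorum_status -f json-pretty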