Ceph disaster recovery
This section describes how to recover a failed or accidentally removed Ceph cluster in the following cases:
- If Ceph Controller underlying a running Rook Ceph cluster has failed and you want to install a new Ceph Controller Helm release and recover the failed Ceph cluster onto the new Ceph Controller.
- To migrate the data of an existing Ceph cluster to a new Container Cloud or Mirantis OpenStack for Kubernetes (MOSK) deployment in case downtime can be tolerated.
Consider the common state of a failed or removed Ceph cluster:
- The rook-ceph namespace does not contain pods, or the pods are in the Terminating state.
- The rook-ceph and/or ceph-lcm-mirantis namespaces are in the Terminating state.
- The ceph-operator is in the FAILED state:
  - For Container Cloud: the state of the ceph-operator Helm release in the management HelmBundle, such as default/kaas-mgmt, has switched from DEPLOYED to FAILED.
  - For MOSK: the state of the osh-system/ceph-operator HelmBundle, or a related namespace, has switched from DEPLOYED to FAILED.
- The Rook CephCluster, CephBlockPool, and CephObjectStore CRs in the rook-ceph namespace cannot be found or have the deletionTimestamp parameter in the metadata section.
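To get a quick picture of which of these symptoms apply to your deployment, you can inspect the namespaces, pods, custom resources, and HelmBundle before starting the recovery. The following commands are only a minimal sketch; resource and namespace names may differ in your environment:

   # Check whether the Ceph namespaces still exist or are stuck in Terminating
   kubectl get namespace rook-ceph ceph-lcm-mirantis
   # Check which Rook pods remain and in which state
   kubectl -n rook-ceph get pod
   # Check whether the Rook CRs are missing or carry a deletionTimestamp
   kubectl -n rook-ceph get cephcluster,cephblockpool,cephobjectstore -o yaml | grep -i deletionTimestamp
   # Check the ceph-operator release state in the management HelmBundle (Container Cloud example)
   kubectl -n default get helmbundle kaas-mgmt -o yaml | grep -i -B2 -A2 ceph-operator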
Note
Prior to recovering the Ceph cluster, verify that your deployment meets the following prerequisites:
- The Ceph cluster fsid exists.
- The Ceph cluster Monitor keyrings exist.
- The Ceph cluster devices exist and include the data previously handled by Ceph OSDs.
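A single command cannot validate all of these prerequisites, but the following sketch shows one way to spot-check them. It assumes the default Rook data directory /var/lib/rook and that the rook-ceph-mon secret still exists; paths and device layout may differ in your environment:

   # The old cluster fsid, if the rook-ceph-mon secret is still present (stored base64-encoded)
   kubectl -n rook-ceph get secret rook-ceph-mon -o jsonpath='{.data.fsid}' | base64 -d; echo
   # On a node that hosted a Ceph Monitor: the Monitor store and keyring must have survived
   ls /var/lib/rook/mon-*/data/store.db
   ls /var/lib/rook/mon-*/keyring
   # On a node that hosted Ceph OSDs: the LVM volumes created for the OSDs must still be present
   lsblk
   lvs -o +lv_tags | grep -i ceph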
Overview of the recovery procedure workflow:
1. Create a backup of the remaining data and resources.
2. Clean up the failed or removed ceph-operator Helm release.
3. Deploy a new ceph-operator Helm release with the previously used KaaSCephCluster and one Ceph Monitor.
4. Replace the ceph-mon data with the old cluster data.
5. Replace fsid in secrets/rook-ceph-mon with the old one.
6. Fix the Monitor map in the ceph-mon database.
7. Fix the Ceph Monitor authentication key and disable authentication.
8. Start the restored cluster and inspect the recovery.
9. Fix the admin authentication key and enable authentication.
10. Restart the cluster.
To recover a failed or removed Ceph cluster:
1. Back up the remaining resources. Skip the commands for the resources that have already been removed:

   kubectl -n rook-ceph get cephcluster <clusterName> -o yaml > backup/cephcluster.yaml
   # perform this for each cephblockpool
   kubectl -n rook-ceph get cephblockpool <cephBlockPool-i> -o yaml > backup/<cephBlockPool-i>.yaml
   # perform this for each client
   kubectl -n rook-ceph get cephclient <cephclient-i> -o yaml > backup/<cephclient-i>.yaml
   kubectl -n rook-ceph get cephobjectstore <cephObjectStoreName> -o yaml > backup/<cephObjectStoreName>.yaml
   # perform this for each secret
   kubectl -n rook-ceph get secret <secret-i> -o yaml > backup/<secret-i>.yaml
   # perform this for each configMap
   kubectl -n rook-ceph get cm <cm-i> -o yaml > backup/<cm-i>.yaml
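   If the cluster contains many pools, clients, secrets, and ConfigMaps, the per-resource commands above can be automated. The following loop is only a sketch; it assumes a local backup/ directory and that the listed resource types still exist in the rook-ceph namespace:

   mkdir -p backup
   for kind in cephcluster cephblockpool cephclient cephobjectstore secret cm; do
       for resource in $(kubectl -n rook-ceph get ${kind} -o name); do
           # -o name returns <kind>/<name>; replace the slash to build a valid file name
           kubectl -n rook-ceph get ${resource} -o yaml > backup/${resource//\//_}.yaml
       done
   done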
2. SSH to each node where the Ceph Monitors or Ceph OSDs were placed before the failure and back up the valuable data:

   mv /var/lib/rook /var/lib/rook.backup
   mv /etc/ceph /etc/ceph.backup
   mv /etc/rook /etc/rook.backup

   Once done, close the SSH connection.
3. Clean up the previous installation of ceph-operator. For details, see Rook documentation: Cleaning up a cluster.

   1. Delete the ceph-lcm-mirantis/ceph-controller deployment:

      kubectl -n ceph-lcm-mirantis delete deployment ceph-controller

   2. Delete all deployments, DaemonSets, and jobs from the rook-ceph namespace, if any:

      kubectl -n rook-ceph delete deployment --all
      kubectl -n rook-ceph delete daemonset --all
      kubectl -n rook-ceph delete job --all

   3. Edit the MiraCeph and MiraCephLog CRs in the ceph-lcm-mirantis namespace and remove the finalizer parameter from the metadata section:

      kubectl -n ceph-lcm-mirantis edit miraceph
      kubectl -n ceph-lcm-mirantis edit miracephlog

   4. Edit the CephCluster, CephBlockPool, CephClient, CephObjectStore, and CephObjectStoreUser CRs in the rook-ceph namespace and remove the finalizer parameter from the metadata section:

      kubectl -n rook-ceph edit cephclusters
      kubectl -n rook-ceph edit cephblockpools
      kubectl -n rook-ceph edit cephclients
      kubectl -n rook-ceph edit cephobjectstores
      kubectl -n rook-ceph edit cephobjectusers
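      If many CRs are stuck in the Terminating state, removing the finalizer through kubectl edit becomes tedious. The following loop is a sketch of a non-interactive alternative that clears the finalizers with kubectl patch; apply it only to the resources you intend to clean up:

      for resource in $(kubectl -n ceph-lcm-mirantis get miraceph,miracephlog -o name); do
          kubectl -n ceph-lcm-mirantis patch ${resource} --type merge -p '{"metadata":{"finalizers":[]}}'
      done
      for resource in $(kubectl -n rook-ceph get cephclusters,cephblockpools,cephclients,cephobjectstores,cephobjectusers -o name); do
          kubectl -n rook-ceph patch ${resource} --type merge -p '{"metadata":{"finalizers":[]}}'
      done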
   5. Once you clean up every single resource related to the Ceph release, open the Cluster CR for editing:

      kubectl -n <projectName> edit cluster <clusterName>

      Substitute <projectName> with default for the management cluster or with a related project name for the managed cluster.

   6. Remove the ceph-controller Helm release item from the spec.providerSpec.value.helmReleases array and save the Cluster CR:

      - name: ceph-controller
        values: {}

   7. Verify that ceph-controller has disappeared from the corresponding HelmBundle:

      kubectl -n <projectName> get helmbundle -o yaml
4. Open the KaaSCephCluster CR of the related management or managed cluster for editing:

   kubectl -n <projectName> edit kaascephcluster

   Substitute <projectName> with default for the management cluster or with a related project name for the managed cluster.

   Edit the roles of the nodes: the entire nodes spec must contain only one mon role, for example, as in the sketch below. Save KaaSCephCluster after editing.
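   The exact schema of the nodes section depends on your KaaSCephCluster version and machine names, so the following YAML is only an illustrative sketch of the intended end state, in which a single machine keeps the mon (and mgr) role and the remaining machines keep only their storage configuration. Keep all of your existing fields and adjust only the roles lists; the machine and device names below are placeholders:

   spec:
     cephClusterSpec:
       nodes:
         machine-1:              # the only node that keeps the mon role
           roles:
           - mon
           - mgr
           storageDevices:       # keep the existing device configuration unchanged
           - name: sdb
         machine-2:              # all other nodes: no mon role
           storageDevices:
           - name: sdb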
5. Open the Cluster CR for editing:

   kubectl -n <projectName> edit cluster <clusterName>

   Substitute <projectName> with default for the management cluster or with a related project name for the managed cluster.

   Add ceph-controller to spec.providerSpec.value.helmReleases to restore the ceph-controller Helm release and save the Cluster CR:

   - name: ceph-controller
     values: {}
6. Verify that the ceph-controller Helm release is deployed:

   1. Inspect the Rook Operator logs and wait until the orchestration has settled:

      kubectl -n rook-ceph logs -l app=rook-ceph-operator

   2. Verify that the rook-ceph-mon-a and rook-ceph-mgr-a pods and all the auxiliary pods in the rook-ceph namespace are up and running, and that no rook-ceph-osd-<ID>-xxxxxx pods are running:

      kubectl -n rook-ceph get pod

   3. Verify the Ceph state. The output must indicate that one mon and one mgr are running, all Ceph OSDs are down, and all PGs are in the Unknown state:

      kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s

      Note

      Rook should not start any Ceph OSD daemon because all devices belong to the old cluster, which has a different fsid. To verify the Ceph OSD daemons, inspect the osd-prepare pod logs:

      kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare
7. Connect to the terminal of the rook-ceph-mon-a pod:

   kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
     -l app=rook-ceph-mon -o jsonpath='{.items[0].metadata.name}') bash

8. Output the keyring file and save it for further usage:

   cat /etc/ceph/keyring-store/keyring
   exit
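   Alternatively, you can capture the keyring directly into the local backup directory without an interactive shell; a sketch using the backup/ directory created earlier:

   kubectl -n rook-ceph exec $(kubectl -n rook-ceph get pod \
     -l app=rook-ceph-mon -o jsonpath='{.items[0].metadata.name}') \
     -- cat /etc/ceph/keyring-store/keyring > backup/mon-keyring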
9. Obtain and save the nodeName of mon-a for further usage:

   kubectl -n rook-ceph get pod $(kubectl -n rook-ceph get pod \
     -l app=rook-ceph-mon -o jsonpath='{.items[0].metadata.name}') -o jsonpath='{.spec.nodeName}'

10. Obtain and save the cephImage used in the Ceph cluster for further usage:

    kubectl -n ceph-lcm-mirantis get cm ccsettings -o jsonpath='{.data.cephImage}'
11. Stop Rook Operator and scale the deployment replicas to 0:

    kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0

12. Remove the Rook deployments generated with Rook Operator:

    kubectl -n rook-ceph delete deploy -l app=rook-ceph-mon
    kubectl -n rook-ceph delete deploy -l app=rook-ceph-mgr
    kubectl -n rook-ceph delete deploy -l app=rook-ceph-osd
    kubectl -n rook-ceph delete deploy -l app=rook-ceph-crashcollector
13. Using the saved nodeName, SSH to the host where rook-ceph-mon-a is placed in the new Kubernetes cluster and perform the following steps:

    1. Remove /var/lib/rook/mon-a or copy it to another folder:

       mv /var/lib/rook/mon-a /var/lib/rook/mon-a.new

    2. Pick a healthy rook-ceph-mon-<ID> directory (/var/lib/rook.backup/mon-<ID>) from the previous backup and copy it to /var/lib/rook/mon-a:

       cp -rp /var/lib/rook.backup/mon-<ID> /var/lib/rook/mon-a

       Substitute <ID> with any healthy mon node ID of the old cluster.

    3. Replace /var/lib/rook/mon-a/keyring with the previously saved keyring, preserving only the [mon.] section. Remove the [client.admin] section. A sketch of the expected result follows this sub-step.
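       For reference, the resulting /var/lib/rook/mon-a/keyring typically contains only the monitor section. The key value below is a placeholder for the actual key from the keyring you saved earlier:

       [mon.]
           key = <monitor key from the saved keyring>
           caps mon = "allow *"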
    4. Run a Docker container using the previously saved cephImage:

       docker run -it --rm -v /var/lib/rook:/var/lib/rook <cephImage> bash

    5. Inside the container, create /etc/ceph/ceph.conf for a stable operation of ceph-mon:

       touch /etc/ceph/ceph.conf

    6. Change the directory to /var/lib/rook and edit monmap by replacing the existing mon hosts with the new mon-a endpoints:

       cd /var/lib/rook
       rm /var/lib/rook/mon-a/data/store.db/LOCK                        # make sure the quorum lock file does not exist
       ceph-mon --extract-monmap monmap --mon-data ./mon-a/data         # extract monmap from the old ceph-mon database and save it as monmap
       monmaptool --print monmap                                        # print the monmap content, which reflects the old cluster ceph-mon configuration
       monmaptool --rm a monmap                                         # delete `a` from monmap
       monmaptool --rm b monmap                                         # repeat and delete `b` from monmap
       monmaptool --rm c monmap                                         # repeat this pattern until all the old ceph-mon entries are removed from monmap
       monmaptool --addv a [v2:<nodeIP>:3300,v1:<nodeIP>:6789] monmap   # add the new rook-ceph-mon-a endpoints
       ceph-mon --inject-monmap monmap --mon-data ./mon-a/data          # replace monmap in the ceph-mon database with the modified version
       rm monmap
       exit

       Substitute <nodeIP> with the IP address of the current <nodeName> node; a way to look it up is sketched right after this step.

    7. Close the SSH connection.
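   To determine the <nodeIP> value used in the monmap edit, you can query the node object from any host with access to the cluster; a sketch using the previously saved nodeName:

   kubectl get node <nodeName> -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'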
14. Change fsid to the original one to run Rook as an old cluster:

    kubectl -n rook-ceph edit secret/rook-ceph-mon

    Note

    The fsid is base64 encoded and must not contain a trailing carriage return. For example:

    echo -n a811f99a-d865-46b7-8f2c-f94c064e4356 | base64   # replace with the fsid from the old cluster
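    If you prefer not to edit the secret interactively, the fsid can also be replaced with kubectl patch. A sketch, using the same example fsid as above:

    FSID_B64=$(echo -n a811f99a-d865-46b7-8f2c-f94c064e4356 | base64)   # replace with the fsid from the old cluster
    kubectl -n rook-ceph patch secret rook-ceph-mon -p '{"data":{"fsid":"'"${FSID_B64}"'"}}'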
15. Scale the ceph-lcm-mirantis/ceph-controller deployment replicas to 0:

    kubectl -n ceph-lcm-mirantis scale deployment ceph-controller --replicas 0

16. Disable authentication:

    1. Open the cm/rook-config-override ConfigMap for editing:

       kubectl -n rook-ceph edit cm/rook-config-override

    2. Add the following content:

       data:
         config: |
           [global]
           ...
           auth cluster required = none
           auth service required = none
           auth client required = none
           auth supported = none
17. Start Rook Operator by scaling its deployment replicas to 1:

    kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1

18. Inspect the Rook Operator logs and wait until the orchestration has settled:

    kubectl -n rook-ceph logs -l app=rook-ceph-operator

19. Verify that the rook-ceph-mon-a and rook-ceph-mgr-a pods and all the auxiliary pods in the rook-ceph namespace are up and running, and that the number of running rook-ceph-osd-<ID>-xxxxxx pods is greater than zero:

    kubectl -n rook-ceph get pod

20. Verify the Ceph state. The output must indicate that one mon, one mgr, and all Ceph OSDs are up and running, and that all PGs are in either the Active or Degraded state:

    kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
      -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
21. Enter the ceph-tools pod and import the authentication key:

    kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
      -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') bash
    vi key
    [paste the keyring content saved before, preserving only the `[client.admin]` section]
    ceph auth import -i key
    rm key
    exit
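    If pasting the keyring into vi inside the pod is inconvenient, you can instead copy a prepared file into the ceph-tools pod and import it from there. A sketch, assuming the [client.admin] section of the saved keyring is stored locally as backup/admin-keyring:

    TOOLS_POD=$(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}')
    kubectl -n rook-ceph cp backup/admin-keyring ${TOOLS_POD}:/tmp/admin-keyring
    kubectl -n rook-ceph exec -it ${TOOLS_POD} -- ceph auth import -i /tmp/admin-keyring
    kubectl -n rook-ceph exec -it ${TOOLS_POD} -- rm /tmp/admin-keyring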
22. Stop Rook Operator by scaling the deployment to 0 replicas:

    kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0

23. Re-enable authentication:

    1. Open the cm/rook-config-override ConfigMap for editing:

       kubectl -n rook-ceph edit cm/rook-config-override

    2. Remove the following content:

       data:
         config: |
           [global]
           ...
           auth cluster required = none
           auth service required = none
           auth client required = none
           auth supported = none
24. Remove all Rook deployments generated with Rook Operator:

    kubectl -n rook-ceph delete deploy -l app=rook-ceph-mon
    kubectl -n rook-ceph delete deploy -l app=rook-ceph-mgr
    kubectl -n rook-ceph delete deploy -l app=rook-ceph-osd
    kubectl -n rook-ceph delete deploy -l app=rook-ceph-crashcollector

25. Start Ceph Controller by scaling its deployment replicas to 1:

    kubectl -n ceph-lcm-mirantis scale deployment ceph-controller --replicas 1

26. Start Rook Operator by scaling its deployment replicas to 1:

    kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1

27. Inspect the Rook Operator logs and wait until the orchestration has settled:

    kubectl -n rook-ceph logs -l app=rook-ceph-operator

28. Verify that the rook-ceph-mon-a and rook-ceph-mgr-a pods and all the auxiliary pods in the rook-ceph namespace are up and running, and that the number of running rook-ceph-osd-<ID>-xxxxxx pods is greater than zero:

    kubectl -n rook-ceph get pod

29. Verify the Ceph state. The output must indicate that one mon, one mgr, and all Ceph OSDs are up and running, and that the overall stored data size equals the old cluster data size:

    kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
30. Edit the MiraCeph CR and add two more mon and mgr roles to the corresponding nodes:

    kubectl -n ceph-lcm-mirantis edit miraceph

31. Inspect the Rook namespace and wait until all Ceph Monitors are in the Running state:

    kubectl -n rook-ceph get pod -l app=rook-ceph-mon

32. Verify the Ceph state. The output must indicate that three mon (three in quorum), one mgr, and all Ceph OSDs are up and running, and that the overall stored data size equals the old cluster data size:

    kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
Once done, the data from the failed or removed Ceph cluster is restored and ready to use.