Ceph disaster recovery¶
This section describes how to recover a failed or accidentally removed Ceph cluster in the following cases:
- If the Ceph Controller underlying a running Rook Ceph cluster has failed and you want to install a new Ceph Controller Helm release and recover the failed Ceph cluster onto the new Ceph Controller.
- To migrate the data of an existing Ceph cluster to a new deployment in cases when downtime can be tolerated.
Consider the common state of a failed or removed Ceph cluster:
- The rook-ceph namespace does not contain pods, or the pods are in the Terminating state.
- The rook-ceph and/or ceph-lcm-mirantis namespaces are in the Terminating state.
- The ceph-operator Helm release is in the FAILED state:
  - Management cluster: the state of the ceph-operator Helm release in the management HelmBundle, such as default/kaas-mgmt, has switched from DEPLOYED to FAILED.
  - Managed cluster: the state of the osh-system/ceph-operator HelmBundle, or a related namespace, has switched from DEPLOYED to FAILED.
- The Rook CephCluster, CephBlockPool, and CephObjectStore CRs in the rook-ceph namespace cannot be found or have the deletionTimestamp parameter in the metadata section.
Note
Prior to recovering the Ceph cluster, verify that your deployment meets the following prerequisites:
- The Ceph cluster fsid exists.
- The Ceph cluster Monitor keyrings exist.
- The Ceph cluster devices exist and include the data previously handled by Ceph OSDs.
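If the rook-ceph-mon secret still exists in the cluster, one way to confirm that the fsid is still available is to decode it from that secret. This check is a sketch and not part of the original procedure:

kubectl -n rook-ceph get secret rook-ceph-mon -o jsonpath='{.data.fsid}' | base64 -d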
Ceph cluster recovery workflow¶
1. Create a backup of the remaining data and resources.
2. Clean up the failed or removed ceph-operator Helm release.
3. Deploy a new ceph-operator Helm release with the previously used KaaSCephCluster and one Ceph Monitor.
4. Replace the ceph-mon data with the old cluster data.
5. Replace the fsid in secrets/rook-ceph-mon with the old one.
6. Fix the Monitor map in the ceph-mon database.
7. Fix the Ceph Monitor authentication key and disable authentication.
8. Start the restored cluster and inspect the recovery.
9. Fix the admin authentication key and enable authentication.
10. Restart the cluster.
Recover a failed or removed Ceph cluster¶
Back up the remaining resources. Skip the commands for the resources that have already been removed:
kubectl -n rook-ceph get cephcluster <clusterName> -o yaml > backup/cephcluster.yaml
# perform this for each cephblockpool
kubectl -n rook-ceph get cephblockpool <cephBlockPool-i> -o yaml > backup/<cephBlockPool-i>.yaml
# perform this for each client
kubectl -n rook-ceph get cephclient <cephclient-i> -o yaml > backup/<cephclient-i>.yaml
kubectl -n rook-ceph get cephobjectstore <cephObjectStoreName> -o yaml > backup/<cephObjectStoreName>.yaml
# perform this for each secret
kubectl -n rook-ceph get secret <secret-i> -o yaml > backup/<secret-i>.yaml
# perform this for each configMap
kubectl -n rook-ceph get cm <cm-i> -o yaml > backup/<cm-i>.yaml
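Because the number of secrets and ConfigMaps can be large, a small shell loop can back them all up in one pass. The following is a sketch that assumes a bash-compatible shell and a writable backup/ directory:

mkdir -p backup
# Back up every secret in the rook-ceph namespace.
for secret in $(kubectl -n rook-ceph get secret -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n rook-ceph get secret "${secret}" -o yaml > "backup/secret-${secret}.yaml"
done
# Back up every ConfigMap in the rook-ceph namespace.
for cm in $(kubectl -n rook-ceph get cm -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n rook-ceph get cm "${cm}" -o yaml > "backup/cm-${cm}.yaml"
done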
SSH to each node where the Ceph Monitors or Ceph OSDs were placed before the failure and back up the valuable data:
mv /var/lib/rook /var/lib/rook.backup
mv /etc/ceph /etc/ceph.backup
mv /etc/rook /etc/rook.backup
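While still on the node, it can also help to note which Monitor directories the backup contains, because a healthy mon-<ID> directory from this backup is reused later in this procedure:

ls -d /var/lib/rook.backup/mon-*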
Once done, close the SSH connection.
Clean up the previous installation of ceph-operator. For details, see Rook documentation: Cleaning up a cluster.

Delete the ceph-lcm-mirantis/ceph-controller deployment:

kubectl -n ceph-lcm-mirantis delete deployment ceph-controller
Delete all deployments, DaemonSets, and jobs from the rook-ceph namespace, if any:

kubectl -n rook-ceph delete deployment --all
kubectl -n rook-ceph delete daemonset --all
kubectl -n rook-ceph delete job --all
Edit the MiraCeph and MiraCephLog CRs in the ceph-lcm-mirantis namespace and remove the finalizer parameter from the metadata section:

kubectl -n ceph-lcm-mirantis edit miraceph
kubectl -n ceph-lcm-mirantis edit miracephlog
Edit the CephCluster, CephBlockPool, CephClient, and CephObjectStore CRs in the rook-ceph namespace and remove the finalizer parameter from the metadata section:

kubectl -n rook-ceph edit cephclusters
kubectl -n rook-ceph edit cephblockpools
kubectl -n rook-ceph edit cephclients
kubectl -n rook-ceph edit cephobjectstores
kubectl -n rook-ceph edit cephobjectusers
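If editing each resource interactively is inconvenient, clearing the finalizers with a merge patch achieves the same result. The resource name below is a placeholder, and the same pattern applies to the other Rook CR types and the MiraCeph resources edited above:

# Clear the finalizers of one CephCluster resource non-interactively.
kubectl -n rook-ceph patch cephcluster <clusterName> --type merge -p '{"metadata":{"finalizers":null}}'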
Once you have cleaned up every resource related to the Ceph release, open the Cluster CR for editing:

kubectl -n <projectName> edit cluster <clusterName>
Substitute <projectName> with default for the management cluster or with a related project name for the managed cluster.

Remove the ceph-controller Helm release item from the spec.providerSpec.value.helmReleases array and save the Cluster CR:

- name: ceph-controller
  values: {}
Verify that ceph-controller has disappeared from the corresponding HelmBundle:

kubectl -n <projectName> get helmbundle -o yaml
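For example, a quick way to confirm that no ceph-controller release remains in the HelmBundle output (substitute <projectName> as described above):

kubectl -n <projectName> get helmbundle -o yaml | grep "name: ceph-controller"
# No output means the release has been removed.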
Open the KaaSCephCluster CR of the related management or managed cluster for editing:

kubectl -n <projectName> edit kaascephcluster
Substitute <projectName> with default for the management cluster or with a related project name for the managed cluster.

Edit the node roles so that the entire nodes spec contains only one mon role, then save KaaSCephCluster. An illustrative sketch follows this step.
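For illustration only, a nodes section with a single mon role may look similar to the following. The exact field layout depends on your KaaSCephCluster version, so mirror the structure that already exists in the CR; the machine names below are placeholders and any storageDevices definitions stay unchanged:

spec:
  cephClusterSpec:
    nodes:
      <machineName-1>:
        roles:
        - mon
        - mgr
      <machineName-2>:
        roles: []
      <machineName-3>:
        roles: []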
Open the Cluster CR for editing:

kubectl -n <projectName> edit cluster <clusterName>
Substitute <projectName> with default for the management cluster or with a related project name for the managed cluster.

Add ceph-controller back to spec.providerSpec.value.helmReleases to restore the ceph-controller Helm release. Save Cluster after editing.

- name: ceph-controller
  values: {}
Verify that the ceph-controller Helm release is deployed:

Inspect the Rook Operator logs and wait until the orchestration has settled:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify that in the rook-ceph namespace, the rook-ceph-mon-a, rook-ceph-mgr-a, and all auxiliary pods are up and running, and that no rook-ceph-osd-<ID>-xxxxxx pods are running:

kubectl -n rook-ceph get pod
Verify the Ceph state. The output must indicate that one mon and one mgr are running, all Ceph OSDs are down, and all PGs are in the Unknown state:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
Note
Rook should not start any Ceph OSD daemon because all devices belong to the old cluster, which has a different fsid. To verify the Ceph OSD daemons, inspect the osd-prepare pod logs:

kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare
Connect to the terminal of the rook-ceph-mon-a pod:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
    -l app=rook-ceph-mon -o jsonpath='{.items[0].metadata.name}') -- bash
Output the keyring file and save it for further usage:

cat /etc/ceph/keyring-store/keyring
exit
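The saved keyring typically contains two sections similar to the following (keys truncated and purely illustrative). The [mon.] section is reused when rebuilding the Monitor data below, while the [client.admin] section is imported into the restored cluster later in this procedure:

[mon.]
    key = AQD...==
    caps mon = "allow *"
[client.admin]
    key = AQD...==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"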
Obtain and save the nodeName of mon-a for further usage:

kubectl -n rook-ceph get pod $(kubectl -n rook-ceph get pod \
    -l app=rook-ceph-mon -o jsonpath='{.items[0].metadata.name}') -o jsonpath='{.spec.nodeName}'
Obtain and save the cephImage used in the Ceph cluster for further usage:

kubectl -n ceph-lcm-mirantis get cm ccsettings -o jsonpath='{.data.cephImage}'
Stop Rook Operator by scaling the deployment replicas to 0:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0
Remove the Rook deployments generated with Rook Operator:
kubectl -n rook-ceph delete deploy -l app=rook-ceph-mon
kubectl -n rook-ceph delete deploy -l app=rook-ceph-mgr
kubectl -n rook-ceph delete deploy -l app=rook-ceph-osd
kubectl -n rook-ceph delete deploy -l app=rook-ceph-crashcollector
Using the saved nodeName, SSH to the host where rook-ceph-mon-a is placed in the new Kubernetes cluster and perform the following steps:

Remove /var/lib/rook/mon-a or copy it to another folder:

mv /var/lib/rook/mon-a /var/lib/rook/mon-a.new
Pick a healthy rook-ceph-mon-<ID> directory (/var/lib/rook.backup/mon-<ID>) from the previous backup and copy it to /var/lib/rook/mon-a:

cp -rp /var/lib/rook.backup/mon-<ID> /var/lib/rook/mon-a
Substitute <ID> with the ID of any healthy mon node of the old cluster.

Replace /var/lib/rook/mon-a/keyring with the previously saved keyring, preserving only the [mon.] section. Remove the [client.admin] section.

Run a Docker container using the previously saved cephImage:

docker run -it --rm -v /var/lib/rook:/var/lib/rook <cephImage> bash
Inside the container, create /etc/ceph/ceph.conf for stable operation of ceph-mon:

touch /etc/ceph/ceph.conf
Change the directory to /var/lib/rook and edit the monmap by replacing the existing mon hosts with the new mon-a endpoints:

cd /var/lib/rook
rm /var/lib/rook/mon-a/data/store.db/LOCK        # make sure the quorum lock file does not exist
ceph-mon --extract-monmap monmap --mon-data ./mon-a/data   # extract the monmap from the old ceph-mon database and save it as 'monmap'
monmaptool --print monmap                        # print the monmap content, which reflects the old cluster ceph-mon configuration
monmaptool --rm a monmap                         # delete 'a' from the monmap
monmaptool --rm b monmap                         # repeat and delete 'b' from the monmap
monmaptool --rm c monmap                         # repeat this pattern until all old ceph-mon entries are removed from the monmap
monmaptool --addv a [v2:<nodeIP>:3300,v1:<nodeIP>:6789] monmap   # add the new mon 'a' with the rook-ceph-mon-a endpoints; <nodeIP> is explained below
ceph-mon --inject-monmap monmap --mon-data ./mon-a/data    # replace the monmap in the ceph-mon database with the modified version
rm monmap
exit
Substitute <nodeIP> with the IP address of the current <nodeName> node.

Close the SSH connection.
Change the fsid back to the original one so that Rook runs as the old cluster:

kubectl -n rook-ceph edit secret/rook-ceph-mon
Note
The fsid is base64 encoded and must not contain a trailing carriage return. For example:

echo -n a811f99a-d865-46b7-8f2c-f94c064e4356 | base64   # Replace with the fsid of the old cluster.
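If you prefer a non-interactive change, a merge patch on the secret achieves the same result. This is a sketch; <base64EncodedFsid> is the value produced by the echo command above:

kubectl -n rook-ceph patch secret rook-ceph-mon --type merge -p '{"data":{"fsid":"<base64EncodedFsid>"}}'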
Scale the ceph-lcm-mirantis/ceph-controller deployment replicas to 0:

kubectl -n ceph-lcm-mirantis scale deployment ceph-controller --replicas 0
Disable authentication:
Open the cm/rook-config-override ConfigMap for editing:

kubectl -n rook-ceph edit cm/rook-config-override
Add the following content:
data:
  config: |
    [global]
    ...
    auth cluster required = none
    auth service required = none
    auth client required = none
    auth supported = none
Start Rook Operator by scaling its deployment replicas to 1:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1
Inspect the Rook Operator logs and wait until the orchestration has settled:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify that in the rook-ceph namespace, the rook-ceph-mon-a, rook-ceph-mgr-a, and all auxiliary pods are up and running, and that the number of running rook-ceph-osd-<ID>-xxxxxx pods is greater than zero:

kubectl -n rook-ceph get pod
Verify the Ceph state. The output must indicate that one mon, one mgr, and all Ceph OSDs are up and running, and that all PGs are in either the Active or Degraded state:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
    -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
Enter the ceph-tools pod and import the authentication key:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
    -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- bash
vi key
# paste the keyring content saved before, preserving only the [client.admin] section
ceph auth import -i key
rm key
exit
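Before removing the key file and leaving the pod, you can optionally confirm that the import succeeded; the following command prints the imported client.admin key and its capabilities:

ceph auth get client.admin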
Stop Rook Operator by scaling the deployment to 0 replicas:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0
Re-enable authentication:
Open the cm/rook-config-override ConfigMap for editing:

kubectl -n rook-ceph edit cm/rook-config-override
Remove the following content:
data:
  config: |
    [global]
    ...
    auth cluster required = none
    auth service required = none
    auth client required = none
    auth supported = none
Remove all Rook deployments generated with Rook Operator:
kubectl -n rook-ceph delete deploy -l app=rook-ceph-mon
kubectl -n rook-ceph delete deploy -l app=rook-ceph-mgr
kubectl -n rook-ceph delete deploy -l app=rook-ceph-osd
kubectl -n rook-ceph delete deploy -l app=rook-ceph-crashcollector
Start Ceph Controller by scaling its deployment replicas to 1:

kubectl -n ceph-lcm-mirantis scale deployment ceph-controller --replicas 1
Start Rook Operator by scaling its deployment replicas to 1:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1
Inspect the Rook Operator logs and wait until the orchestration has settled:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify that in the rook-ceph namespace, the rook-ceph-mon-a, rook-ceph-mgr-a, and all auxiliary pods are up and running, and that the number of running rook-ceph-osd-<ID>-xxxxxx pods is greater than zero:

kubectl -n rook-ceph get pod
Verify the Ceph state. The output must indicate that one mon, one mgr, and all Ceph OSDs are up and running, and that the overall stored data size equals the data size of the old cluster:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
Edit the MiraCeph CR and add two more mon and mgr roles to the corresponding nodes:

kubectl -n ceph-lcm-mirantis edit miraceph
Inspect the Rook namespace and wait until all Ceph Monitors are in the Running state:

kubectl -n rook-ceph get pod -l app=rook-ceph-mon
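To follow the Monitor pods as they start, the -w flag keeps the listing updated until you interrupt it:

kubectl -n rook-ceph get pod -l app=rook-ceph-mon -w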
Verify the Ceph state. The output must indicate that three mon (three in quorum), one mgr, and all Ceph OSDs are up and running, and that the overall stored data size equals the data size of the old cluster:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s
Once done, the data from the failed or removed Ceph cluster is restored and ready to use.