After you replace a KVM node as described in Replace a failed KVM node, if your new KVM node has the same IP address, proceed with recovering GlusterFS as described below.
To recover GlusterFS on a replaced KVM node:
Log in to the Salt Master node.
Define the IP addresses of the failed KVM node and of any working KVM node that runs the GlusterFS cluster services. For example:
FAILED_NODE_IP=<IP_of_failed_kvm_node>
WORKING_NODE_IP=<IP_of_working_kvm_node>
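Optionally, verify that the Salt minions on both nodes respond before proceeding. This is a quick sanity check, assuming both KVM nodes are registered with the Salt Master:
salt -S $FAILED_NODE_IP test.ping
salt -S $WORKING_NODE_IP test.ping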
If the failed node has been recovered with the old disk and GlusterFS installed:
Remove the /var/lib/glusterd directory:
salt -S $FAILED_NODE_IP file.remove '/var/lib/glusterd'
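Optionally, confirm that the directory is gone. This check uses the standard Salt file module and should return False:
salt -S $FAILED_NODE_IP file.directory_exists /var/lib/glusterd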
Restart glusterfs-server:
salt -S $FAILED_NODE_IP service.restart glusterfs-server
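Optionally, confirm that the service is running again. This check uses the standard Salt service module:
salt -S $FAILED_NODE_IP service.status glusterfs-server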
Configure glusterfs-server on the failed node:
salt -S $FAILED_NODE_IP state.apply glusterfs.server.service
Remove the failed node from the GlusterFS cluster:
salt -S $WORKING_NODE_IP cmd.run "gluster peer detach $FAILED_NODE_IP"
Re-add the failed node to the GlusterFS cluster with a new ID:
salt -S $WORKING_NODE_IP cmd.run "gluster peer probe $FAILED_NODE_IP"
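Optionally, verify that the failed node has rejoined the cluster with a new UUID and is reported as Peer in Cluster (Connected):
salt -S $WORKING_NODE_IP cmd.run "gluster peer status"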
Finalize the configuration of the failed node by applying the highstate:
salt -S $FAILED_NODE_IP state.apply
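If you want to preview the pending changes before applying the full highstate, you can run the same command in test mode first, for example:
salt -S $FAILED_NODE_IP state.apply test=True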
Set the correct trusted.glusterfs.volume-id attribute in the GlusterFS directories on the failed node. The following loop reads the attribute value of each volume directory under /srv/glusterfs on the working node and applies it to the matching directory on the failed node:
for vol in $(salt --out=txt -S $WORKING_NODE_IP cmd.run 'for dir in /srv/glusterfs/*; \
do echo -n "${dir}@0x"; getfattr -n trusted.glusterfs.volume-id \
--only-values --absolute-names $dir | xxd -g0 -p; done' | awk -F: '{print $2}'); \
do VOL_PATH=$(echo $vol | cut -d@ -f1); TRUST_ID=$(echo $vol | cut -d@ -f2); \
salt -S $FAILED_NODE_IP cmd.run "setfattr -n trusted.glusterfs.volume-id -v $TRUST_ID $VOL_PATH"; \
done
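Optionally, verify that the attribute was set by repeating the same check on the failed node:
salt -S $FAILED_NODE_IP cmd.run 'for dir in /srv/glusterfs/*; do echo -n "${dir}: 0x"; getfattr -n trusted.glusterfs.volume-id --only-values --absolute-names $dir | xxd -g0 -p; done'
The reported IDs must be identical to the ones on the working node.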
Restart glusterfs-server:
salt -S $FAILED_NODE_IP service.restart glusterfs-server
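Finally, you can verify the overall GlusterFS cluster health from the working node, for example:
salt -S $WORKING_NODE_IP cmd.run "gluster peer status"
salt -S $WORKING_NODE_IP cmd.run "gluster volume status"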