Recover GlusterFS on a replaced KVM node

After you replace a KVM node as described in Replace a failed KVM node, recover GlusterFS on the new node using the procedure below. The procedure applies only if the new KVM node has the same IP address as the failed one.

To recover GlusterFS on a replaced KVM node:

  1. Log in to the Salt Master node.

  2. Define the IP addresses of the failed KVM node and of any working KVM node that runs the GlusterFS cluster services. For example:

    FAILED_NODE_IP=<IP_of_failed_kvm_node>
    WORKING_NODE_IP=<IP_of_working_kvm_node>
    
  3. If the failed node has been recovered with its old disk and GlusterFS is already installed on it:

    1. Remove the /var/lib/glusterd directory:

      salt -S $FAILED_NODE_IP file.remove '/var/lib/glusterd'
      
    2. Restart glusterfs-server:

      salt -S $FAILED_NODE_IP service.restart glusterfs-server
      
  4. Configure glusterfs-server on the failed node:

    salt -S $FAILED_NODE_IP state.apply glusterfs.server.service
    
  5. Remove the failed node from the GlusterFS cluster:

    salt -S $WORKING_NODE_IP cmd.run "gluster peer detach $FAILED_NODE_IP"
    
  6. Re-add the failed node to the GlusterFS cluster with a new ID:

    salt -S $WORKING_NODE_IP cmd.run "gluster peer probe $FAILED_NODE_IP"
    
  7. Finalize the configuration of the failed node:

    salt -S $FAILED_NODE_IP state.apply
    
  8. Set the correct trusted.glusterfs.volume-id attribute in the GlusterFS directories on the failed node:

    for vol in $(salt --out=txt -S $WORKING_NODE_IP cmd.run 'for dir in /srv/glusterfs/*; \
    do echo -n "${dir}@0x"; getfattr  -n trusted.glusterfs.volume-id \
    --only-values --absolute-names $dir | xxd -g0 -p;done' | awk -F: '{print $2}'); \
    do VOL_PATH=$(echo $vol| cut -d@ -f1); TRUST_ID=$(echo $vol | cut -d@ -f2); \
    salt -S $FAILED_NODE_IP cmd.run "setfattr -n trusted.glusterfs.volume-id -v $TRUST_ID $VOL_PATH"; \
    done
    
  9. Restart glusterfs-server on the failed node:

    salt -S $FAILED_NODE_IP service.restart glusterfs-server
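
The command in step 8 packs each GlusterFS directory and its volume ID into a single token of the form <directory>@0x<hex_id> and then splits that token into VOL_PATH and TRUST_ID. The following minimal sketch illustrates only the splitting step, using shell parameter expansion instead of cut. The split_token function name and the sample token are hypothetical and serve for illustration only; real values come from getfattr on the working node.

```shell
#!/bin/sh
# Sketch: split a "<directory>@0x<hex_id>" token the way step 8 does.
# split_token is a hypothetical helper, not part of the procedure itself.

split_token() {
  token="$1"
  vol_path="${token%%@*}"   # everything before the first "@"
  trust_id="${token#*@}"    # everything after the first "@"
  printf '%s %s\n' "$vol_path" "$trust_id"
}

# Made-up sample token; a real one is produced on the working node.
sample='/srv/glusterfs/example@0x0123456789abcdef0123456789abcdef'
split_token "$sample"
```

Running the sketch on a token before applying step 8 can help confirm that the directory path and the 0x-prefixed ID are separated as expected, since setfattr receives them as two distinct arguments.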