MKE disaster recovery

If half or more of the manager nodes are lost and cannot be recovered to a healthy state, the cluster has lost quorum and you must restore it using the following procedure.
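The quorum rule above can be sketched as simple arithmetic: a swarm with N managers needs a majority of floor(N/2) + 1 healthy managers. The function names below are hypothetical, and the script assumes that you supply the manager counts yourself, because a swarm without quorum may not answer node queries.

```shell
#!/bin/sh
# Majority needed for a Raft quorum among N managers: floor(N/2) + 1.
# $1 = total number of managers.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Prints "lost" when the healthy managers fall below the majority, else "ok".
# $1 = total managers, $2 = managers still healthy (both supplied by you).
quorum_status() {
    if [ "$2" -lt "$(quorum_needed "$1")" ]; then
        echo "lost"
    else
        echo "ok"
    fi
}

quorum_status 3 2   # one of three managers lost
quorum_status 3 1   # two of three managers lost
quorum_status 4 2   # half of four managers lost
```

The three calls print ok, lost, and lost, which matches the rule that losing half or more of the managers breaks quorum.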

Note

Perform Swarm disaster recovery procedures prior to those described here.

Recover an MKE cluster from an existing backup

  1. If MKE is still installed on the swarm, uninstall MKE:

    Note

    Skip this step when restoring MKE on new machines.

    docker container run -it --rm -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:<mke-version> uninstall-ucp -i
    

    Replace <mke-version> with the MKE version of your backup.

  2. Confirm that you want to uninstall MKE.

    Example output:

    INFO[0000] Detected UCP instance tgokpm55qcx4s2dsu1ssdga92
    INFO[0000] We're about to uninstall UCP from this Swarm cluster
    Do you want to proceed with the uninstall? (y/n):
    
  3. Restore MKE from the existing backup as described in Restore MKE. If the swarm exists, restore MKE on a manager node. Otherwise, restore MKE on any node, and the swarm will be created automatically during the restore procedure.
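As a rough illustration of the restore step, the sketch below wraps the restore invocation in a small function. It assumes an unencrypted backup tar file that is readable on the node, and the function name restore_mke and its arguments are hypothetical. The Restore MKE procedure remains the authoritative source for the exact options your backup requires.

```shell
#!/bin/sh
# Hypothetical wrapper; the function name and arguments are illustrative.
# $1 = MKE version of the backup, $2 = path to the backup tar file.
restore_mke() {
    # The backup is streamed to the restore command on standard input.
    docker container run --rm -i --name ucp \
        -v /var/run/docker.sock:/var/run/docker.sock \
        "mirantis/ucp:$1" restore < "$2"
}
```

A call would take the form restore_mke followed by the MKE version of your backup and the path to the backup file. Run it on a manager node if the swarm exists, or on any node otherwise.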

Recreate Kubernetes and Swarm objects

For Kubernetes, MKE backs up the declarative state of Kubernetes objects in etcd.

For Swarm, the state cannot be exported to a declarative format, because the objects embedded in the Swarm Raft logs are not easily transferable to other nodes or clusters.

To recreate Swarm workloads, refer to the original scripts used for deployment. Alternatively, recreate the workloads manually from the output of docker inspect commands.
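One way to keep inspect output on hand is to save each service specification as JSON. The sketch below assumes a manager node where the Swarm API still responds, so in practice it would be run before an outage. The function name export_service_specs and the output directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: saves the spec of every Swarm service as a JSON file.
# $1 = directory that receives one <service-name>.json file per service.
export_service_specs() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1
    # List service names, then dump the Spec section of each service.
    docker service ls --format '{{.Name}}' | while IFS= read -r name; do
        [ -n "$name" ] || continue
        docker service inspect --format '{{json .Spec}}' "$name" \
            > "$out_dir/$name.json"
    done
}
```

The saved files are a reference for recreating each service by hand; they are not a format that Swarm can import directly. On a node where only local containers are reachable, docker container inspect provides container-level configuration instead.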