UCP Disaster Recovery

Disaster recovery procedures should be performed in the following order:

  1. Docker Swarm
  2. UCP disaster recovery (this topic)
  3. DTR disaster recovery

UCP disaster recovery

If half or more of the manager nodes are lost and cannot be recovered to a healthy state, the system has lost quorum and can only be restored through the following disaster recovery procedure.
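
The quorum condition follows from Raft majority arithmetic: with N managers, at least floor(N/2) + 1 must be healthy. A minimal sketch of that check in Python; the function names are illustrative and not part of any Docker tooling.

```python
def quorum_size(total_managers: int) -> int:
    """Smallest number of healthy managers that still forms a majority."""
    if total_managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return total_managers // 2 + 1


def has_quorum(total_managers: int, healthy_managers: int) -> bool:
    """True while a strict majority of the managers is healthy."""
    return healthy_managers >= quorum_size(total_managers)


if __name__ == "__main__":
    # (total managers, managers lost)
    for total, lost in [(3, 1), (3, 2), (4, 2), (5, 2), (5, 3)]:
        state = "kept" if has_quorum(total, total - lost) else "lost"
        print(f"{total} managers, {lost} lost: quorum {state}")
```

Losing exactly half of an even number of managers (for example 2 of 4) already breaks quorum, which is why the text says "half or more".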

Recover a UCP cluster from an existing backup

  1. If UCP is still installed on the swarm, uninstall UCP using the uninstall-ucp command.
     Note: If the restore is happening on new machines, skip this step.
  2. Perform a restore from an existing backup. If there is an existing swarm, the restore operation must be performed on a manager node. If no swarm exists, the restore can be performed on any node and will create a new swarm.
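
The two steps above can be sketched as a small Python helper that assembles the corresponding command lines without running them. This is an illustration under assumptions, not a definitive procedure: the image tag is a placeholder, /path/to/backup.tar is a hypothetical path, and the flags should be checked against the CLI reference for your UCP version.

```python
import shlex
from typing import List

# Assumption: replace <version> with the UCP image tag that matches the backup.
UCP_IMAGE = "docker/ucp:<version>"

# Options shared by both invocations: remove the container on exit, name it,
# and mount the Docker socket so the UCP tool can talk to the local engine.
_BASE = [
    "docker", "container", "run", "--rm", "--name", "ucp",
    "-v", "/var/run/docker.sock:/var/run/docker.sock",
]


def uninstall_command(image: str = UCP_IMAGE) -> List[str]:
    """Step 1: argv that uninstalls UCP from an existing swarm."""
    return _BASE + ["-it", image, "uninstall-ucp", "--interactive"]


def restore_command(image: str = UCP_IMAGE) -> List[str]:
    """Step 2: argv for the restore; the backup tar is supplied on stdin."""
    return _BASE + ["-i", image, "restore"]


if __name__ == "__main__":
    # Printed for review only; nothing is executed here.
    print(shlex.join(uninstall_command()))
    # Hypothetical backup location, redirected into the restore container.
    print(shlex.join(restore_command()) + " < /path/to/backup.tar")
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; on new machines only the second command applies.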

Recreate objects within orchestrators that Docker Enterprise supports

Kubernetes currently backs up the declarative state of Kubernetes objects in etcd. For Swarm, however, there is no way to export the state to a declarative format, because the objects embedded in the Swarm Raft logs are not easily transferable to other nodes or clusters.

For disaster recovery, recreating Swarm workloads requires the original scripts used for deployment. Alternatively, you can recreate workloads manually from the output of docker inspect commands.
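
As a rough illustration of the manual route, the Python sketch below turns saved `docker service inspect` JSON into a `docker service create` command line. It is an assumption-laden starting point rather than a complete converter: the helper name is hypothetical, it assumes every listed port has a published port, and it covers only the service name, mode, environment variables, published ports, and image. Mounts, networks, secrets, configs, and constraints would need to be added by hand.

```python
import json
import shlex
from typing import List


def service_create_command(inspect_json: str) -> List[str]:
    """Build a `docker service create` argv from `docker service inspect` JSON."""
    spec = json.loads(inspect_json)[0]["Spec"]
    container = spec["TaskTemplate"]["ContainerSpec"]
    argv = ["docker", "service", "create", "--name", spec["Name"]]

    mode = spec.get("Mode") or {}
    if mode.get("Replicated") is not None:
        argv += ["--replicas", str(mode["Replicated"]["Replicas"])]
    elif "Global" in mode:
        argv += ["--mode", "global"]

    for env in container.get("Env") or []:
        argv += ["--env", env]

    for port in (spec.get("EndpointSpec") or {}).get("Ports") or []:
        argv += [
            "--publish",
            f"published={port['PublishedPort']},target={port['TargetPort']},"
            f"protocol={port.get('Protocol', 'tcp')}",
        ]

    # Inspect output pins the image by digest (name:tag@sha256:...).
    # Only name:tag is kept here, so the digest pin is dropped.
    argv.append(container["Image"].split("@", 1)[0])
    return argv


if __name__ == "__main__":
    # Hand-written sample shaped like `docker service inspect` output.
    sample = json.dumps([{
        "Spec": {
            "Name": "web",
            "TaskTemplate": {"ContainerSpec": {
                "Image": "nginx:1.25@sha256:abc", "Env": ["MODE=prod"]}},
            "Mode": {"Replicated": {"Replicas": 3}},
            "EndpointSpec": {"Ports": [
                {"Protocol": "tcp", "TargetPort": 80, "PublishedPort": 8080}]},
        }
    }])
    print(shlex.join(service_create_command(sample)))
```

This only helps if the inspect output was captured before the cluster was lost, so it is worth saving that output alongside regular backups.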