Back up Swarm

MKE manager nodes store the swarm state and manager logs in the /var/lib/docker/swarm/ directory. The Swarm raft logs contain crucial information for recreating Swarm-specific resources, including services, secrets, configurations, and the cryptographic identity of each node. This data includes the keys used to encrypt the raft logs, and you must have these keys to restore the swarm.

Because the raft logs contain node IP address information and are not transferable to other nodes, you must perform a manual backup on each manager node. If you do not back up the raft logs, you cannot verify workloads or Swarm resource provisioning after restoring the cluster.
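
The backup procedure below archives this entire directory. Its exact layout varies by MCR version, but on a typical manager node it resembles the following (the output shown is illustrative):

    ls /var/lib/docker/swarm/
    certificates  docker-state.json  raft  state.json  worker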

Note

You can avoid performing a Swarm backup by storing stack, service, secret, and network definitions in a source code management or config management tool.
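
For example, a stack definition kept under version control captures services and networks declaratively, and can be redeployed with docker stack deploy -c stack.yml mystack. The file below is a minimal, hypothetical illustration, not part of the backup procedure:

    # stack.yml -- a minimal, illustrative stack definition kept in Git
    version: "3.8"
    services:
      web:
        image: nginx:1.25
        deploy:
          replicas: 2
        networks:
          - frontend
    networks:
      frontend:
        driver: overlay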

Swarm backup contents

Data               Backed up   Description
-----------------  ----------  ------------------------------------------------------------
Raft keys          Yes         Keys used to encrypt communication between Swarm nodes and
                               to encrypt and decrypt raft logs
Membership         Yes         List of the nodes in the cluster
Services           Yes         Stacks and services stored in Swarm mode
Overlay networks   Yes         Overlay networks created on the cluster
Configs            Yes         Configs created in the cluster
Secrets            Yes         Secrets saved in the cluster
Swarm unlock key   No          Secret key needed to unlock a manager after its Docker
                               daemon restarts

To back up Swarm:

Note

All commands that follow must be prefixed with sudo, or executed from a superuser shell (for example, by first running sudo sh).

  1. If auto-lock is enabled, retrieve your Swarm unlock key. Refer to Rotate the unlock key in the Docker documentation for more information.

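    For example, the following command prints only the current key, which you should then store somewhere safe outside the cluster:

    docker swarm unlock-key -q
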
  2. Optional. Mirantis recommends that you run at least three manager nodes to achieve high availability, as you must stop MCR on the manager node before performing the backup. A majority of managers must be online for a cluster to be operational, so if you have fewer than three managers, the cluster will be unavailable during the backup.

    Note

    While a manager is shut down, your swarm is more likely to lose quorum if additional nodes are lost. A loss of quorum renders the swarm unavailable until quorum is recovered, which occurs only when more than 50% of the managers become available again. If you regularly take down managers to perform backups, consider running a five-manager swarm, as this enables you to lose one additional manager while the backup is running without disrupting services.

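    Before you begin, you can check how many managers are currently reachable with a filtered node listing, for example:

    docker node ls --filter role=manager --format '{{.Hostname}} {{.ManagerStatus}}'
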
  3. Select a manager node other than the leader, so that the backup does not trigger a new leader election inside the cluster. The following command lists all manager nodes except the leader:

    docker node ls -f "role=manager" | tail -n+2 | grep -vi leader
    
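    The output lists the remaining manager nodes, one per line; the IDs, hostnames, and versions shown below are purely illustrative:

    1bw2scqop7jvh3dqzvyyrkrig   manager-2   Ready   Active   Reachable   23.0.9
    x9u4jolyxvbm02mdqtq4o5yve   manager-3   Ready   Active   Reachable   23.0.9
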
  4. Optional. Store the Mirantis Container Runtime (MCR) version in a variable to easily add it to your backup name.

    ENGINE=$(docker version -f '{{.Server.Version}}')
    
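    You can confirm the captured value with echo; the version shown below is illustrative and depends on your MCR release:

    echo "${ENGINE}"
    23.0.9
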
  5. Stop MCR on the manager node before backing up the data, so that no data is changed during the backup:

    systemctl stop docker
    
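    On systemd-based hosts, you can verify that the service is stopped before proceeding:

    systemctl is-active docker
    inactive
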
  6. Back up the /var/lib/docker/swarm directory:

    tar cvzf "/tmp/swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz" /var/lib/docker/swarm/
    

    You can decode the Unix epoch in the file name by typing date -d @timestamp:

    date -d @1531166143
    Mon Jul  9 19:55:43 UTC 2018
    
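    Before restarting MCR, you may also want to verify the archive and copy it off the node. In the following sketch, backup-host and /backups/ are placeholders for your own backup destination:

    tar tzf /tmp/swarm-*.tgz | head
    scp /tmp/swarm-*.tgz user@backup-host:/backups/
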
  7. Restart MCR on the manager node:

    systemctl start docker
    
  8. If auto-lock is enabled, unlock the swarm with the key that you retrieved in step 1:

    docker swarm unlock
    
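    You can then confirm that the manager has rejoined the cluster by checking that its manager status reads Reachable (or Leader):

    docker node ls -f "role=manager"
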
  9. Repeat the above steps for each manager node.