Back up Swarm
MKE manager nodes store the swarm state and manager logs in the /var/lib/docker/swarm/ directory. Swarm raft logs contain crucial information for recreating Swarm-specific resources, including services, secrets, configurations, and node cryptographic identity. The directory also contains the keys used to encrypt the raft logs; you must have these keys to restore the swarm.
Because logs contain node IP address information and are not transferable to other nodes, you must perform a manual backup on each manager node. If you do not back up the raft logs, you cannot verify workloads or Swarm resource provisioning after restoring the cluster.
Note
You can avoid performing a Swarm backup by storing stacks, services definitions, secrets, and networks definitions in a source code management or config management tool.
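For example, if your stack and network definitions are kept in a version-controlled repository, you can recreate the workloads on a rebuilt cluster directly from those files rather than from a Swarm backup. The repository URL, file name, and stack name below are hypothetical:
# Fetch the versioned definitions and redeploy them on the cluster
git clone https://git.example.com/infra/stacks.git
docker stack deploy -c stacks/web-stack.yml web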
Data | Backed up | Description
---|---|---
Raft keys | Yes | Keys used to encrypt communication between Swarm nodes and to encrypt and decrypt raft logs
Membership | Yes | List of the nodes in the cluster
Services | Yes | Stacks and services stored in Swarm mode
Overlay networks | Yes | Overlay networks created on the cluster
Configs | Yes | Configs created in the cluster
Secrets | Yes | Secrets saved in the cluster
Swarm unlock key | No | Secret key needed to unlock a manager after its Docker daemon restarts
To back up Swarm:
Note
All commands that follow must be prefixed with sudo or executed from a superuser shell by first running sudo sh.
If auto-lock is enabled, retrieve your Swarm unlock key. Refer to Rotate the unlock key in the Docker documentation for more information; an example command follows the note below.
Optional. Mirantis recommends that you run at least three manager nodes to achieve high availability, as you must stop the engine of the manager node before performing the backup. A majority of managers must be online for a cluster to be operational. If you have fewer than three managers, the cluster will be unavailable during the backup.
Note
While a manager is shut down, your swarm is more likely to lose quorum if further nodes are lost. A loss of quorum renders the swarm unavailable until quorum is recovered. Quorum is only recovered when more than 50% of the nodes become available. If you regularly take down managers when performing backups, consider running a 5-manager swarm, as this will enable you to lose an additional manager while the backup is running, without disrupting services.
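To retrieve the unlock key mentioned in the first step, you can use the docker swarm unlock-key command on a manager node before stopping MCR, for example:
docker swarm unlock-key -q
The -q flag prints only the key itself. Store the key somewhere safe outside the cluster.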
Select a manager node other than the leader to avoid a new election inside the cluster:
docker node ls -f "role=manager" | tail -n+2 | grep -vi leader
Optional. Store the Mirantis Container Runtime (MCR) version in a variable to easily add it to your backup name.
ENGINE=$(docker version -f '{{.Server.Version}}')
Stop MCR on the manager node before backing up the data, so that no data is changed during the backup:
systemctl stop docker
Back up the /var/lib/docker/swarm directory:
tar cvzf "/tmp/swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz" /var/lib/docker/swarm/
You can decode the Unix epoch in the file name by typing date -d @timestamp:
date -d @1531166143
Mon Jul 9 19:55:43 UTC 2018
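Optionally, verify that the archive is readable before restarting MCR by listing its contents. The file name below is a placeholder for the archive created in the previous step:
tar tzf /tmp/swarm-<version>-<hostname>-<timestamp>.tgz | head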
If auto-lock is enabled, unlock the swarm:
docker swarm unlock
Restart MCR on the manager node:
systemctl start docker
Repeat the above steps for each manager node.
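As a final check, you can confirm that every manager has rejoined the cluster and that one of them reports as Leader, for example:
docker node ls -f "role=manager"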