You may need to restore a ZooKeeper database after a hardware or software failure.
To restore a ZooKeeper database:
Log in to the Salt Master node.
Open your project Git repository with the Reclass model on the cluster level.
Add the following snippet to cluster/<cluster_name>/infra/backup/client_zookeeper.yml:
zookeeper:
  backup:
    client:
      enabled: true
      restore_latest: 1
      restore_from: remote
where:
restore_latest can have, for example, the following values:
1, which means restoring the database from the last complete backup.
2, which means restoring the database from the second latest complete backup.
restore_from can have the local or remote values. The remote value uses scp to get the files from the zookeeper server.
Proceed either with automatic restore steps using the Jenkins web UI pipeline or with manual restore steps:
Automatic restore steps:
Verify that the following class is present in cluster/cicd/control/leader.yml:
classes:
- system.jenkins.client.job.deploy.update.restore_zookeeper
If you manually add this class, apply the changes:
salt -C 'I@jenkins:client' state.sls jenkins.client
Log in to the Jenkins web UI.
Open the zookeeper - restore pipeline.
Specify the following parameters:
Parameter | Description and values |
---|---|
SALT_MASTER_CREDENTIALS | The Salt Master credentials to use for connection, defaults to salt. |
SALT_MASTER_URL | The Salt Master node host URL with the salt-api port, defaults to the jenkins_salt_api_url parameter. For example, http://172.18.170.27:6969. |
Click Deploy.
Manual restore steps:
Stop the config services on the OpenContrail control nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-api'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-schema'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-svc-monitor'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-device-manager'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-config-nodemgr'
Stop the control services on the OpenContrail control nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-control'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-named'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-dns'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-control-nodemgr'
Stop the zookeeper service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller service zookeeper stop'
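The three stop steps above repeat one command per service. As a minimal sketch, they could be wrapped in a loop that keeps the same order: config services, then control services, then ZooKeeper. The `run` helper, the `stop_services` function, and the `DRY_RUN` variable are illustrative names, not part of MCP or Salt. With `DRY_RUN=1` (the default in this sketch) the commands are only printed, without their shell quoting, so that nothing is stopped by accident.

```shell
#!/usr/bin/env bash
# Sketch only: stop the OpenContrail services in the order used above.
# DRY_RUN, run() and stop_services() are hypothetical helpers.
DRY_RUN="${DRY_RUN:-1}"
TARGET='I@opencontrail:control'

run() {
  # Print the command in dry-run mode; otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

stop_services() {
  local svc
  # Config services first, then control services.
  for svc in contrail-api contrail-schema contrail-svc-monitor \
             contrail-device-manager contrail-config-nodemgr \
             contrail-control contrail-named contrail-dns \
             contrail-control-nodemgr; do
    run salt -C "$TARGET" cmd.run "doctrail controller systemctl stop $svc"
  done
  # ZooKeeper last, with the same command as in the step above.
  run salt -C "$TARGET" cmd.run 'doctrail controller service zookeeper stop'
}

stop_services
```

Running the script with `DRY_RUN=0` on the Salt Master node would execute the commands instead of printing them.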
Remove the ZooKeeper files from the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'rm -rf /var/lib/config_zookeeper_data/version-2/*'
Run the zookeeper state.
Mirantis recommends running the following command using Linux GNU Screen or alternatives.
salt -C 'I@opencontrail:control' state.apply zookeeper.backup
This state restores the databases and creates a file in /var/backups/zookeeper/dbrestored.
Caution
If you rerun the state, it will not restore the database again. To repeat the restore procedure, first delete the /var/backups/zookeeper/dbrestored file and then rerun the zookeeper state.
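The repeat procedure from the caution can be sketched as two commands: remove the marker file, then reapply the state. This sketch assumes the marker file is on each minion's own filesystem at the path given above. The `run` helper, the `rerun_restore` function, and the `DRY_RUN` variable are illustrative names, not part of MCP or Salt. By default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch only: repeat the ZooKeeper restore after a previous run.
# DRY_RUN, run() and rerun_restore() are hypothetical helpers.
DRY_RUN="${DRY_RUN:-1}"
TARGET='I@opencontrail:control'
MARKER='/var/backups/zookeeper/dbrestored'

run() {
  # Print the command in dry-run mode; otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

rerun_restore() {
  # Assumption: the marker is a regular file on the minion itself.
  run salt -C "$TARGET" cmd.run "rm -f $MARKER"
  # Reapply the same state as in the restore step above.
  run salt -C "$TARGET" state.apply zookeeper.backup
}

rerun_restore
```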
Verify that the ZooKeeper files are present again on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller ls /var/lib/zookeeper/version-2'
Start the zookeeper service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller service zookeeper start'
Start the config services on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-api'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-schema'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-svc-monitor'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-device-manager'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-config-nodemgr'
Start the control services on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-control'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-named'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-dns'
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-control-nodemgr'
Wait 60 seconds and verify that OpenContrail is in the correct state on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller contrail-status'
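Reading the status output for several nodes by eye is error-prone, so a small filter may help. The sketch below prints only the service lines that do not report `active` or `backup`. It rests on two assumptions that should be checked against the actual output: each service line starts with a name beginning with `contrail-`, and the state is the second whitespace-separated field. The function name `unhealthy_services` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: filter contrail-status output for services that are
# not healthy. unhealthy_services() is a hypothetical helper.

unhealthy_services() {
  # Reads status output on stdin.
  # Assumption: a service line looks like "<contrail-name> <state> ...".
  # Prints offending lines; returns 1 if any were found, 0 otherwise.
  awk '
    $1 ~ /^contrail-/ && $2 != "active" && $2 != "backup" {
      print
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Usage on the Salt Master node, after the services have been started:
#   sleep 60
#   salt -C 'I@opencontrail:control' cmd.run \
#     'doctrail controller contrail-status' | unhealthy_services
```

An empty result with exit status 0 suggests that every matched service reports `active` or `backup`. Any printed line needs a closer look in the full output.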