You may need to restore a ZooKeeper database after a hardware or software failure.
To restore a ZooKeeper database:
Log in to the Salt Master node.
Add the following lines to cluster/opencontrail/control.yml:
zookeeper:
  backup:
    client:
      enabled: true
      restore_latest: 1
      restore_from: remote
where:
restore_latest can have, for example, the following values:
  1, which means restoring the database from the last complete backup.
  2, which means restoring the database from the second latest complete backup.
restore_from can have the local or remote values. The remote value uses scp to get the files from the zookeeper server.
Proceed either with automatic restore steps using the Jenkins web UI pipeline or with manual restore steps:
Automatic restore steps:
Add the restore pipeline to DriveTrain:
Add the following lines to cluster/cicd/control/leader.yml:
classes:
  - system.jenkins.client.job.deploy.update.restore_zookeeper
Apply the jenkins.client state:
salt -C 'I@jenkins:client' state.sls jenkins.client
Log in to the Jenkins web UI.
Open the zookeeper - restore pipeline.
Specify the following parameters:
Parameter | Description and values |
---|---|
SALT_MASTER_CREDENTIALS | The Salt Master credentials to use for connection, defaults to salt. |
SALT_MASTER_URL | The Salt Master node host URL with the salt-api port, defaults to the jenkins_salt_api_url parameter. For example, http://172.18.170.27:6969. |
Click Deploy.
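The first two automatic steps (adding the class and applying the jenkins.client state) could be guarded with a small check like the sketch below. The function name add_restore_pipeline and its path argument are hypothetical. Pass the full path to cluster/cicd/control/leader.yml as it exists in your own model, since that location is not given here.

```shell
#!/usr/bin/env bash
# Sketch only: confirm that the restore class is listed in leader.yml,
# then apply the jenkins.client state. The function name and the path
# argument are hypothetical and not part of the documented procedure.

RESTORE_CLASS='system.jenkins.client.job.deploy.update.restore_zookeeper'

add_restore_pipeline() {
  leader_yml="$1"   # full path to cluster/cicd/control/leader.yml in your model

  # Refuse to continue if the class line has not been added yet.
  if ! grep -qF -- "- $RESTORE_CLASS" "$leader_yml"; then
    echo "class $RESTORE_CLASS not found in $leader_yml" >&2
    return 1
  fi

  # Same command as in the step above.
  salt -C 'I@jenkins:client' state.sls jenkins.client
}
```

Calling add_restore_pipeline with the path to leader.yml on the Salt Master node applies the state only when the class is present. Otherwise it prints a message and returns a non-zero status.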
Manual restore steps:
Stop the supervisor-config service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' service.stop supervisor-config
Stop the supervisor-control service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' service.stop supervisor-control
Stop the zookeeper service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' service.stop zookeeper
Remove the ZooKeeper files on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'rm -rf /var/lib/zookeeper/version-2/*'
Run the zookeeper state.
Mirantis recommends running the following command using Linux GNU Screen or alternatives.
salt -C 'I@opencontrail:control' cmd.run "su root -c 'salt-call state.sls zookeeper'"
This state restores the databases and creates the /var/backups/zookeeper/dbrestored file.
Caution
If you rerun the state, it will not restore the database again. To repeat the restore procedure, first delete the /var/backups/zookeeper/dbrestored file and then rerun the zookeeper state.
Start the zookeeper service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' service.start zookeeper
Start the supervisor-config service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' service.start supervisor-config
Start the supervisor-control service on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' service.start supervisor-control
Verify that the ZooKeeper files are present again on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'ls /var/lib/zookeeper/version-2'
Wait 60 seconds and verify that OpenContrail is in the correct state on the OpenContrail controller nodes:
salt -C 'I@opencontrail:control' cmd.run 'contrail-status'
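The manual steps above can be sketched as a set of shell functions that keep the stop and start order explicit. This is an illustration under stated assumptions, not a supported script. The function names and the WAIT_SECONDS variable are hypothetical. The check_restore_pillar preflight is also an addition: it uses the standard Salt saltutil.refresh_pillar and pillar.get functions to show the restore settings before anything is stopped. All other commands are the ones listed in the steps.

```shell
#!/usr/bin/env bash
# Sketch of the manual ZooKeeper restore sequence. Function names and
# WAIT_SECONDS are hypothetical. Only defines functions; nothing runs
# until one of them is called on the Salt Master node.

TARGET='I@opencontrail:control'

# Optional preflight (an assumption, not a documented step):
# refresh the pillar and print the restore settings.
check_restore_pillar() {
  salt -C "$TARGET" saltutil.refresh_pillar &&
  salt -C "$TARGET" pillar.get zookeeper:backup:client
}

# Stop the OpenContrail services first, then ZooKeeper.
stop_services() {
  for svc in supervisor-config supervisor-control zookeeper; do
    salt -C "$TARGET" service.stop "$svc" || return 1
  done
}

# Remove the current data, then run the zookeeper state, which restores the backup.
restore_data() {
  salt -C "$TARGET" cmd.run 'rm -rf /var/lib/zookeeper/version-2/*' &&
  salt -C "$TARGET" cmd.run "su root -c 'salt-call state.sls zookeeper'"
}

# Start ZooKeeper first, then the OpenContrail services.
start_services() {
  for svc in zookeeper supervisor-config supervisor-control; do
    salt -C "$TARGET" service.start "$svc" || return 1
  done
}

# Delete the marker file so that a later run of the zookeeper state restores again.
allow_repeat_restore() {
  salt -C "$TARGET" cmd.run 'rm -f /var/backups/zookeeper/dbrestored'
}

# Full sequence, ending with the two verification commands.
restore_zookeeper() {
  stop_services && restore_data && start_services || return 1
  salt -C "$TARGET" cmd.run 'ls /var/lib/zookeeper/version-2'
  sleep "${WAIT_SECONDS:-60}"
  salt -C "$TARGET" cmd.run 'contrail-status'
}
```

After sourcing the file, a typical run would be check_restore_pillar followed by restore_zookeeper, ideally inside GNU Screen as recommended above. If a restore has to be repeated, call allow_repeat_restore before running restore_zookeeper again.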