You may need to restore the Cassandra database after a hardware or software failure.
Warning
During the automatic and manual restore procedures, all current Cassandra data is deleted. Therefore, starting from the MCP 2019.2.5 maintenance update, a database backup in Cassandra is not created before the restore procedure. If a backup of current data is required, you can create an instant backup. For details, see: OpenContrail 4.x: Create an instant backup of a Cassandra database.
To restore the Cassandra database:
Log in to the Salt Master node.
Open your project Git repository with the Reclass model on the cluster level.
Add the following snippet to cluster/<cluster_name>/infra/backup/client_cassandra.yml:
cassandra:
  backup:
    client:
      enabled: true
      restore_latest: 1
      restore_from: remote
where:
restore_latest can have, for example, the following values:
1, which means restoring the database from the last complete backup.
2, which means restoring the database from the second latest complete backup.
restore_from can have the local or remote values. The remote value uses scp to get the files from the cassandra server.
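The meaning of restore_latest can be illustrated with a small sketch. It is not the code that the cassandra state runs. It only shows the "N-th most recent backup" selection, assuming that each backup is a directory entry whose name sorts chronologically (for example, a timestamp). The pick_backup function and its arguments are hypothetical names introduced for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the N-th most recent entry of a backup directory.
# Assumption: entry names sort chronologically (for example, timestamps)
# and contain no whitespace or newlines.
pick_backup() {
    backup_root=$1      # directory that holds one entry per backup
    restore_latest=$2   # 1 = newest, 2 = second newest, and so on

    # Reject empty, non-numeric, and zero values.
    case $restore_latest in
        ''|*[!0-9]*|0)
            echo "restore_latest must be a positive integer" >&2
            return 2
            ;;
    esac

    # List the entries, newest name first, and keep only line N.
    chosen=$(ls -1 "$backup_root" | sort -r | sed -n "${restore_latest}p")
    if [ -z "$chosen" ]; then
        echo "fewer than $restore_latest backups in $backup_root" >&2
        return 1
    fi
    printf '%s\n' "$chosen"
}
```

With entries named 20240101, 20240102, and 20240103, calling pick_backup with 1 prints the newest name and calling it with 2 prints the one before it, which mirrors the two example values described above.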
Proceed either with automatic restore steps using the Jenkins web UI pipeline or with manual restore steps:
Automatic restore steps:
Verify that the following class is present in cluster/cicd/control/leader.yml:
classes:
- system.jenkins.client.job.deploy.update.restore_cassandra
If you manually add this class, apply the changes:
salt -C 'I@jenkins:client' state.sls jenkins.client
Log in to the Jenkins web UI.
Open the cassandra - restore pipeline.
Specify the following parameters:
Parameter | Description and values |
---|---|
SALT_MASTER_CREDENTIALS | The Salt Master credentials to use for connection, defaults to |
SALT_MASTER_URL | The Salt Master node host URL with the |
Click Deploy.
Manual restore steps:
Stop the supervisor-database service on the OpenContrail control nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-database'
Remove the Cassandra files on OpenContrail control nodes:
salt -C 'I@opencontrail:control' cmd.run 'rm -rf /var/lib/configdb/*'
Start the supervisor-database service on the Cassandra client backup node:
salt -C 'I@cassandra:backup:client' cmd.run 'doctrail controller systemctl start contrail-database'
Run the cassandra state:
salt -C 'I@cassandra:backup:client' cmd.run "su root -c 'salt-call state.sls cassandra'"
This state restores the databases and creates a file in /var/backups/cassandra/dbrestored.
Caution
If you rerun the state, it will not restore the database again. To repeat the restore procedure, first delete the /var/backups/cassandra/dbrestored file and then rerun the cassandra state.
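The marker-file behavior described in this caution can be sketched as follows. This is a minimal illustration meant to run on the Cassandra backup client node, not part of the cassandra state itself. The function names and the MARKER variable are hypothetical. Only the path comes from the procedure, and the sketch assumes that dbrestored is a regular file.

```shell
#!/bin/sh
# Path taken from the procedure above; override MARKER only for testing.
MARKER=${MARKER:-/var/backups/cassandra/dbrestored}

# Return 0 if a previous run already left the marker file, 1 otherwise.
restore_already_done() {
    [ -e "$MARKER" ]
}

# Remove the marker so that the next run of the cassandra state restores again.
# Prints what it did and succeeds even when there is no marker to remove.
clear_restore_marker() {
    if restore_already_done; then
        rm -f -- "$MARKER" && echo "removed $MARKER"
    else
        echo "no marker at $MARKER, nothing to remove"
    fi
}
```

From the Salt Master node, the same deletion could presumably be issued with the targeting used elsewhere in this procedure, for example salt -C 'I@cassandra:backup:client' cmd.run 'rm /var/backups/cassandra/dbrestored', before rerunning the cassandra state.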
Reboot the Cassandra backup client role node first:
salt -C 'I@cassandra:backup:client' system.reboot
Reboot the other OpenContrail control nodes:
salt -C 'I@opencontrail:control and not I@cassandra:backup:client' system.reboot
Wait for 60 seconds and restart the supervisor-database service on the OpenContrail control nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl restart contrail-database'
Wait for 60 seconds and verify that OpenContrail is in correct state on control nodes:
salt -C 'I@opencontrail:control' cmd.run 'doctrail controller contrail-status'
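Reading the contrail-status output by eye across several nodes is error-prone, so the verification can be assisted with a small filter. The sketch below rests on assumptions about the output format: service lines are expected to look like "name: state", and both active and backup are treated as healthy states. The unhealthy_services function is a hypothetical helper, not a Contrail or Salt command.

```shell
#!/bin/sh
# Read contrail-status output on stdin and print every service line whose
# state is neither "active" nor "backup". Exit status is 0 only if no such
# line is found.
# Assumption: service lines look like "name:   state ..."; lines with a
# single field (such as minion names printed by salt) and section headers
# are ignored.
unhealthy_services() {
    awk '
        NF >= 2 && $1 ~ /:$/ && $2 != "active" && $2 != "backup" {
            print
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Possible usage on the Salt Master node:
#   salt -C 'I@opencontrail:control' cmd.run \
#       'doctrail controller contrail-status' | unhealthy_services
```

An empty result with a zero exit status suggests that all listed services report a healthy state. Any printed line points at a service to inspect before considering the restore complete.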