You may need to restore the Cassandra database after a hardware or software failure.
To restore the Cassandra database:
Log in to the Salt Master node.
Add the following lines to cluster/opencontrail/control_init.yml:
cassandra:
  backup:
    client:
      enabled: true
      restore_latest: 1
      restore_from: remote
where:
restore_latest can have, for example, the following values: 1 to restore the database from the latest complete backup, or 2 to restore it from the second latest complete backup.
restore_from can be set to local or remote. The remote value uses scp to get the files from the cassandra server.
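For example, a variant of this pillar that restores the second latest complete backup from files already available on the local node (a sketch based on the value descriptions above) would look like:
cassandra:
  backup:
    client:
      enabled: true
      restore_latest: 2
      restore_from: local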
Proceed either with automatic restore steps using the Jenkins web UI pipeline or with manual restore steps:
Automatic restore steps:
Add the restore pipeline to DriveTrain:
Add the following lines to cluster/cicd/control/leader.yml:
classes:
- system.jenkins.client.job.deploy.update.restore_cassandra
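Depending on how your Reclass model changes are synchronized, you may also need to refresh the pillar data on the Jenkins client node before applying the state, for example:
salt -C 'I@jenkins:client' saltutil.refresh_pillar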
Apply the jenkins.client state:
salt -C 'I@jenkins:client' state.sls jenkins.client
Log in to the Jenkins web UI.
Open the cassandra - restore pipeline.
Specify the following parameters:
Parameter | Description and values
---|---
SALT_MASTER_CREDENTIALS | The Salt Master credentials to use for connection, defaults to
SALT_MASTER_URL | The Salt Master node host URL with the
Click Deploy.
Manual restore steps:
Stop the neutron-server service to prevent CRUD API calls during the restore procedure:
salt -C 'I@neutron:server' service.stop neutron-server
Stop the supervisor-config service, since the OpenContrail configuration services are actively using the Cassandra ConfigDB:
salt -C 'I@opencontrail:control' service.stop supervisor-config
Stop the supervisor-database service on the OpenContrail control nodes:
salt -C 'I@opencontrail:control' service.stop supervisor-database
Remove the Cassandra files on the OpenContrail control nodes:
salt -C 'I@opencontrail:control' cmd.run 'rm -rf /var/lib/cassandra/*'
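Optionally, verify that the Cassandra data directories are now empty before you proceed, for example:
salt -C 'I@opencontrail:control' cmd.run 'ls -la /var/lib/cassandra/'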
Start the supervisor-database service on the Cassandra client backup node and wait until the supervisor-database service is up:
salt -C 'I@cassandra:backup:client' service.start supervisor-database
salt -C 'I@cassandra:backup:client' cmd.run "service supervisor-database status"
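If you want to script the wait, a minimal polling sketch (assuming the init script reports running once the service is up; adjust the pattern to your init system's output) could look like:
until salt -C 'I@cassandra:backup:client' cmd.run 'service supervisor-database status' | grep -qi running; do
  sleep 10
done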
Verify the Cassandra cluster status and make sure that all nodes are in the normal UN state:
salt -C 'I@cassandra:backup:client' cmd.run 'nodetool status'
Apply the cassandra state:
salt -C 'I@cassandra:backup:client' state.sls cassandra
This state restores the databases and creates the /var/backups/cassandra/dbrestored file.
Caution
If you rerun the state, it will not restore the database again. To repeat the restore procedure, first delete the /var/backups/cassandra/dbrestored file and then rerun the cassandra state.
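For example, to repeat the restore as described in the caution above:
salt -C 'I@cassandra:backup:client' cmd.run 'rm /var/backups/cassandra/dbrestored'
salt -C 'I@cassandra:backup:client' state.sls cassandra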
Verify data load on the Cassandra cluster nodes:
salt -C 'I@cassandra:backup:client' cmd.run 'nodetool status'
In the nodetool status command output, verify that the Load column has similar values on all nodes. Cassandra should replicate the data across all cluster nodes.
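If you prefer a condensed view, you can extract just the address and load of each node (a sketch that assumes the standard nodetool status column layout):
salt -C 'I@cassandra:backup:client' cmd.run "nodetool status | awk '/^UN/ {print \$2, \$3, \$4}'"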
Start the supervisor-config service:
salt -C 'I@opencontrail:control' service.start supervisor-config
Wait for 60 seconds and verify that OpenContrail is in the correct state on the control nodes:
salt -C 'I@opencontrail:control' cmd.run 'contrail-status'
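To quickly spot services that are not yet up, you can filter the output; the exact status strings depend on your OpenContrail version, so treat this as a sketch:
salt -C 'I@opencontrail:control' cmd.run 'contrail-status | grep -vw active'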
Start the neutron-server service:
salt -C 'I@neutron:server' service.start neutron-server
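Optionally, confirm that the service is running again using the service.status function:
salt -C 'I@neutron:server' service.status neutron-server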