OpenContrail 3.2: Restore a ZooKeeper database

You may need to restore a ZooKeeper database after a hardware or software failure.

To restore a ZooKeeper database:

  1. Log in to the Salt Master node.

  2. Add the following lines to cluster/opencontrail/control.yml:

    zookeeper:
      backup:
        client:
          enabled: true
          restore_latest: 1
          restore_from: remote
    

    where:

    • restore_latest can have, for example, the following values:

      • 1, which means restoring the database from the last complete backup.

      • 2, which means restoring the database from the second latest complete backup.

    • restore_from accepts either the local or the remote value. The remote value uses scp to get the files from the ZooKeeper server.
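
After editing the model, it can help to confirm that the new values actually reached the target nodes before starting the restore. The following is a minimal sketch, not part of the documented procedure. It assumes that the pillar is refreshed with saltutil.refresh_pillar and that the I@opencontrail:control target used later in this procedure applies. The function name show_zookeeper_restore_pillar and the TARGET variable are hypothetical.

```shell
# Hypothetical helper: refresh pillar data and print the ZooKeeper backup
# client settings as seen by the OpenContrail control nodes.
# Assumption: the compound target from the manual restore steps applies here.
TARGET="${TARGET:-I@opencontrail:control}"

show_zookeeper_restore_pillar() {
    # Ask the minions to re-read pillar data from the Salt Master.
    salt -C "$TARGET" saltutil.refresh_pillar || return 1
    # The output should list enabled, restore_latest, and restore_from.
    salt -C "$TARGET" pillar.get zookeeper:backup:client
}
```

Run show_zookeeper_restore_pillar on the Salt Master node and compare the printed values with the lines added to control.yml.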

  3. Proceed either with automatic restore steps using the Jenkins web UI pipeline or with manual restore steps:

    • Automatic restore steps:

      1. Add the restore pipeline to DriveTrain:

        1. Add the following lines to cluster/cicd/control/leader.yml:

          classes:
          - system.jenkins.client.job.deploy.update.restore_zookeeper
          
        2. Run the jenkins.client state:

          salt -C 'I@jenkins:client' state.sls jenkins.client

      2. Log in to the Jenkins web UI.

      3. Open the zookeeper - restore pipeline.

      4. Specify the following parameters:

        Parameter                  Description and values
        SALT_MASTER_CREDENTIALS    The Salt Master credentials to use for connection. Defaults to salt.
        SALT_MASTER_URL            The Salt Master node host URL with the salt-api port. Defaults to the jenkins_salt_api_url parameter. For example, http://172.18.170.27:6969.

      5. Click Deploy.

    • Manual restore steps:

      1. Stop the supervisor-config service on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' service.stop supervisor-config
        
      2. Stop the supervisor-control service on the OpenContrail control nodes:

        salt -C 'I@opencontrail:control' service.stop supervisor-control
        
      3. Stop the zookeeper service on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' service.stop zookeeper
        
      4. Remove the ZooKeeper files on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'rm -rf /var/lib/zookeeper/version-2/*'
        
      5. Run the zookeeper state.

        Mirantis recommends running the following command in GNU Screen or a similar terminal multiplexer:

        salt -C 'I@opencontrail:control' cmd.run "su root -c 'salt-call state.sls zookeeper'"
        

        This state restores the database and creates the /var/backups/zookeeper/dbrestored file.

        Caution

        Rerunning the state does not restore the database again. To repeat the restore procedure, first delete the /var/backups/zookeeper/dbrestored file and then rerun the zookeeper state.
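
The repeat procedure described in the caution above can be sketched as a small shell function. This is an illustration under assumptions rather than a documented command sequence. It assumes the zookeeper service is still stopped and the data directory is still cleared, as in steps 3 and 4. The function name rerun_zookeeper_restore and the TARGET and RESTORE_FLAG variables are hypothetical.

```shell
# Hypothetical helper: repeat the ZooKeeper restore on the control nodes.
# Assumption: steps 3 and 4 (service stopped, data files removed) still hold.
TARGET="${TARGET:-I@opencontrail:control}"
RESTORE_FLAG="/var/backups/zookeeper/dbrestored"

rerun_zookeeper_restore() {
    # Delete the marker file; -f avoids an error on nodes where it is absent.
    salt -C "$TARGET" cmd.run "rm -f $RESTORE_FLAG" || return 1
    # Rerun the zookeeper state with the same command as in this step.
    salt -C "$TARGET" cmd.run "su root -c 'salt-call state.sls zookeeper'"
}
```

Call rerun_zookeeper_restore from the Salt Master node, preferably inside the same Screen session used for the first run.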

      6. Start the zookeeper service on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' service.start zookeeper
        
      7. Start the supervisor-config service on the OpenContrail control nodes:

        salt -C 'I@opencontrail:control' service.start supervisor-config

      8. Start the supervisor-control service on the OpenContrail control nodes:

        salt -C 'I@opencontrail:control' service.start supervisor-control
        
      9. Start the zookeeper service on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' service.start zookeeper
        
      10. Verify that the ZooKeeper files are present again on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'ls /var/lib/zookeeper/version-2'
        
      11. Wait 60 seconds and verify that OpenContrail is in the correct state on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'contrail-status'
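
The last two verification steps can be combined into one helper, sketched below as an illustration only. The function name verify_restore and the TARGET and WAIT_SECONDS variables are hypothetical. The 60-second default mirrors the wait stated in the final step.

```shell
# Hypothetical helper: check the restored files, wait, then show service status.
TARGET="${TARGET:-I@opencontrail:control}"
WAIT_SECONDS="${WAIT_SECONDS:-60}"

verify_restore() {
    # List the ZooKeeper data files recreated by the restore.
    salt -C "$TARGET" cmd.run 'ls /var/lib/zookeeper/version-2' || return 1
    # Give the OpenContrail services time to settle before checking them.
    sleep "$WAIT_SECONDS"
    salt -C "$TARGET" cmd.run 'contrail-status'
}
```

The helper only prints the command output. Reviewing the file listing and the contrail-status report remains a manual check.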