OpenContrail 4.x: Restore a ZooKeeper database

You may need to restore a ZooKeeper database after a hardware or software failure.

To restore a ZooKeeper database:

  1. Log in to the Salt Master node.

  2. Open your project Git repository with the Reclass model on the cluster level.

  3. Add the following snippet to cluster/<cluster_name>/infra/backup/client_zookeeper.yml:

    zookeeper:
      backup:
        client:
          enabled: true
          restore_latest: 1
          restore_from: remote
    

    where:

    • restore_latest selects which backup to restore, counting back from the most recent one. For example:
      • 1 restores the database from the latest complete backup.
      • 2 restores the database from the second latest complete backup.
    • restore_from can be set to local or remote. The remote value uses scp to fetch the backup files from the ZooKeeper server.
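To illustrate what restore_latest means, the following minimal sketch picks the Nth most recent entry from a directory of backups. It is not the logic of the zookeeper formula, which this document does not show. The pick_backup helper and the assumption that backup directory names sort chronologically (for example, timestamps) are hypothetical.

```shell
#!/bin/sh
# Illustration only: select the Nth most recent backup, mirroring the meaning
# of restore_latest (1 = latest, 2 = second latest).
# Assumption (hypothetical): backup directory names sort chronologically.

# pick_backup BACKUP_DIR RESTORE_LATEST: print the name of the selected backup.
pick_backup() {
    backup_dir="$1"
    restore_latest="$2"
    # Newest first, then print only line number RESTORE_LATEST.
    ls -1 "$backup_dir" | sort -r | sed -n "${restore_latest}p"
}
```

With backups named 201801010000, 201801020000, and 201801030000, a value of 1 selects 201801030000 and a value of 2 selects 201801020000.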
  4. Proceed either with automatic restore steps using the Jenkins web UI pipeline or with manual restore steps:

    • Automatic restore steps:

      1. Verify that the following class is present in cluster/cicd/control/leader.yml:

        classes:
        - system.jenkins.client.job.deploy.update.restore_zookeeper
        

        If you manually add this class, apply the changes:

        salt -C 'I@jenkins:client' state.sls jenkins.client
        
      2. Log in to the Jenkins web UI.

      3. Open the zookeeper - restore pipeline.

      4. Specify the following parameters:

        Parameter                  Description and values
        SALT_MASTER_CREDENTIALS    The Salt Master credentials to use for connection. Defaults to salt.
        SALT_MASTER_URL            The Salt Master node host URL with the salt-api port. Defaults to the jenkins_salt_api_url parameter. For example, http://172.18.170.27:6969.
      5. Click Deploy.

    • Manual restore steps:

      1. Stop the config services on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-api'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-schema'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-svc-monitor'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-device-manager'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-config-nodemgr'
        
      2. Stop the control services on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-control'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-named'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-dns'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl stop contrail-control-nodemgr'
        
      3. Stop the zookeeper service on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller service zookeeper stop'
        
      4. Remove the ZooKeeper files from the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'rm -rf /var/lib/config_zookeeper_data/version-2/*'
        
      5. Run the zookeeper state.

        Mirantis recommends running the following command in GNU Screen or a similar terminal multiplexer.

        salt -C 'I@opencontrail:control' state.apply zookeeper.backup
        

        This state restores the database and creates the /var/backups/zookeeper/dbrestored file.

        Caution

        If you rerun the state, it will not restore the database again. To repeat the restore procedure, first delete the /var/backups/zookeeper/dbrestored file and then rerun the zookeeper state.
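Repeating the restore can be sketched as two commands: remove the marker file, then reapply the state. This is a minimal sketch rather than a documented procedure. It assumes that the dbrestored file lives on the host file system of the nodes where the state runs. The run and rerun_restore helpers and the DRY_RUN variable are hypothetical names. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: repeat a ZooKeeper restore by clearing the marker file first.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
TARGET="I@opencontrail:control"
# Assumption: the marker file is on the host, not inside the controller container.
MARKER="/var/backups/zookeeper/dbrestored"

# run COMMAND...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rerun_restore() {
    # Without this removal, the state skips the restore.
    run salt -C "$TARGET" cmd.run "rm -f $MARKER"
    run salt -C "$TARGET" state.apply zookeeper.backup
}
```

Call rerun_restore to review the two commands, then run it again with DRY_RUN=0 once they look correct for your environment.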

      6. Verify that the ZooKeeper files are present again on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller ls /var/lib/zookeeper/version-2'
        
      7. Start the zookeeper service on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller service zookeeper start'
        
      8. Start the config services on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-api'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-schema'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-svc-monitor'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-device-manager'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-config-nodemgr'
        
      9. Start the control services on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-control'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-named'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-dns'
        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller systemctl start contrail-control-nodemgr'
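The stop and start commands for the config and control services differ only in the action and the service name, so they can be generated in a loop. The following is a minimal sketch, not part of the documented procedure. The service_commands helper and the variable names are hypothetical. It only prints the commands, in the same order as the steps above, so that you can review them before executing them.

```shell
#!/bin/sh
# Sketch: generate the stop/start commands for the config and control services.
# Prints commands only; nothing is executed against the cluster.
TARGET='I@opencontrail:control'
CONFIG_SERVICES="contrail-api contrail-schema contrail-svc-monitor contrail-device-manager contrail-config-nodemgr"
CONTROL_SERVICES="contrail-control contrail-named contrail-dns contrail-control-nodemgr"

# service_commands ACTION: print one salt command per service for ACTION (stop or start).
service_commands() {
    action="$1"
    # Config services first, then control services, as in the manual steps.
    for svc in $CONFIG_SERVICES $CONTROL_SERVICES; do
        printf "salt -C '%s' cmd.run 'doctrail controller systemctl %s %s'\n" \
            "$TARGET" "$action" "$svc"
    done
}
```

For example, service_commands stop prints the nine stop commands, and service_commands start prints the matching start commands.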
        
      10. Wait 60 seconds and verify that OpenContrail is in the correct state on the OpenContrail controller nodes:

        salt -C 'I@opencontrail:control' cmd.run 'doctrail controller contrail-status'
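If the services need longer than 60 seconds to settle, the status check can be repeated at intervals. The following generic retry helper is a minimal sketch under that assumption. The retry function is a hypothetical name, and the check command you pass to it is your own choice.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds or the
# attempts are used up. Returns 0 on the first success, 1 otherwise.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt follows.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

For example, retry 5 60 salt -C 'I@opencontrail:control' cmd.run 'doctrail controller contrail-status' repeats the check up to five times, 60 seconds apart. The exit code of the salt command alone does not prove that every service is healthy, so still read the contrail-status output.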