Workflows of the OpenStack database backup and restoration¶
This section provides technical details about the internal implementation of the automated backup and restoration routines built into MOSK. The information below helps you troubleshoot issues related to these routines and understand the impact they have on a running cloud.
Backup workflow¶
The OpenStack database backup workflow consists of the following phases.
Backup phase 1¶
The mariadb-phy-backup job launches the mariadb-phy-backup-<TIMESTAMP> pod. This pod contains the main backup script, which is responsible for:
- Performing basic sanity checks and choosing the right node for the backup
- Verifying the wsrep status and changing the wsrep_desync parameter settings
- Managing the mariadb-phy-backup-runner pod
During the first backup phase, the following actions take place:
1. Sanity check: verification of the Kubernetes status and wsrep status of each MariaDB pod. If some pods have wrong statuses, the backup job fails unless the --allow-unsafe-backup parameter is passed to the main script in the Kubernetes backup job.
   Note
   Since MOSK 22.4, the --allow-unsafe-backup functionality is removed from the product for security and backup procedure simplification purposes.
   Mirantis does not recommend setting the --allow-unsafe-backup parameter unless it is absolutely required. To ensure the consistency of a backup, verify that the MariaDB Galera cluster is in a working state before you proceed with the backup.
2. Select the replica to back up. The system selects the replica with the highest number in its name as the target replica. For example, if the MariaDB server pods have the mariadb-server-0, mariadb-server-1, and mariadb-server-2 names, the mariadb-server-2 replica will be backed up.
3. Desynchronize the replica from the Galera cluster. The script connects to the target replica and sets the wsrep_desync variable to ON. Then, the replica stops receiving write-sets and receives the wsrep status Donor/Desynced. The Kubernetes health check of that mariadb-server pod fails and the Kubernetes status of that pod becomes Not ready. If the pod has the primary label, the MariaDB Controller sets the backup label on it and the pod is removed from the endpoints list of the MariaDB service.
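The sanity check and the replica selection rule described above can be sketched as follows. This is an illustration only, not the MOSK backup script: the function names and the input shape (a mapping of pod name to a Kubernetes readiness flag and a wsrep state string) are assumptions made for the example. The two SQL strings show the standard MariaDB Galera statements that toggle wsrep_desync.

```python
import re

# Hypothetical input shape: pod name -> (Kubernetes Ready flag, wsrep state).
PodStatus = dict[str, tuple[bool, str]]


def sanity_check(pods: PodStatus) -> list[str]:
    """Return the names of pods that are not Ready or not Synced."""
    return sorted(
        name
        for name, (ready, wsrep_state) in pods.items()
        if not ready or wsrep_state != "Synced"
    )


def select_target_replica(pod_names: list[str]) -> str:
    """Pick the replica with the highest number at the end of its name."""

    def ordinal(name: str) -> int:
        match = re.search(r"-(\d+)$", name)
        if match is None:
            raise ValueError(f"no ordinal suffix in pod name: {name}")
        return int(match.group(1))

    if not pod_names:
        raise ValueError("no MariaDB pods to choose from")
    # Compare ordinals as integers so that -10 ranks above -2.
    return max(pod_names, key=ordinal)


# Statements that correspond to the desynchronization step and its reversal.
DESYNC_ON = "SET GLOBAL wsrep_desync = ON"
DESYNC_OFF = "SET GLOBAL wsrep_desync = OFF"
```

With the pod names from the example above, select_target_replica returns mariadb-server-2. Whether the real script compares ordinals numerically or by name is not stated in this section. The sketch compares them numerically.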
Backup phase 2¶
1. The main script in the mariadb-phy-backup pod launches the Kubernetes pod mariadb-phy-backup-runner-<TIMESTAMP> on the same node where the target mariadb-server replica is running, which is node X in the example.
2. The mariadb-phy-backup-runner pod has both the mysql data directory and the backup directory mounted. The pod performs the following actions:
   1. Verifies that there is enough space in the /var/backup folder to perform the backup. The amount of available space in the folder should be greater than <DB-SIZE> * <MARIADB-BACKUP-REQUIRED-SPACE-RATIO> in KB.
   2. Performs the actual backup using the mariabackup tool.
   3. If the number of current backups is greater than the value of the MARIADB_BACKUPS_TO_KEEP job parameter, the script removes all old backups exceeding the allowed number of backups.
   4. Exits with the 0 code.
3. The script waits until the mariadb-phy-backup-runner pod is completed and collects its logs.
4. The script puts the backed up replica back in sync with the Galera cluster by setting wsrep_desync to OFF and waits for the replica to become Ready in Kubernetes.
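The free-space check and the backup rotation performed by the runner pod can be sketched as follows. This is a minimal illustration, not the runner script. The function names are hypothetical. The sketch also assumes that backup names sort chronologically, for example, because they carry a timestamp.

```python
def has_enough_space(
    available_kb: int, db_size_kb: int, required_space_ratio: float
) -> bool:
    """Check that free space in the backup folder exceeds
    <DB-SIZE> * <MARIADB-BACKUP-REQUIRED-SPACE-RATIO>, all values in KB."""
    return available_kb > db_size_kb * required_space_ratio


def backups_to_remove(backup_names: list[str], backups_to_keep: int) -> list[str]:
    """Return the oldest backups that exceed the MARIADB_BACKUPS_TO_KEEP limit.

    Assumption: names sort chronologically (for example, timestamped names).
    """
    if backups_to_keep < 0:
        raise ValueError("backups_to_keep must not be negative")
    ordered = sorted(backup_names)
    excess = len(ordered) - backups_to_keep
    return ordered[:excess] if excess > 0 else []
```

For example, with a 2,000,000 KB database and a ratio of 1.2, the check passes only when more than 2,400,000 KB is available in /var/backup.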
Restoration workflow¶
The OpenStack database restoration workflow consists of the following phases.
Restoration phase 1¶
The mariadb-phy-restore job launches the mariadb-phy-restore pod. This pod contains the main restore script, which is responsible for:
- Scaling the mariadb-server StatefulSet
- Verifying the statuses of the mariadb-server pods
- Managing the openstack-mariadb-phy-restore-runner pods
Caution
During the restoration, the database is not available for OpenStack services, which means a complete outage of all OpenStack services.
During the first phase, the following actions are performed:
1. Save the list of mariadb-server persistent volume claims (PVC).
2. Scale the mariadb-server StatefulSet to 0 replicas. At this point, the database becomes unavailable for OpenStack services.
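The two actions of this phase can be sketched as follows. This is an illustration under stated assumptions, not the MOSK restore script: the section does not say how the script talks to Kubernetes, so the sketch only builds an equivalent kubectl command line. The openstack namespace and the mysql-data claim template name used in the usage example are hypothetical. The PVC filter relies on the standard Kubernetes naming of StatefulSet claims, which is the claim template name, the StatefulSet name, and the replica ordinal joined by hyphens.

```python
import re


def server_pvcs(all_pvc_names: list[str], statefulset: str = "mariadb-server") -> list[str]:
    """Select the PVCs that belong to the StatefulSet replicas, ordered by ordinal.

    StatefulSet PVCs are named <claim-template>-<statefulset>-<ordinal>.
    """
    pattern = re.compile(rf"^.+-{re.escape(statefulset)}-(\d+)$")
    matched = [
        (int(m.group(1)), name)
        for name in all_pvc_names
        if (m := pattern.match(name))
    ]
    return [name for _, name in sorted(matched)]


def scale_command(namespace: str, statefulset: str, replicas: int) -> list[str]:
    """Build the kubectl argv that scales a StatefulSet to the given size."""
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return [
        "kubectl", "--namespace", namespace,
        "scale", "statefulset", statefulset, f"--replicas={replicas}",
    ]


# Usage with hypothetical names:
saved_pvcs = server_pvcs(
    ["mysql-data-mariadb-server-1", "backup-data", "mysql-data-mariadb-server-0"]
)
scale_down = scale_command("openstack", "mariadb-server", 0)
```

Saving the PVC list before scaling down matters because the later phases iterate over the saved list while no mariadb-server pods exist.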
Restoration phase 2¶
1. The mariadb-phy-restore pod launches openstack-mariadb-phy-restore-runner with the first mariadb-server replica PVC mounted to the /var/lib/mysql folder and the backup PVC mounted to /var/backup. The openstack-mariadb-phy-restore-runner pod performs the following actions:
   1. Unarchives the database backup files to a temporary directory within /var/backup.
   2. Executes mariabackup --prepare on the unarchived data.
   3. Creates the .prepared file in the temporary directory in /var/backup.
   4. Restores the backup to /var/lib/mysql.
   5. Exits with 0.
2. The script in the mariadb-phy-restore pod collects the logs from the openstack-mariadb-phy-restore-runner pod and removes the pod. Then, the script launches the next openstack-mariadb-phy-restore-runner pod for the next mariadb-server replica PVC. The openstack-mariadb-phy-restore-runner pod restores the backup to /var/lib/mysql and exits with 0.
   Step 2 is repeated for every mariadb-server replica PVC sequentially.
3. When the last replica's data is restored, the last openstack-mariadb-phy-restore-runner pod removes the .prepared file and the temporary folder with unarchived data from /var/backup.
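The per-replica behavior of the runner pods can be summarized with a small planning function. This is a sketch of the sequence described above, not the runner script. The function name and the step strings are hypothetical. The reading that the .prepared file lets later runners skip the unarchive and prepare steps is an assumption drawn from the description.

```python
def runner_steps(replica_index: int, replica_count: int) -> list[str]:
    """List the actions of the restore runner for one replica PVC.

    replica_index is 0-based and follows the order of the saved PVC list.
    """
    if not 0 <= replica_index < replica_count:
        raise ValueError("replica_index out of range")

    steps: list[str] = []
    if replica_index == 0:
        # Only the first runner unpacks and prepares the backup. Assumption:
        # the .prepared file marks the data as ready for the later runners.
        steps += [
            "unarchive backup to a temporary directory in /var/backup",
            "mariabackup --prepare",
            "create .prepared file",
        ]
    steps.append("restore backup to /var/lib/mysql")
    if replica_index == replica_count - 1:
        # The last runner cleans up the shared backup volume.
        steps.append("remove .prepared file and temporary directory")
    steps.append("exit 0")
    return steps
```

For a three-replica cluster, the runner for the second PVC only restores the data and exits, while the third one also cleans up /var/backup.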
Restoration phase 3¶
1. The mariadb-phy-restore pod scales the mariadb-server StatefulSet back to the configured number of replicas.
2. The mariadb-phy-restore pod waits until all mariadb-server replicas are ready.
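The wait in the last step is a typical polling loop, which can be sketched as follows. This is a generic illustration, not the MOSK restore script: the function name, the timeout, and the polling interval are hypothetical, and the ready_replicas callable stands in for whatever query the script uses to read the StatefulSet status.

```python
import time
from typing import Callable


def wait_until_ready(
    ready_replicas: Callable[[], int],
    expected: int,
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the number of ready replicas reaches the expected count.

    Returns True on success and False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if ready_replicas() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The sleep parameter is injectable so that the loop can be exercised in tests without real delays.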