MariaDB Pods fail to start after a non-graceful shutdown

After a non-graceful shutdown, such as an unexpected power loss or a forced node reset, a mariadb-server Pod may enter a restart loop with the following errors in the Pod logs:

[ERROR] Recovery failed! You must enable all engines that were enabled at the moment of the crash
[ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete tc log and start server with --tc-heuristic-recover={commit|rollback}
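
To tell this issue apart from other restart loops, the Pod logs can be checked for both error lines. The following is a minimal sketch, assuming the logs have already been saved as text; the function name and the matched substrings are illustrative choices taken from the messages above, not part of any MariaDB tooling.

```python
# Substrings copied from the two error lines shown above.
RECOVERY_FAILED = "Recovery failed! You must enable all engines"
CRASH_RECOVERY_FAILED = "Crash recovery failed."


def has_tc_log_recovery_failure(log_text: str) -> bool:
    """Return True if both error messages appear on [ERROR] lines.

    Hypothetical helper: it only inspects text that the caller has
    already collected from the Pod logs.
    """
    error_lines = [line for line in log_text.splitlines() if "[ERROR]" in line]
    found_recovery = any(RECOVERY_FAILED in line for line in error_lines)
    found_crash = any(CRASH_RECOVERY_FAILED in line for line in error_lines)
    return found_recovery and found_crash


if __name__ == "__main__":
    sample = (
        "[ERROR] Recovery failed! You must enable all engines that were "
        "enabled at the moment of the crash\n"
        "[ERROR] Crash recovery failed. Either correct the problem and restart\n"
    )
    print(has_tc_log_recovery_failure(sample))
```

Requiring both lines reduces the chance of matching an unrelated recovery failure that happens to share one of the messages.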

The error occurs because, during crash recovery, MariaDB attempts to replay the transaction coordinator log (tc.log). The wsrep provider that was registered in the log at the time of the crash is temporarily disabled during the recovery phase, so the replay cannot complete.

To resolve the issue:

  1. Create a backup of the /var/lib/mysql directory on the affected mariadb-server Pod.

  2. Verify that the other mariadb-server replicas are up and ready.

  3. Remove the /var/lib/mysql/tc.log file for the affected mariadb-server Pod.

  4. Delete the affected mariadb-server Pod or wait until Kubernetes restarts it automatically.

After Kubernetes restarts the Pod, the Pod rejoins the cluster.
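
The procedure above can be sketched as a sequence of kubectl commands. The following Python snippet only builds and prints the commands, it does not run them. The namespace, Pod name, and backup directory are hypothetical placeholders. The sketch also assumes that the container stays up long enough for kubectl cp and kubectl exec to succeed and that tar is available in the container, which kubectl cp requires. Neither assumption is confirmed by this document.

```python
import shlex

# Data directory named in the procedure above.
DATA_DIR = "/var/lib/mysql"


def recovery_commands(namespace: str, pod: str, backup_dir: str) -> list[list[str]]:
    """Build one kubectl command per step of the procedure.

    All arguments are placeholders supplied by the operator.
    """
    return [
        # Step 1: copy the data directory out of the affected Pod.
        ["kubectl", "-n", namespace, "cp", f"{pod}:{DATA_DIR}", backup_dir],
        # Step 2: list Pods so the operator can confirm that the other
        # replicas are Running and ready.
        ["kubectl", "-n", namespace, "get", "pods"],
        # Step 3: remove the transaction coordinator log.
        ["kubectl", "-n", namespace, "exec", pod, "--", "rm", f"{DATA_DIR}/tc.log"],
        # Step 4: delete the Pod so that Kubernetes recreates it.
        ["kubectl", "-n", namespace, "delete", "pod", pod],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in recovery_commands("openstack", "mariadb-server-0", "./mysql-backup"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them lets the operator review each one, and confirm the result of step 2, before removing tc.log.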