etcd alarms response

Available since MKE 3.6.10

etcd issues alarms to indicate problems that must be addressed quickly to keep the cluster functioning without interruption.

NOSPACE alarm

etcd issues a NOSPACE alarm when it runs low on storage space, to protect the cluster from further writes. Once etcd reaches this state, it rejects all write requests with the mvcc: database space exceeded error message until the issue is resolved.

When MKE detects the NOSPACE alarm condition, it displays a critical banner to inform administrators. In addition, MKE restarts etcd with an increased value for the etcd datastore quota, thus allowing administrators to resolve the NOSPACE alarm without interference.
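To gauge how close an etcd member is to its quota, you can compare the database size reported by etcdctl endpoint status --write-out=json with the configured quota. The following Python sketch is illustrative only and is not part of MKE. The function name db_usage, the sample payload, and the 2 GiB quota are assumptions, and the dbSize and dbSizeInUse field names should be verified against the output of your etcd version.

```python
import json


def db_usage(status_json, quota_bytes):
    """Summarize quota usage per endpoint from `etcdctl endpoint status -w json` output."""
    summary = []
    for entry in json.loads(status_json):
        status = entry["Status"]
        size = int(status["dbSize"])
        # dbSizeInUse may be absent on older etcd versions; fall back to dbSize.
        in_use = int(status.get("dbSizeInUse", size))
        summary.append({
            "endpoint": entry["Endpoint"],
            "used_fraction": size / quota_bytes,
            # Space that is allocated on disk but no longer holds live data.
            "reclaimable_bytes": size - in_use,
        })
    return summary


# Hypothetical sample payload, trimmed to the fields used above.
sample = (
    '[{"Endpoint": "127.0.0.1:2379", '
    '"Status": {"dbSize": 1610612736, "dbSizeInUse": 805306368}}]'
)
quota = 2 * 1024**3  # assumed quota of 2 GiB; use the value configured for your cluster

for row in db_usage(sample, quota):
    print(f'{row["endpoint"]}: {row["used_fraction"]:.0%} of quota, '
          f'{row["reclaimable_bytes"]} bytes potentially reclaimable')
```

A large gap between dbSize and dbSizeInUse suggests that defragmentation could return a significant amount of space.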

To resolve the NOSPACE alarm:

  1. Identify which data occupies most of the storage space. In MKE, you must run the recommended etcdctl commands in the ucp-kv container. For instructions, refer to Troubleshoot the etcd key-value store with the CLI.

    If a buggy application is causing the unexpected use of storage space, stop that application.

  2. Manually delete the unused data from etcd, if possible.

  3. Apply etcd defragmentation.

  4. If necessary, increase the etcd_storage_quota setting in the cluster_config table of the MKE configuration file.
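For step 1, one rough way to see where the data is concentrated is to list all keys, for example with etcdctl get "" --prefix --keys-only, and count them by prefix. The Python sketch below assumes that the key listing has already been captured as lines of text. The function name count_by_prefix and the sample keys are hypothetical, and key count is only a proxy for storage use because value sizes vary.

```python
from collections import Counter


def count_by_prefix(keys, depth=2):
    """Count keys grouped by their first `depth` path segments, largest group first."""
    counts = Counter()
    for key in keys:
        key = key.strip()
        if not key:
            continue  # skip blank lines in the key listing
        parts = key.strip("/").split("/")
        counts["/" + "/".join(parts[:depth])] += 1
    return counts.most_common()


# Hypothetical key listing; real key names depend on what the cluster stores.
sample_keys = [
    "/registry/events/default/a",
    "",
    "/registry/events/default/b",
    "",
    "/registry/pods/default/x",
]

for prefix, count in count_by_prefix(sample_keys):
    print(f"{count:6d}  {prefix}")
```

A prefix with an unexpectedly high key count is a reasonable starting point when you look for the application that is filling the datastore.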

Note

Contact Mirantis Support if you require assistance in resolving the etcd NOSPACE alarm.

CORRUPT alarm

etcd issues the CORRUPT alarm when it detects cluster corruption. MKE informs cluster administrators of the condition by way of a critical banner. To resolve the issue, contact Mirantis Support and refer to the official etcd documentation on data corruption recovery.