Etcd

This section describes the alerts for the etcd service.


etcdDbSizeCritical

Available since 2.21.0, and since 2.21.1 for MOSK 22.5

Severity

Critical

Summary

Etcd database passed 95% of quota.

Description

The {{ $labels.job }} etcd database reached {{ $value }} % of defined quota on the {{ $labels.node }} node.

etcdDbSizeMajor

Available since 2.21.0, and since 2.21.1 for MOSK 22.5

Severity

Major

Summary

Etcd database passed 85% of quota.

Description

The {{ $labels.job }} etcd database reached {{ $value }} % of defined quota on the {{ $labels.node }} node.
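
The two database size alerts differ only in the share of the quota that triggers them. The following Python sketch illustrates that relationship. It is not the alert expression itself: the function name and its byte-valued inputs are hypothetical, and only the 95% and 85% thresholds come from the summaries above.

```python
from typing import Optional

CRITICAL_PERCENT = 95.0  # threshold from the etcdDbSizeCritical summary
MAJOR_PERCENT = 85.0     # threshold from the etcdDbSizeMajor summary


def db_size_severity(db_size_bytes: int, quota_bytes: int) -> Optional[str]:
    """Return the severity that matches the quota usage, or None.

    Hypothetical helper: both arguments are assumed to be the current
    database size and the configured quota of one etcd node, in bytes.
    """
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    used_percent = 100.0 * db_size_bytes / quota_bytes
    # "Passed" is read here as strictly greater than the threshold.
    if used_percent > CRITICAL_PERCENT:
        return "critical"
    if used_percent > MAJOR_PERCENT:
        return "major"
    return None
```

For example, a database that uses 90 of 100 quota units maps to the major severity, while one that uses 96 maps to critical.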

etcdInsufficientMembers

Severity

Critical

Summary

Etcd cluster has insufficient members.

Description

The {{ $labels.job }} etcd cluster has {{ $value }} insufficient members.

etcdNoLeader

Severity

Critical

Summary

Etcd cluster has no leader.

Description

The {{ $labels.node }} member of the {{ $labels.job }} etcd cluster has no leader.

etcdHighNumberOfLeaderChanges

Severity

Warning

Summary

Etcd cluster has detected more than 3 leader changes within the last hour.

Description

The {{ $labels.node }} node of the {{ $labels.job }} etcd cluster has {{ $value }} leader changes within the last hour.

etcdHighNumberOfFailedProposals

Severity

Warning

Summary

Etcd cluster has more than 5 proposal failures.

Description

The {{ $labels.job }} etcd cluster has {{ $value }} proposal failures on the {{ $labels.node }} etcd node within the last hour.
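
Both etcdHighNumberOfLeaderChanges and etcdHighNumberOfFailedProposals compare a count accumulated over the last hour with a fixed limit (3 and 5, respectively). The Python sketch below shows one way to reason about such a check, assuming the count comes from a cumulative counter sampled over time. The function names and the sample format are hypothetical. The sketch ignores counter resets and the extrapolation that Prometheus applies, so it only approximates a real rule.

```python
from typing import Sequence, Tuple

WINDOW_SECONDS = 3600  # "within the last hour"

# Each sample is (unix_timestamp, cumulative_counter_value), sorted by time.
Sample = Tuple[float, float]


def increase_in_window(samples: Sequence[Sample], now: float,
                       window: float = WINDOW_SECONDS) -> float:
    """Return how much the counter grew between the first and the last
    sample that fall inside [now - window, now]."""
    in_window = [value for ts, value in samples if now - window <= ts <= now]
    if len(in_window) < 2:
        return 0.0  # not enough data to compute an increase
    return in_window[-1] - in_window[0]


def exceeds_threshold(samples: Sequence[Sample], now: float,
                      threshold: float) -> bool:
    """True if the hourly increase is strictly above the threshold."""
    return increase_in_window(samples, now) > threshold
```

With a threshold of 3, a counter that grows from 10 to 15 within the hour exceeds the limit. With a threshold of 5, the same growth does not.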

etcdTargetsOutage

Severity

Critical

Summary

Etcd cluster Prometheus targets outage.

Description

Prometheus fails to scrape metrics from 2/3 of etcd nodes (more than 1/10 failed scrapes).
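
The description combines two ratios: the share of failed scrapes per node and the share of nodes affected. The Python sketch below shows one possible reading of that condition. It assumes that a node counts as failing when more than 1/10 of its scrapes failed, and that the outage applies when at least 2/3 of the nodes are failing. The function names and the input format are hypothetical, and the actual rule may evaluate the condition differently.

```python
from typing import Dict, Sequence


def node_is_failing(up_samples: Sequence[int]) -> bool:
    """up_samples holds 1 for a successful scrape and 0 for a failed one.

    Assumption: a node without any samples is treated as failing.
    """
    if not up_samples:
        return True
    failed = sum(1 for sample in up_samples if sample == 0)
    # Integer form of failed / total > 1/10, which avoids float rounding.
    return failed * 10 > len(up_samples)


def targets_outage(scrapes_by_node: Dict[str, Sequence[int]]) -> bool:
    """True if at least 2/3 of the nodes are failing."""
    if not scrapes_by_node:
        return False
    failing = sum(1 for samples in scrapes_by_node.values()
                  if node_is_failing(samples))
    # Integer form of failing / total >= 2/3.
    return failing * 3 >= 2 * len(scrapes_by_node)
```

In a three-node cluster, this reading requires two nodes with more than 10% failed scrapes before the condition holds.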