General alerts

This section lists the general alerts that are available.


NodeDown

Severity

Critical

Summary

{{ $labels.node }} node is down.

Description

The {{ $labels.node }} node is down. For the last 2 minutes, Kubernetes has treated the node as Not Ready or Unknown, and kubelet has not been accessible from Prometheus.

Watchdog

Severity

None

Summary

Watchdog alert that is always firing.

Description

This alert ensures that the entire alerting pipeline is functional. It should always be firing in Alertmanager against a receiver. Some integrations with notification mechanisms, for example, the DeadMansSnitch integration in PagerDuty, can send a notification when this alert stops firing.
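
The "notify when the alert is not firing" behavior described above is a dead man's switch. The following is a minimal Python sketch of that receiver-side logic, not the implementation of any particular integration. The DeadMansSwitch class, its method names, and the 300-second timeout are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadMansSwitch:
    """Tracks Watchdog heartbeats and reports when they stop arriving.

    Hypothetical helper for illustration; not part of any real integration.
    """

    timeout_seconds: float  # assumed tolerance; not specified in this document
    last_seen: Optional[float] = None  # timestamp of the most recent heartbeat

    def heartbeat(self, now: float) -> None:
        """Record that a Watchdog notification arrived at time `now`."""
        self.last_seen = now

    def is_pipeline_broken(self, now: float) -> bool:
        """Return True if no heartbeat arrived within the timeout window.

        A switch that has never received a heartbeat is treated as broken.
        """
        if self.last_seen is None:
            return True
        return now - self.last_seen > self.timeout_seconds


if __name__ == "__main__":
    switch = DeadMansSwitch(timeout_seconds=300)
    switch.heartbeat(now=1000.0)
    print(switch.is_pipeline_broken(now=1200.0))  # False: 200 s since last heartbeat
    print(switch.is_pipeline_broken(now=1400.0))  # True: 400 s since last heartbeat
```

The logic is inverted compared to an ordinary alert: silence is the failure signal, so the receiver raises a notification only when the always-firing Watchdog alert stops arriving.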

StacklightGenericTargetsOutage

Severity

Major

Summary

{{ $labels.service_name }} service targets outage.

Description

Prometheus fails to scrape metrics from all {{ $labels.namespace }}/{{ $labels.service_name }} service endpoint(s).