General alerts

This section lists the available general alerts.


NodeDown

Severity

Critical

Summary

{{ $labels.node }} node is down.

Description

The {{ $labels.node }} node is down. Kubernetes treats the node as NotReady, and kubelet is not accessible from Prometheus.

Watchdog

Severity

None

Summary

Watchdog alert that is always firing.

Description

This alert ensures that the entire alerting pipeline is functional. It should always be firing in Alertmanager against a receiver. Some integrations with notification mechanisms, for example, the DeadMansSnitch integration in PagerDuty, can send a notification when this alert stops firing.
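
The dead man's switch idea behind Watchdog can be sketched in a few lines of Python. This is only an illustration of the principle, not how StackLight, Alertmanager, or PagerDuty implement it: the DeadMansSwitch class, its method names, and the 5-minute max_silence tolerance are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class DeadMansSwitch:
    """Tracks Watchdog heartbeats and reports when they stop arriving.

    Hypothetical helper for illustration; not part of any real integration.
    """

    def __init__(self, max_silence: timedelta) -> None:
        # max_silence is an assumed tolerance, not a documented setting.
        self.max_silence = max_silence
        self.last_seen: Optional[datetime] = None

    def heartbeat(self, received_at: datetime) -> None:
        """Record that a Watchdog notification arrived."""
        self.last_seen = received_at

    def is_pipeline_broken(self, now: datetime) -> bool:
        """Return True when no heartbeat arrived within max_silence."""
        if self.last_seen is None:
            # Nothing received yet, so the pipeline is unproven.
            return True
        return now - self.last_seen > self.max_silence


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
switch = DeadMansSwitch(max_silence=timedelta(minutes=5))
switch.heartbeat(start)

# Two minutes after the last heartbeat the pipeline still counts as healthy.
healthy_check = switch.is_pipeline_broken(start + timedelta(minutes=2))
# Ten minutes of silence exceeds the tolerance and would trigger a notification.
silent_check = switch.is_pipeline_broken(start + timedelta(minutes=10))
```

The logic is inverted compared to a regular alert: the receiver stays quiet while Watchdog notifications keep arriving and raises a notification only when they stop.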

StacklightGenericTargetsOutage

Severity

Major

Summary

{{ $labels.service_name }} service targets outage.

Description

Prometheus fails to scrape metrics from all {{ $labels.namespace }}/{{ $labels.service_name }} service endpoint(s) (more than 1/10 of scrapes failed).
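
The 1/10 threshold can be illustrated with a small Python sketch. It assumes one possible reading of the description: the alert condition holds when every endpoint of the service has more than 1/10 of its recent scrapes failing. The actual alert expression is not shown in this section, so the function names, the sample data, and the evaluation logic below are hypothetical. The only Prometheus behavior relied on is that the `up` metric is 1 for a successful scrape and 0 for a failed one.

```python
from typing import Mapping, Sequence

# The "1/10" from the alert description.
FAILED_SCRAPE_THRESHOLD = 0.1


def failed_scrape_ratio(up_samples: Sequence[int]) -> float:
    """Return the share of failed scrapes, where a sample of 0 means failure."""
    if not up_samples:
        raise ValueError("at least one sample is required")
    failed = sum(1 for sample in up_samples if sample == 0)
    return failed / len(up_samples)


def targets_outage(samples_by_endpoint: Mapping[str, Sequence[int]]) -> bool:
    """Return True when every endpoint exceeds the failed-scrape threshold.

    Assumed interpretation of the description, not the real alert rule.
    """
    if not samples_by_endpoint:
        return False
    return all(
        failed_scrape_ratio(samples) > FAILED_SCRAPE_THRESHOLD
        for samples in samples_by_endpoint.values()
    )


# Hypothetical endpoints with their last ten `up` samples each.
all_failing = {
    "10.0.0.1:9100": [1, 1, 0, 0, 1, 1, 1, 1, 1, 1],  # 2 of 10 failed
    "10.0.0.2:9100": [0] * 10,                        # all failed
}
one_healthy = {
    "10.0.0.1:9100": [1] * 10,                        # no failures
    "10.0.0.2:9100": [0] * 10,
}

outage_all = targets_outage(all_failing)
outage_partial = targets_outage(one_healthy)
```

Under this reading, a single healthy endpoint is enough to keep the alert from firing, because the condition requires all endpoints of the service to be above the threshold.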