The Keepalived process on the {{$labels.host}} node is down.
Raise condition
procstat_running{process_name="keepalived"} == 0
Description
Raises when Keepalived on a particular host does not respond to Telegraf,
typically indicating that Keepalived is down. The host label in the
raised alert contains the host name of the affected node.
Troubleshooting
Verify the Keepalived status on the affected node using
systemctl status keepalived.
Inspect the Keepalived logs on the affected node using
journalctl -u keepalived.
Inspect the Telegraf logs on the affected node using
journalctl -u telegraf.
The Keepalived process on the {{$labels.host}} node is not
responding.
Raise condition
keepalived_up == 0
Description
Raises when Keepalived on a particular host does not respond to
Telegraf, typically indicating that Keepalived is running but is not
responsive on that node. The host label in the raised alert contains
the host name of the affected node.
Troubleshooting
Verify the Keepalived status on the affected node using
service keepalived status.
Inspect the Keepalived logs on the affected node using
journalctl -u keepalived.
Inspect the Telegraf logs on the affected node using
journalctl -u telegraf.
The Keepalived VRRP {{$labels.name}} is in the FAILED state on
the {{$labels.host}} node.
Raise condition
keepalived_state == 0
Description
Raises when the Keepalived Virtual Router Redundancy Protocol (VRRP) is
in the FAILED state on a node, typically indicating network issues.
The host label in the raised alert contains the host name of the
affected node.
Troubleshooting
Inspect the Keepalived logs on the affected node using
journalctl -u keepalived.
Inspect the Telegraf logs on the affected node using
journalctl -u telegraf.
The Keepalived VRRP {{$labels.name}} is in the UNKNOWN state
on the {{$labels.host}} node.
Raise condition
keepalived_state == -1
Description
Raises when the Keepalived Virtual Router Redundancy Protocol (VRRP) is
in the UNKNOWN state on a node, typically indicating that Keepalived
has improperly reported its state or Telegraf cannot gather the state.
The host label in the raised alert contains the host name of the
affected node.
Troubleshooting
Inspect the Keepalived logs on the affected node using
journalctl -u keepalived.
Inspect the Telegraf logs on the affected node using
journalctl -u telegraf.
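The four raise conditions described so far can be read side by side. The following is a minimal sketch, not part of Keepalived or Telegraf: the function name keepalived_conditions is hypothetical, and it assumes you pass in one sample value of each metric by hand. It only shows how the conditions map to the alert meanings given above.

```shell
#!/bin/sh
# Hypothetical helper (not a Keepalived or Telegraf tool): given one sample
# value for each metric, print every raise condition above that holds.
# Usage: keepalived_conditions <procstat_running> <keepalived_up> <keepalived_state>
keepalived_conditions() {
    running="$1"
    up="$2"
    state="$3"

    if [ "$running" -eq 0 ]; then
        echo "process down (procstat_running == 0)"
    fi
    if [ "$up" -eq 0 ]; then
        echo "process not responding (keepalived_up == 0)"
    fi
    if [ "$state" -eq 0 ]; then
        echo "VRRP FAILED (keepalived_state == 0)"
    fi
    if [ "$state" -eq -1 ]; then
        echo "VRRP UNKNOWN (keepalived_state == -1)"
    fi
    # Always succeed, even when no condition holds.
    return 0
}
```

For example, keepalived_conditions 1 1 -1 prints the VRRP UNKNOWN line only, and the function prints nothing when none of the conditions hold.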
The Keepalived {{$labels.ip}} virtual IP is assigned more than
once.
Raise condition
count(ipcheck_assigned) by (ip) > 1
Description
Raises when the virtual IP address (VIP) of Keepalived is assigned more
than once (on more than one node within a cluster).
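The grouping in this raise condition can be illustrated outside the monitoring system. The sketch below is hypothetical: it assumes one input line per ipcheck_assigned series in the form "<host> <ip>" (a format invented here for illustration) and prints every IP that appears more than once, which is the same grouping the condition applies.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of "<host> <ip>" on stdin, one line per
# ipcheck_assigned series (input format assumed for illustration), and prints
# each IP that is reported more than once, mirroring count(...) by (ip) > 1.
duplicated_vips() {
    awk '{ seen[$2]++ } END { for (ip in seen) if (seen[ip] > 1) print ip }' | sort
}
```

Feeding it three lines where two hosts report the same address prints only that address.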
Troubleshooting
On each node of the Keepalived cluster (the ctl nodes by default), verify
whether the VIP is assigned to two or more nodes or interfaces using the
ip a | grep VIP_address command.
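A plain grep for an address can also match longer addresses that begin with the same digits (10.0.0.1 matches 10.0.0.10). The sketch below is one possible way to tighten the check, assuming an IPv4 VIP and a grep that supports -w (GNU and BSD grep do). The function name count_vip is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip a` output on stdin and prints how many
# "inet <address>" lines carry exactly the given IPv4 address.
# -F matches the text literally (dots are not wildcards); -w requires a
# whole-word match, so 10.0.0.1 does not match 10.0.0.10.
count_vip() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that from
    # being treated as a failure while the printed count stays "0".
    grep -Fwc -- "inet $1" || true
}
```

On a node, this could be run as ip a | count_vip VIP_address. A count above 1 on one node means the VIP sits on more than one interface there. A non-zero count on more than one node means the VIP is assigned on several nodes.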