This section describes the alerts for the Telegraf service.
Available starting from the 2019.2.5 maintenance update
Severity | Major |
---|---|
Summary | Telegraf failed to gather metrics. |
Raise condition | |
Description | Raises when Telegraf has encountered gathering errors on a node within the last 10 minutes. The `host` label in the raised alert contains the host name of the affected node. |
Troubleshooting | Inspect the Telegraf logs by running `journalctl -u telegraf` on the affected node. |
Tuning | Not required |
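
The troubleshooting step for this alert can be narrowed to the alert's 10-minute window. The following is a minimal sketch, assuming a systemd-managed `telegraf` unit and Telegraf's usual log format, in which error-level lines carry an `E!` marker. The `telegraf_errors` helper name is hypothetical and not part of the product.

```shell
# Hypothetical helper: keep only Telegraf error-level lines read from stdin.
# Assumes Telegraf's default log format, which marks errors with "E! ".
telegraf_errors() {
  # "|| true" keeps the exit status zero when no error lines are present.
  grep -F 'E! ' || true
}

# Usage on the affected node (same 10-minute window as the alert):
#   journalctl -u telegraf --since "10 minutes ago" --no-pager | telegraf_errors
```

An empty result suggests that the errors have stopped or that the log format differs from the assumed one, in which case the unfiltered `journalctl -u telegraf` output is the safer reference.
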
Available starting from the 2019.2.10 maintenance update
Severity | Major |
---|---|
Summary | Remote Telegraf failed to gather metrics. |
Raise condition | `rate(internal_agent_gather_errors{job="remote_agent"}[10m]) > 0` |
Description | Raises when the remote Telegraf has encountered gathering errors within the last 10 minutes. |
Troubleshooting | Inspect the logs of the Telegraf `monitoring_remote_agent` service. |
Tuning | Not required |
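
To confirm whether the raise condition for the remote Telegraf alert currently holds, the expression can be evaluated directly against Prometheus. The following is a sketch that assumes the Prometheus HTTP API is reachable at the address in `PROMETHEUS_URL`; the default `http://localhost:9090` is a placeholder and the `check_remote_gather_errors` helper name is hypothetical.

```shell
# Assumption: address of the Prometheus HTTP API; adjust for your deployment.
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"

# Raise condition copied from the alert definition.
QUERY='rate(internal_agent_gather_errors{job="remote_agent"}[10m]) > 0'

# Hypothetical helper: evaluate the raise condition through the Prometheus
# instant-query endpoint (/api/v1/query) and print the JSON response.
check_remote_gather_errors() {
  curl -sG "${PROMETHEUS_URL}/api/v1/query" --data-urlencode "query=${QUERY}"
}

# Usage:
#   check_remote_gather_errors
```

A non-empty `result` array in the JSON response means that at least one series currently satisfies the condition; its labels indicate which remote agent reports the errors.
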