Telegraf

This section lists the alerts for the Telegraf service.


TelegrafGatherErrors

Severity

Major

Summary

{{ $labels.job }} failed to gather metrics.

Description

The {{ $labels.job }} Prometheus target has reported gathering errors for the last 10 minutes.


TelegrafArpingCheckTargetDown

Severity

Major

Summary

Telegraf arping check Prometheus target is down.

Description

Prometheus fails to scrape metrics from the Telegraf arping check endpoint on the {{ $labels.node }} node (more than 1/10 failed scrapes).
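
As a rough illustration of the "more than 1/10 failed scrapes" condition, and not the actual alert expression, the following Python sketch treats each scrape as a 1 (success) or 0 (failure) sample, the way Prometheus records its `up` metric, and checks whether more than one tenth of the samples in a window failed. The function names, the default threshold argument, and the sample window are hypothetical and chosen only for this example.

```python
def failed_scrape_ratio(up_samples):
    """Return the fraction of failed scrapes in a window.

    Each sample is 1 for a successful scrape and 0 for a failed one.
    """
    if not up_samples:
        raise ValueError("window contains no samples")
    failed = sum(1 for sample in up_samples if sample == 0)
    return failed / len(up_samples)


def target_is_down(up_samples, threshold=0.1):
    """Return True when strictly more than `threshold` of the scrapes failed."""
    return failed_scrape_ratio(up_samples) > threshold


# Hypothetical window: 2 failed scrapes out of 10.
window = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
print(target_is_down(window))
```

Under this reading, a window with exactly one failure in ten scrapes does not meet the condition, because the comparison is strict.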


TelegrafArpingCheckTargetsOutage

Severity

Critical

Summary

Telegraf arping check Prometheus targets outage.

Description

Prometheus fails to scrape metrics from all Telegraf arping check endpoints (more than 1/10 failed scrapes).
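
The difference between a per-node "target down" alert and a "targets outage" alert can be sketched as follows. This is one possible reading of the descriptions, not the rule that the product ships: a node counts as down when more than one tenth of its scrapes failed, and an outage is assumed to mean that every known endpoint is down at once. The node names, function names, and sample windows are hypothetical.

```python
def exceeds_threshold(up_samples, threshold=0.1):
    """Return True when strictly more than `threshold` of the scrapes failed."""
    if not up_samples:
        raise ValueError("window contains no samples")
    failed = sum(1 for sample in up_samples if sample == 0)
    return failed / len(up_samples) > threshold


def classify(windows_by_node):
    """Return (sorted list of down nodes, whether all endpoints are down)."""
    down = sorted(
        node for node, window in windows_by_node.items()
        if exceeds_threshold(window)
    )
    # Assumption: an outage requires at least one endpoint, and all of them down.
    outage = bool(windows_by_node) and len(down) == len(windows_by_node)
    return down, outage


# Hypothetical scrape windows (1 = success, 0 = failure) keyed by node name.
windows = {
    "node-1": [1] * 10,
    "node-2": [0] * 10,
}
down_nodes, outage = classify(windows)
print(down_nodes, outage)
```

In this sketch a single failing node is reported individually, while the outage flag is set only when no endpoint is left scraping successfully.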


TelegrafDockerSwarmTargetDown

Severity

Critical

Summary

Telegraf Docker Swarm Prometheus target is down.

Description

Prometheus fails to scrape metrics from the Telegraf Docker Swarm endpoint(s) (more than 1/10 failed scrapes).


TelegrafOpenstackTargetDown

Severity

Critical

Summary

Telegraf OpenStack Prometheus target is down.

Description

Prometheus fails to scrape metrics from the Telegraf OpenStack service (more than 1/10 failed scrapes).


TelegrafSMARTTargetDown

Severity

Major

Summary

Telegraf SMART Prometheus target is down.

Description

Prometheus fails to scrape metrics from the Telegraf SMART endpoint on the {{ $labels.node }} node (more than 1/10 failed scrapes).


TelegrafSMARTTargetsOutage

Severity

Critical

Summary

Telegraf SMART Prometheus targets outage.

Description

Prometheus fails to scrape metrics from all Telegraf SMART endpoints (more than 1/10 failed scrapes).