This section describes the alerts for the Heka service.
Severity | Warning |
---|---|
Summary | The {{ $labels.queue }} queue is stalled on node {{ $labels.host }} for more than 1 hour. The corresponding Heka service is either down or stuck. |
Raise condition | heka_output_queue_size > 134217728 |
Description | Raises when Heka freezes and the output queue is larger than 134217728 bytes (128 MB). The `host` label in the raised alert contains the name of the affected node. |
Troubleshooting | Restart the corresponding Heka log collector using `service log_collector restart`. |
Tuning | Not required |
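
The raise condition in the table above can be sketched as a simple threshold check. This is a minimal illustration only, not part of Heka or the alerting stack: the names `THRESHOLD_BYTES` and `queue_is_stalled` are hypothetical, and the sketch assumes that `heka_output_queue_size` is reported in bytes, as the 128 MB figure in the description suggests.

```python
# Threshold from the raise condition: 128 * 1024 * 1024 bytes.
THRESHOLD_BYTES = 128 * 1024 * 1024  # equals 134217728


def queue_is_stalled(queue_size_bytes: int) -> bool:
    """Return True when a sample satisfies `heka_output_queue_size > 134217728`.

    Hypothetical helper; assumes the metric value is given in bytes.
    """
    return queue_size_bytes > THRESHOLD_BYTES


if __name__ == "__main__":
    # A queue exactly at the threshold does not raise; one byte more does.
    print(queue_is_stalled(THRESHOLD_BYTES))
    print(queue_is_stalled(THRESHOLD_BYTES + 1))
```

Because the comparison is strict, a queue of exactly 134217728 bytes does not satisfy the condition. Only a larger value does.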