Enable Fluentd to expose metrics generated from logs


You can configure Fluentd to expose metrics that are derived from log events. Such metrics allow you to monitor activities such as disk failures (the hdd_errors_total metric). By default, Fluentd generates metrics from the logs it gathers but does not expose them to Prometheus, so you must enable this explicitly. Prometheus then collects the Fluentd metrics through a static Prometheus endpoint. For details, see Add a custom monitoring endpoint. To generate metrics from logs, StackLight LMA uses the fluent-plugin-prometheus plugin.

To configure Fluentd to expose metrics generated from logs:

Log in to the Salt Master node.
Add the following class to the cluster/<cluster_name>/init.yml file of the Reclass model:
```
system.fluentd.label.default_metric.prometheus
```
This class creates a new label default_metric that is used as a generic interface to expose new metrics to Prometheus.

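(Optional) Create a filter for the metric.<metric_name> tag to generate the metric, where <metric_name> is the name of the tag that the matching log events are routed to, for example, metric.hdd_errors.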

Example (Reclass model):
fluentd:
  agent:
    label:
      default_metric:
        filter:
          metric_out_of_memory:
            tag: metric.out_of_memory
            type: prometheus
            metric:
              - name: out_of_memory_total
                type: counter
                desc: The total number of OOM.
            label:
              - name: host
                value: ${Hostname}
          metric_hdd_errors_parse:
            tag: metric.hdd_errors
            type: parser
            key_name: Payload
            parser:
              type: regexp
              format: '/(?<device>[sv]d[a-z]+\d*)/'
          metric_hdd_errors:
            tag: metric.hdd_errors
            require:
              - metric_hdd_errors_parse
            type: prometheus
            metric:
              - name: hdd_errors_total
                type: counter
                desc: The total number of hdd errors.
            label:
              - name: host
                value: ${Hostname}
              - name: device
                value: ${device}
      systemd:
        output:
          push_to_default:
            tag: '*.systemd'
            type: copy
            store:
              - type: relabel
                label: default_output
              - type: rewrite_tag_filter
                rule:
                  - name: Payload
                    regexp: '^Out of memory'
                    result: metric.out_of_memory
                  - name: Payload
                    regexp: >-
                      'error.+[sv]d[a-z]+\d*'
                    result: metric.hdd_errors
                  - name: Payload
                    regexp: >-
                      '[sv]d[a-z]+\d*.+error'
                    result: metric.hdd_errors
          push_to_metric:
            tag: 'metric.**'
            type: relabel
            label: default_metric
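
To see how the rewrite rules and the parser filter in the example interact, the following Python sketch mimics the routing outside of Fluentd. It is an illustration only and does not reflect how StackLight LMA or fluent-plugin-prometheus is implemented. The function names route and count_events and the sample log lines are hypothetical. The sketch assumes that the rules are tried in order with the first match winning, and that the quotes around the last two regexp values are not part of the pattern. The named group uses the Python (?P<device>...) spelling instead of the Ruby (?<device>...) one.

```python
import re
from collections import Counter

# Rewrite rules copied from the systemd output in the example.
# Assumption: rules are tried top to bottom and the first match wins.
REWRITE_RULES = [
    (re.compile(r"^Out of memory"), "metric.out_of_memory"),
    (re.compile(r"error.+[sv]d[a-z]+\d*"), "metric.hdd_errors"),
    (re.compile(r"[sv]d[a-z]+\d*.+error"), "metric.hdd_errors"),
]

# Pattern from metric_hdd_errors_parse, with a Python-style named group.
DEVICE_RE = re.compile(r"(?P<device>[sv]d[a-z]+\d*)")


def route(payload):
    """Return the metric tag for a log payload, or None if no rule matches."""
    for pattern, tag in REWRITE_RULES:
        if pattern.search(payload):
            return tag
    return None


def count_events(records):
    """Count routed events per metric name and label values."""
    counters = Counter()
    for record in records:
        tag = route(record["Payload"])
        if tag == "metric.out_of_memory":
            counters[("out_of_memory_total", record["Hostname"])] += 1
        elif tag == "metric.hdd_errors":
            # Both hdd rules contain the device pattern, so this search succeeds.
            device = DEVICE_RE.search(record["Payload"]).group("device")
            counters[("hdd_errors_total", record["Hostname"], device)] += 1
    return counters


if __name__ == "__main__":
    # Hypothetical systemd journal records, for illustration only.
    sample = [
        {"Hostname": "node01", "Payload": "Out of memory: Kill process 1234 (java)"},
        {"Hostname": "node01", "Payload": "blk_update_request: I/O error, dev sda, sector 2048"},
        {"Hostname": "node01", "Payload": "sdb1: read error"},
        {"Hostname": "node01", "Payload": "Started Session 42 of user root."},
    ]
    for key, value in sorted(count_events(sample).items()):
        print(key, value)
```

In this sketch, each counter key corresponds to one time series: the metric name plus the host label and, for hdd_errors_total, the device label extracted by the parser. Records that match no rule, such as the last sample line, produce no metric.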

updated: 2025-01-10 08:56
