The auditd events cause ‘backlog limit exceeded’ messages

If auditd generates a large number of events, some of them may be lost, and messages like the following appear repeatedly in dmesg or the kernel logs:

auditd: backlog limit exceeded

You may also observe a high or growing value of the lost counter in the auditctl -s output. For example:

auditctl -s
...
lost 1351280
...
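
The two symptoms above can be checked with a small script. The following is a minimal sketch, not part of the product tooling: the function names are hypothetical, and it assumes that the auditctl -s output lists one "name value" pair per line, as in the example above.

```shell
# Count the kernel log lines on stdin that report a backlog overrun.
count_backlog_warnings() {
    grep -c 'backlog limit exceeded'
}

# Print the value of the "lost" counter from `auditctl -s` output on stdin.
# Exits with a non-zero status if the counter is not present.
audit_lost_count() {
    awk '$1 == "lost" { print $2; found = 1 } END { exit !found }'
}

# Print how much the counter grew between two readings.
# Arguments: the previous and the current value (non-negative integers).
audit_lost_delta() {
    echo $(( $2 - $1 ))
}
```

On a node, the functions can be fed from the live commands, for example dmesg | count_backlog_warnings and auditctl -s | audit_lost_count. Taking two readings of the lost counter a minute apart and passing them to audit_lost_delta shows whether events are still being dropped.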

To resolve the issue, you may need to update the rules loaded to auditd and adjust the size of the backlog buffer.

Update the rules loaded to auditd

If a lot of rules are loaded into auditd, it may generate a lot of events and overrun the buffer. Therefore, verify and update your preset and custom rules. Preset rules are defined in the presetRules parameter. Custom rules are defined in the following parameters:

  • customRules

  • customRulesX32

  • customRulesX64

To verify and update the rules:

  1. In the Cluster object of the affected cluster, verify that the presetRules string does not start with the ! symbol.

  2. Verify all audit rules:

    1. Log in to the node that shows the buffer overrun symptoms, either through SSH or directly through the console.

    2. Run the following command:

      auditctl -l
      

      In the system response, identify the rules to exclude.

    3. In /etc/audit/rules.d, find the files containing the rules to exclude.

      • If the file is named 60-custom.rules, remove the rules from any of the following parameters located in the Cluster object:

        • customRules

        • customRulesX32

        • customRulesX64

      • If the file is named 50-<NAME>.rules, and you want to exclude all rules from that file, exclude the preset named <NAME> from the list of allowed presets defined under presetRules in the Cluster object.

      • If the file is named 50-<NAME>.rules, and you want to exclude only several rules from that file:

        1. Copy the rules you want to keep to one of the following parameters located in the Cluster object:

          • customRules

          • customRulesX32

          • customRulesX64

        2. Exclude the preset named <NAME> from the list of allowed presets.
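
The file lookup in the procedure above can be scripted. The following is a minimal sketch with hypothetical function names. It assumes only the file naming scheme described above (60-custom.rules for custom rules, 50-<NAME>.rules for presets). Note that auditctl -l may print a rule in a slightly different form than the one stored in the file, so searching for a distinctive fragment, such as the rule key, is usually more reliable than searching for the whole line.

```shell
# List the files under a rules directory that contain the given text.
# Arguments: the text to search for (fixed string) and, optionally,
# the directory to search (defaults to /etc/audit/rules.d).
find_rule_files() {
    # -e keeps grep from treating a leading "-" in the rule as an option.
    grep -rlF -e "$1" "${2:-/etc/audit/rules.d}"
}

# Classify a rules file by its name.
# Prints "custom" for 60-custom.rules, "preset <NAME>" for 50-<NAME>.rules,
# and "other" for any other file name.
classify_rules_file() {
    name=$(basename "$1")
    case "$name" in
        60-custom.rules)
            echo "custom" ;;
        50-*.rules)
            preset=${name#50-}
            echo "preset ${preset%.rules}" ;;
        *)
            echo "other" ;;
    esac
}
```

A result of custom means the rule must be removed from the custom rules parameters of the Cluster object, while preset <NAME> points to the preset to exclude from presetRules.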

Adjust the size of the backlog buffer

By default, the backlog buffer holds 8192 audit records, which is enough for most use cases. To prevent buffer overrun, you can adjust the default value to fit your needs. Keep in mind that the buffer resides in RAM, so increasing this value increases memory requirements.

To estimate the RAM requirements for the buffer, use the following reference values:

  • A buffer of 8192 audit records uses ~70 MiB of RAM

  • A buffer of 15000 audit records uses ~128 MiB of RAM
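
For other buffer sizes, the two reference values can be extrapolated. The following is a rough sketch that assumes memory use grows linearly with the number of records. The per-record cost of 8960 bytes is not a documented constant: it is derived from the first reference value (70 MiB / 8192 records), and the function name is hypothetical.

```shell
# Assumed average cost of one queued audit record, in bytes.
# Derived from the 8192-record reference value: 70 MiB / 8192 = 8960 bytes.
BYTES_PER_RECORD=8960

# Print the approximate RAM use, in MiB, of a backlog buffer of $1 records.
# The result is rounded to the nearest MiB.
estimate_backlog_mib() {
    echo $(( ($1 * BYTES_PER_RECORD + 524288) / 1048576 ))
}
```

With this assumption, estimate_backlog_mib 15000 prints 128, which matches the second reference value, and estimate_backlog_mib 16384 prints 140. Treat the results as an upper-bound estimate for a full buffer, not as a measurement.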

To change the backlog buffer size, adjust the backlogLimit value in the Cluster object of the affected cluster.

You can also change the size directly on a node to verify the result immediately. However, to change the size permanently, use the Cluster object.

To adjust the size of the backlog buffer on a node:

  1. Log in to the affected node through SSH or directly through the console.

  2. If enabledAtBoot is enabled, adjust the audit_backlog_limit value in kernel options:

    1. List the GRUB configuration files where GRUB_CMDLINE_LINUX is defined:

      grep -rl 'GRUB_CMDLINE_LINUX' /etc/default/grub /etc/default/grub.d/* \
      | sort -u
      
    2. In each file obtained in the previous step, edit the GRUB_CMDLINE_LINUX string by changing the integer value after audit_backlog_limit= to the desired value.

  3. In /etc/audit/rules.d/audit.rules, adjust the buffer size by editing the integer value after -b.

  4. Select from the following options:

    • If the auditd configuration is not immutable, restart the auditd service:

      systemctl restart auditd.service
      
    • If the auditd configuration is immutable, reboot the node. The auditd configuration is immutable if any of the following conditions are met:

      • In the auditctl -s output, the enabled parameter is set to 2

      • The -e 2 flag is defined explicitly in parameters of any custom rule

      • The immutable preset is defined explicitly

      • The virtual preset all is enabled and the immutable preset is not excluded explicitly

      Caution

      Plan the node reboot according to your maintenance schedule. For the exact reboot procedure, follow your maintenance policies.

  5. If the backlog limit exceeded messages no longer appear, make the change permanent by setting the backlogLimit value in the Cluster object.
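
Steps 3 and 4 of the procedure above can be sketched as two helper functions. This is an illustration with hypothetical function names, not a supported tool. It assumes that the -b option sits on its own line in the rules file and that auditctl -s prints the enabled parameter as a "name value" pair. The immutability check covers only the first condition from the list above (enabled set to 2 on the running system); the conditions defined in the Cluster object still have to be checked separately.

```shell
# Print the contents of a rules file with the integer after -b replaced
# by the new value. The file itself is not modified, so the result can be
# reviewed before it is written back.
# Arguments: path to the rules file, new backlog size.
with_backlog_limit() {
    sed -E "s/^([[:space:]]*-b[[:space:]]+)[0-9]+/\1$2/" "$1"
}

# Succeed (exit status 0) if the `auditctl -s` output on stdin reports
# enabled = 2, which means the running configuration is locked until reboot.
audit_is_immutable() {
    awk '$1 == "enabled" && $2 == 2 { locked = 1 } END { exit !locked }'
}
```

For example, with_backlog_limit /etc/audit/rules.d/audit.rules 16384 prints the rules with the new buffer size, and auditctl -s | audit_is_immutable tells whether a reboot is required instead of a service restart. After the restart or reboot, the backlog_limit line in the auditctl -s output shows the value that is actually in effect.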