Randomize RabbitMQ reconnection intervals

Note

This feature is available starting from the MCP 2019.2.15 maintenance update. Before using the feature, follow the steps described in Apply maintenance updates.

You can randomize RabbitMQ reconnection intervals (or timeouts) for the required OpenStack services. This is helpful in large OpenStack environments, where the simultaneous reconnection of all OpenStack services after a RabbitMQ cluster partition can significantly prolong the RabbitMQ cluster recovery or cause the cluster to enter split-brain mode.

When this feature is enabled, the following OpenStack configuration options are randomized:

kombu_reconnect_delay - from 30 to 60 seconds
rabbit_retry_interval - from 10 to 60 seconds
rabbit_retry_backoff - from 30 to 60 seconds
rabbit_interval_max - from 60 to 180 seconds
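
As a rough illustration of what randomizing within these ranges means, the following Python sketch draws one value per option from the ranges listed above. The helper name draw_random_timeouts and the use of uniform integer draws are assumptions made for this example; the Salt formula may generate the values differently.

```python
import random

# Ranges (in seconds) taken from the list above.
TIMEOUT_RANGES = {
    "kombu_reconnect_delay": (30, 60),
    "rabbit_retry_interval": (10, 60),
    "rabbit_retry_backoff": (30, 60),
    "rabbit_interval_max": (60, 180),
}


def draw_random_timeouts(rng=None):
    """Hypothetical helper: pick one integer per option, bounds inclusive."""
    if rng is None:
        rng = random.Random()
    return {
        option: rng.randint(low, high)
        for option, (low, high) in TIMEOUT_RANGES.items()
    }


if __name__ == "__main__":
    # Every call yields a fresh set of values, so two services (or two runs)
    # are unlikely to end up with identical reconnection timings.
    for option, seconds in draw_random_timeouts().items():
        print(f"{option} = {seconds}")
```

Because each service on each node gets its own draw, reconnection attempts after a partition are spread over time instead of arriving at the RabbitMQ cluster all at once.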

To randomize RabbitMQ reconnection intervals:

1. Open your project Git repository with the Reclass model on the cluster level.
2. Open the configuration file of the required OpenStack service. For example, for the OpenStack Compute service (Nova), open <cluster_name>/openstack/compute/init.yml.
3. Under message_queue, specify rabbit_timeouts_random: True:

   ```
   parameters:
     nova:
       compute:
         message_queue:
           rabbit_timeouts_random: True
   ```

4. Log in to the Salt Master node.
5. Apply the corresponding OpenStack service state(s). For example, for the OpenStack Compute service (Nova), apply the following state:

   ```
   salt -C 'I@nova:compute' state.sls nova.compute
   ```

   Note

   Each service configured with this feature receives new unique timeouts on every node each time the corresponding OpenStack service Salt state runs.

6. Repeat steps 2-5 for other OpenStack services as required.
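
To check the result on a node, you could inspect the rendered service configuration. The Python sketch below is a hypothetical helper, not part of MCP. It assumes that the four options are rendered into the [oslo_messaging_rabbit] section of the service configuration file (for Nova, typically /etc/nova/nova.conf) and reports any option that is missing or falls outside the ranges listed earlier. The sample configuration text is made up for illustration.

```python
import configparser

# Ranges (in seconds) from the list of randomized options.
EXPECTED_RANGES = {
    "kombu_reconnect_delay": (30, 60),
    "rabbit_retry_interval": (10, 60),
    "rabbit_retry_backoff": (30, 60),
    "rabbit_interval_max": (60, 180),
}


def out_of_range_options(conf_text, section="oslo_messaging_rabbit"):
    """Return {option: raw_value} for options missing or outside their range."""
    # Interpolation is disabled so that '%' characters elsewhere in the file
    # are read literally; strict=False tolerates duplicate keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    problems = {}
    for option, (low, high) in EXPECTED_RANGES.items():
        raw = parser.get(section, option, fallback=None)
        if raw is None or not low <= float(raw) <= high:
            problems[option] = raw
    return problems


# Made-up sample; on a real node, read the text of the service config file.
SAMPLE = """\
[DEFAULT]
debug = false

[oslo_messaging_rabbit]
kombu_reconnect_delay = 42
rabbit_retry_interval = 17
rabbit_retry_backoff = 55
rabbit_interval_max = 120
"""

if __name__ == "__main__":
    print(out_of_range_options(SAMPLE) or "all options within range")
```

Running the check before and after a state run on the same node should also show different values, since the timeouts are regenerated on every run.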

updated: 2025-01-10 08:56
