irqbalance module¶
TechPreview since 2.27.0 (17.2.0 and 16.2.0)
The irqbalance module allows the cloud operator to install and configure the irqbalance service on cluster machines using the day-2 operations API.
Note
This module is implemented and validated against the following Ansible versions provided by Container Cloud for Ubuntu 20.04 and 22.04 in the Cluster releases 16.2.0 and 17.2.0: Ansible core 2.12.10 and Ansible collection 5.10.0.
To verify the Ansible version in a specific Cluster release, refer to the Cluster releases section in Release Notes. Use the Artifacts > System and MCR artifacts section of the corresponding Cluster release. For example, for 17.2.0.
For information on the irqbalance service, refer to the official irqbalance documentation for Ubuntu 22.04 and the upstream GitHub project.
Default configuration for the irqbalance service¶
The default configuration file /etc/default/irqbalance can contain the following settings, as defined in the irqbalance documentation:
# irqbalance is a daemon process that distributes interrupts across
# CPUs on SMP systems. The default is to rebalance once every 10
# seconds. This is the environment file that is specified to systemd via the
# EnvironmentFile key in the service unit file (or via whatever method the init
# system you're using has).
#
# IRQBALANCE_ONESHOT
# After starting, wait for a minute, then look at the interrupt
# load and balance it once; after balancing exit and do not change
# it again.
#
#IRQBALANCE_ONESHOT=
#
# IRQBALANCE_BANNED_CPUS
# 64 bit bitmask which allows you to indicate which CPUs should
# be skipped when reblancing IRQs. CPU numbers which have their
# corresponding bits set to one in this mask will not have any
# IRQs assigned to them on rebalance.
#
#IRQBALANCE_BANNED_CPUS=
#
# IRQBALANCE_BANNED_CPULIST
# The CPUs list which allows you to indicate which CPUs should
# be skipped when reblancing IRQs. CPU numbers in CPUs list will
# not have any IRQs assigned to them on rebalance.
#
# The format of CPUs list is:
# <cpu number>,...,<cpu number>
# or a range:
# <cpu number>-<cpu number>
# or a mixture:
# <cpu number>,...,<cpu number>-<cpu number>
#
#IRQBALANCE_BANNED_CPULIST=
#
# IRQBALANCE_ARGS
# Append any args here to the irqbalance daemon as documented in the man
# page.
#
#IRQBALANCE_ARGS=
Note
IRQBALANCE_BANNED_CPUS is deprecated in irqbalance v1.8.0, which is used in Ubuntu 22.04, and is being replaced with IRQBALANCE_BANNED_CPULIST. For details, see Release notes for irqbalance v1.8.0.
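Because IRQBALANCE_BANNED_CPUS is deprecated in favor of IRQBALANCE_BANNED_CPULIST, an existing bitmask may need to be rewritten as a CPU list. The following is a minimal sketch of such a conversion, not a tool shipped with irqbalance or the module. It assumes that the mask is written as hexadecimal digits, optionally prefixed with 0x and optionally split into comma-separated groups. The mask_to_cpulist function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of irqbalance or the irqbalance module.
# Converts a hexadecimal CPU bitmask into the CPU list format
# (<cpu>,<cpu>-<cpu>) described for IRQBALANCE_BANNED_CPULIST.
# Assumption: the mask contains only hex digits, with an optional 0x
# prefix and optional commas between digit groups. Input is not validated.
mask_to_cpulist() (
  mask=$(printf '%s' "$1" | tr -d ',')
  mask=${mask#0x}
  cpu=0 out="" start="" prev=""

  # Append the finished range start..prev to the output list.
  flush_range() {
    if [ "$start" -eq "$prev" ]; then
      out="${out:+$out,}$start"
    else
      out="${out:+$out,}$start-$prev"
    fi
  }

  # Walk the hex digits from the least significant (rightmost) one.
  while [ -n "$mask" ]; do
    digit=${mask#"${mask%?}"}   # last character of the mask
    mask=${mask%?}              # drop that character
    value=$((0x$digit))
    for bit in 0 1 2 3; do
      if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
        # A gap since the previous set bit closes the current range.
        if [ -n "$start" ] && [ "$cpu" -ne $((prev + 1)) ]; then
          flush_range
          start=""
        fi
        if [ -z "$start" ]; then
          start=$cpu
        fi
        prev=$cpu
      fi
      cpu=$((cpu + 1))
    done
  done

  if [ -n "$start" ]; then
    flush_range
  fi
  printf '%s\n' "$out"
)
```

For example, `mask_to_cpulist 8000ffff` prints `0-15,31`. Processing the mask one hex digit at a time avoids the 64-bit limit of shell arithmetic, so the sketch also handles masks for hosts with more than 64 CPUs.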
Setting empty values for the irqbalance parameters¶
When the cloud operator defines values for the irqbalance module in the HostOSConfiguration (HOC) object, those values overwrite the corresponding parameters in the /etc/default/irqbalance file. If the operator does not define a value, the corresponding parameter in the /etc/default/irqbalance configuration file keeps its current value.
For example, if you define values.args in the HOC object, this value overwrites the IRQBALANCE_ARGS parameter in /etc/default/irqbalance. Otherwise, the IRQBALANCE_ARGS value remains the same in the configuration file.
If you need to provide an empty IRQBALANCE_ARGS value, you can define values.args: "" (empty string) in the HOC object. Other parameters defined in /etc/default/irqbalance follow the same logic.
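Because undefined values leave the file untouched, it can help to check which parameters are currently active on a host before and after applying the HOC object. The following is a minimal sketch, assuming the default file format shown above, where inactive parameters are commented out with #. The show_irqbalance_settings function name is hypothetical and not part of the module.

```shell
#!/bin/sh
# Hypothetical helper, not part of the irqbalance module.
# Prints only the active (uncommented) IRQBALANCE_* assignments.
# The file path defaults to /etc/default/irqbalance and can be
# overridden with the first argument.
show_irqbalance_settings() {
  conf=${1:-/etc/default/irqbalance}
  # Commented lines start with '#', so they do not match this pattern.
  grep -E '^[[:space:]]*IRQBALANCE_[A-Z_]+=' "$conf"
}
```

The function returns a non-zero exit status when no parameter is active, which is the grep behavior for no matches.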
Version 1.1.0 (latest)¶
The module allows installing, configuring, and enabling or disabling the irqbalance service on cluster machines.
Since v1.0.0, the following changes apply to the irqbalance module:
- Added the oneshot parameter.
- Changed the method of setting empty values for the irqbalance parameters for better usability:
  - When a parameter is not defined in values of the HOC object, the corresponding value remains the same in the irqbalance configuration file.
  - When a parameter is set to "" (empty string) in values of the HOC object, the corresponding value in the irqbalance configuration file is also set to "" (empty string).
The module accepts the following parameters, all of which are optional:
| Parameter | Description |
|---|---|
| | Enables the |
| | Defines the |
| | Defines the Caution |
| | Defines the |
| | Defines the name of the irqbalance policy script, which is bash-compatible. |
| | Defines the full file path to store the irqbalance policy script that can be used with the |
| | Enables the update of |
| | Defines the Note |
Caution
When you configure the policy script, at least the following parameters must be set: args, policy_script, and policy_script_filepath. Otherwise, the corresponding error message will be displayed in the status of the HostOSConfiguration object.
Note
If an error message in the status of the HostOSConfiguration object contains schema validation failed, verify whether the types of used parameters are correct and whether the used combination of parameters is allowed.
Note
If you enable the service without setting banned_cpulist, banned_cpus, oneshot, or args, the corresponding values in /etc/default/irqbalance remain as they were before applying the current HostOSConfiguration.
Version 1.0.0¶
The module allows installing, configuring, and enabling or disabling the irqbalance service on cluster machines.
The module accepts the following parameters, all of which are optional:
| Parameter | Description |
|---|---|
| | Enable the |
| | The |
| | The |
| | The |
| | The irqbalance policy script, which is bash-compatible. |
| | The full file path name to store the irqbalance policy script that can be used with the |
| | Enables the update of |
Caution
When you configure the policy script, at least three parameters must be set: args, policy_script, and policy_script_filepath. Otherwise, the corresponding error message will be displayed in the status of the HostOSConfiguration object.
Note
If an error message in the status of the HostOSConfiguration object contains schema validation failed, verify the following:
- Whether the types of used parameters are correct
- Whether the used combination of parameters is allowed
Note
If you enable the service without setting banned_cpulist, banned_cpus, or args, the corresponding values in /etc/default/irqbalance will remain as they were before applying the current HostOSConfiguration.
Module configuration examples¶
Example 1. Run irqbalance using defaults¶
spec:
  ...
  configs:
    ...
    - description: Example irqbalance configuration
      module: irqbalance
      moduleVersion: 1.0.0
      order: 1
      phase: "reconfigure"
      values: {}
As a result of this configuration, no parameters will be set or overridden in the irqbalance configuration file.
Example 2. Run irqbalance and exclude certain CPU cores from IRQ balancing¶
spec:
  ...
  configs:
    ...
    - description: Example irqbalance configuration
      module: irqbalance
      moduleVersion: 1.0.0
      order: 1
      phase: "reconfigure"
      values:
        banned_cpulist: "0-15,31"
        args: "--journal"
As a result of this configuration, IRQBALANCE_BANNED_CPULIST and IRQBALANCE_ARGS will be set or overridden, and IRQBALANCE_BANNED_CPUS will be removed from the irqbalance configuration file.
Example 3. Run irqbalance using the policy script¶
spec:
  ...
  configs:
    ...
    - description: Example irqbalance configuration
      module: irqbalance
      moduleVersion: 1.0.0
      order: 1
      phase: "reconfigure"
      values:
        args: "--policyscript=/etc/default/irqbalance-numa.sh"
        policy_script: |
          #!/bin/bash
          # specifying a -1 here forces irqbalance to consider an interrupt from a
          # device to be equidistant from all NUMA nodes.
          echo 'numa_node=-1'
        policy_script_filepath: "/etc/default/irqbalance-numa.sh"
As a result of this configuration:
- The IRQBALANCE_ARGS parameter will be set or overridden in the irqbalance configuration file
- The contents of policy_script will be written to /etc/default/irqbalance-numa.sh
- The irqbalance service will use the provided policy script
For the policy script description, refer to the irqbalance documentation. In particular, refer to the numa_node variable used in the example.
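To check a policy script before referencing it in args, you can run it once by hand. The sketch below assumes, based on the upstream irqbalance man page, that irqbalance invokes the script with the sysfs device path and the IRQ number as arguments and reads key=value lines from its output. Verify this against the irqbalance version on your hosts. The try_policy_script function name, the POLICY_SHELL variable, and the placeholder arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of irqbalance or the irqbalance module.
# Runs a policy script once by hand and prints the key=value lines it emits.
# Assumption: the script receives a sysfs device path and an IRQ number.
try_policy_script() {
  script=$1
  device=${2:-/sys/devices/placeholder}   # placeholder sysfs device path
  irq=${3:-0}                             # placeholder IRQ number
  if [ ! -r "$script" ]; then
    echo "cannot read policy script: $script" >&2
    return 1
  fi
  # bash is the default interpreter because the policy script is described
  # as bash-compatible. Set POLICY_SHELL to use another shell.
  "${POLICY_SHELL:-bash}" "$script" "$device" "$irq" | grep -E '^[a-z_]+='
}
```

For the script from Example 3, `try_policy_script /etc/default/irqbalance-numa.sh` prints `numa_node=-1`. The script there ignores its arguments, so the placeholder values do not affect the result.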
Troubleshoot irqbalance on the target host¶
| Action | Command |
|---|---|
| Verify the service status | sudo systemctl status irqbalance |
| Verify the configuration | less /etc/default/irqbalance |
| Verify the | less /etc/init.d/irqbalance |
| Verify logs | journalctl -u irqbalance* |
| Verify statistics of interrupts | less -S /proc/interrupts |
| Verify connections of NICs to NUMA nodes | cat /sys/class/net/<nic_name>/device/numa_node |
Note The |
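To run the NIC-to-NUMA check for every NIC at once, the cat command from the table can be wrapped in a loop. The following is a minimal sketch. The list_nic_numa_nodes function name is hypothetical, and interfaces without a device/numa_node file (for example, virtual interfaces) are skipped.

```shell
#!/bin/sh
# Hypothetical helper, not part of the irqbalance module.
# Prints "<nic_name> <numa_node>" for every NIC that exposes
# device/numa_node under the given base directory.
list_nic_numa_nodes() {
  base=${1:-/sys/class/net}
  for f in "$base"/*/device/numa_node; do
    # Skip the unexpanded pattern and unreadable entries.
    [ -r "$f" ] || continue
    nic=${f#"$base"/}     # strip the base directory
    nic=${nic%%/*}        # keep only the interface name
    printf '%s %s\n' "$nic" "$(cat "$f")"
  done
}
```

The base directory is a parameter only so that the function can be tried against a test directory. On a target host, call it without arguments.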