This section describes the alerts for the Open vSwitch (OVS) processes.
Warning

The OVSInstanceArpingCheckDown alert is available starting from the MCP 2019.2.4 update. The OVSTooManyPortRunningOnAgent, OVSErrorOnPort, OVSNonInternalPortDown, and OVSGatherFailed alerts are available starting from the MCP 2019.2.6 update.

Available starting from the 2019.2.3 maintenance update
Severity | Warning |
---|---|
Summary | The ovs-vswitchd process consumes more than 20% of system memory. |
Raise condition | procstat_memory_vms{process_name="ovs-vswitchd"} / on(host) mem_total > 0.2 |
Description | Raises when the virtual memory of the ovs-vswitchd process exceeds 20% of the host memory. |
Tuning | Not required |
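The raise condition above can be reproduced by hand on a suspect host. A minimal sketch, assuming a Linux node where ovs-vswitchd may or may not be running; `ps` reports VSZ in KiB and `/proc/meminfo` reports MemTotal in KiB, so the units cancel in the ratio:

```shell
# Manually compute the alert ratio: ovs-vswitchd virtual memory / total host memory.
vms_kb=$(ps -o vsz= -C ovs-vswitchd 2>/dev/null | head -n1)
mem_kb=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
# Default to 0 if the process is not running, so the script still prints a ratio.
awk -v v="${vms_kb:-0}" -v m="$mem_kb" 'BEGIN {printf "vms/mem_total = %.2f\n", v/m}'
```

A value above 0.2 matches the Warning condition; above 0.3, the Critical one.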
Available starting from the 2019.2.3 maintenance update
Severity | Critical |
---|---|
Summary | The ovs-vswitchd process consumes more than 30% of system memory. |
Raise condition | procstat_memory_vms{process_name="ovs-vswitchd"} / on(host) mem_total > 0.3 |
Description | Raises when the virtual memory of the ovs-vswitchd process exceeds 30% of the host memory. |
Tuning | Not required |
Available starting from the 2019.2.4 maintenance update
Severity | Major |
---|---|
Summary | The OVS instance arping check is down. |
Raise condition | instance_arping_check_up == 0 |
Description | Raises when the OVS instance arping check on the {{ $labels.host }} node is down for 2 minutes. The host label in the raised alert contains the affected node name. |
Tuning | Not required |
Available starting from the 2019.2.6 maintenance update
Severity | Major |
---|---|
Summary | The number of OVS ports is {{ $value }} (ovs-vsctl list port) on the {{ $labels.host }} host, which is more than the expected limit. |
Raise condition | sum by (host) (ovs_bridge_status) > 1500 |
Description | Raises when too many networks are created or OVS does not properly clean up the OVS ports. OVS may malfunction if too many ports are assigned to a single agent. Warning: for production environments, configure the alert after deployment. |
Troubleshooting | |
Tuning | For example, to change the threshold to |
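The port count that the summary refers to can be checked by hand. A sketch, assuming Open vSwitch is installed on the node and the command is run with sufficient privileges; it counts the `name` rows that `ovs-vsctl list port` prints, one per port:

```shell
# Count entries in the OVS Port table (one "name:" line per port).
# Prints 0 when ovs-vsctl is unavailable or no ports exist.
port_count=$(ovs-vsctl list port 2>/dev/null | grep -c '^name' || true)
echo "OVS ports: ${port_count}"
```

Compare the result against the alert threshold (1500 by default).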
Available starting from the 2019.2.6 maintenance update
Severity | Critical |
---|---|
Summary | The {{ $labels.port }} OVS port on the {{ $labels.bridge }} bridge running on the {{ $labels.host }} host is reporting errors. |
Raise condition | ovs_bridge_status == 2 |
Description | Raises when an OVS port reports errors, indicating that the port is not working properly. |
Troubleshooting | |
Tuning | Not required |
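One way to locate the failing port by hand is to dump the `error` column of the OVS Interface table, which is populated when a port cannot be configured. A sketch, assuming ovs-vsctl is available on the affected host; the exact rendering of an empty `error` cell (empty string or `[]`) can vary, so both are filtered out:

```shell
# List interfaces whose "error" column is non-empty.
# --format=csv prints "name,error" rows; the first row is the header.
ovs-vsctl --format=csv --columns=name,error list Interface 2>/dev/null \
  | awk -F, 'NR > 1 && $2 != "" && $2 != "[]"' || true
```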
Available starting from the 2019.2.6 maintenance update
Severity | Critical |
---|---|
Summary | The {{ $labels.port }} OVS port on the {{ $labels.bridge }} bridge running on the {{ $labels.host }} host is down. |
Raise condition | ovs_bridge_status{type!="internal"} == 0 |
Description | Raises when the port on the OVS bridge is in the DOWN state, which may lead to an unexpected network disturbance. |
Troubleshooting | |
Tuning | Not required |
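The affected port can be inspected by hand through the `type` and `link_state` columns of the Interface table. A sketch, assuming ovs-vsctl is present on the node; it mirrors the raise condition by excluding internal ports:

```shell
# Show non-internal interfaces whose link_state is "down".
# CSV columns: name, type, link_state; skip the header row.
ovs-vsctl --format=csv --columns=name,type,link_state list Interface 2>/dev/null \
  | awk -F, 'NR > 1 && $2 != "internal" && $3 == "down"' || true
```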
Available starting from the 2019.2.6 maintenance update
Severity | Critical |
---|---|
Summary | Failure to gather the OVS information on the {{ $labels.host }} host. |
Raise condition | ovs_bridge_check == 0 |
Description | Raises when the check script for the OVS bridge fails to gather data. OVS is not monitored. |
Troubleshooting | Run /usr/local/bin/ovs_parse_bridge.py from the affected host and inspect the output. |
Tuning | Not required |