OpenContrail vRouter
This section describes the alerts for the OpenContrail vRouter.
ContrailBGPSessionsNoEstablished
Severity |
Warning |
Summary |
There are no established OpenContrail BGP sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
max(contrail_bgp_session_count) by (host) == 0 |
Description |
Raises when no BGP sessions in the established state (FSM) exist on
a node. The host label in the raised alert contains the host name of
the affected node. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
bgp_peer module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailBGPSessionsNoActive
Severity |
Warning |
Summary |
There are no active OpenContrail BGP sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
max(contrail_bgp_session_up_count) by (host) == 0 |
Description |
Raises when no BGP sessions in the active state (FSM) exist on a
node. The host label in the raised alert contains the host name of
the affected node. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
bgp_peer module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailBGPSessionsDown
Severity |
Warning |
Summary |
The OpenContrail BGP sessions on the {{ $labels.host }} node are
down for 2 minutes. |
Raise condition |
min(contrail_bgp_session_down_count) by (host) > 0 |
Description |
Raises when a node has BGP sessions in the down state. The host
label in the raised alert contains the host name of the affected node. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
bgp_peer module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailXMPPSessionsMissingEstablished
Severity |
Warning |
Summary |
The OpenContrail XMPP sessions in the established state are missing
on the compute cluster for 2 minutes. |
Raise condition |
count(contrail_vrouter_xmpp) * 2 - sum(contrail_xmpp_session_up_count)
> 0 |
Description |
Raises when the compute cluster is missing OpenContrail XMPP sessions in
the established state (FSM), that is, when the total number of
established sessions is lower than 2 per vRouter. No assumption is made
about an equal distribution of sessions across the cluster: a single
vRouter can have 0 sessions in the working state. However, a properly
operating compute cluster must have at least 2 connections per vRouter. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
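The raise condition of ContrailXMPPSessionsMissingEstablished can be checked by hand with simple arithmetic. The following Python sketch mirrors the expression, assuming one contrail_vrouter_xmpp series per vRouter and one contrail_xmpp_session_up_count series per control node. The function name and the sample values are hypothetical.

```python
def missing_xmpp_sessions(vrouter_series_count, up_counts, per_vrouter=2):
    """Mirror count(contrail_vrouter_xmpp) * 2 - sum(contrail_xmpp_session_up_count).

    vrouter_series_count: number of contrail_vrouter_xmpp series
        (assumed to be one per vRouter).
    up_counts: established session counts, one value per control node (assumed).
    """
    return vrouter_series_count * per_vrouter - sum(up_counts)


# Hypothetical cluster: 3 vRouters, 3 control nodes.
# The sessions are unevenly distributed, but 6 are established in total.
healthy = missing_xmpp_sessions(3, [3, 3, 0])

# One session dropped: 5 established instead of the expected 6.
degraded = missing_xmpp_sessions(3, [3, 2, 0])

for name, value in (("healthy", healthy), ("degraded", degraded)):
    print(name, "missing:", value, "alert raises:", value > 0)
```

Because the expression compares cluster-wide totals, an uneven distribution of sessions does not raise the alert as long as the sum reaches 2 per vRouter.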
ContrailXMPPSessionsMissing
Severity |
Warning |
Summary |
The OpenContrail XMPP sessions are missing on the compute cluster for
2 minutes. |
Raise condition |
count(contrail_vrouter_xmpp) * 2 - sum(contrail_xmpp_session_count) >
0 |
Description |
Raises when the compute cluster is missing OpenContrail XMPP sessions in
any state, that is, when the total number of sessions is lower than 2
per vRouter. The conditions are the same as for the
ContrailXMPPSessionsMissingEstablished alert. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailXMPPSessionsDown
Severity |
Warning |
Summary |
The {{ $labels.host }} node contains the OpenContrail XMPP sessions
in the down state for 2 minutes. |
Raise condition |
min(contrail_xmpp_session_down_count) by (host) > 0 |
Description |
Raises when a node has OpenContrail XMPP sessions in the DOWN state.
The host label in the raised alert contains the host name of the
affected node. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailXMPPSessionsTooHigh
Severity |
Warning |
Summary |
There are more than 500 open OpenContrail XMPP sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
min(contrail_xmpp_session_count) by (host) >= 500 |
Description |
Raises when the number of OpenContrail XMPP sessions reaches 500 on one
node. The host label in the raised alert contains the host name of
the affected node.
Warning
For production environments, configure the alert after
deployment.
|
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
For example, to change the threshold to 1000 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailXMPPSessionsTooHigh:
              if: >-
                min(contrail_xmpp_session_count) by (host) >= 1000
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
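As an alternative to the web UI, the updated definition can also be checked through the Prometheus HTTP API, which lists the loaded rules under /api/v1/rules. The following Python sketch is one possible way to do it. The helper names are hypothetical, the Prometheus URL is deployment-specific and not given in this section, and the sample response is hand-written and trimmed to the fields used here.

```python
import json
import urllib.request


def find_alert_query(rules_response, alert_name):
    """Return the expression of the named alerting rule, or None if not found.

    rules_response: parsed JSON body of GET /api/v1/rules.
    """
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("type") == "alerting" and rule.get("name") == alert_name:
                return rule.get("query")
    return None


def fetch_rules(prometheus_url):
    """Fetch the loaded rules from a Prometheus server (URL is an assumption)."""
    url = prometheus_url.rstrip("/") + "/api/v1/rules"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# Hand-written sample of a /api/v1/rules response, for illustration only.
sample = {
    "status": "success",
    "data": {"groups": [{"name": "example.rules", "rules": [{
        "type": "alerting",
        "name": "ContrailXMPPSessionsTooHigh",
        "query": "min(contrail_xmpp_session_count) by (host) >= 1000",
    }]}]},
}

print(find_alert_query(sample, "ContrailXMPPSessionsTooHigh"))
```

Against a live server, the call would look like find_alert_query(fetch_rules("http://<prometheus_host>:<port>"), "ContrailXMPPSessionsTooHigh"). Prometheus may print the expression in a normalized form, so compare the threshold rather than the exact string.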
ContrailXMPPSessionsChangesTooHigh
Severity |
Warning |
Summary |
The OpenContrail XMPP sessions on the {{ $labels.host }} node have
changed more than 100 times. |
Raise condition |
abs(delta(contrail_xmpp_session_count[2m])) >= 100 |
Description |
Raises when the number of OpenContrail XMPP session changes reaches 100
on one node, calculated as an absolute difference of the first and last
points in a two-minute time frame. The host label in the raised alert
contains the host name of the affected node.
Warning
For production environments, configure the alert after
deployment.
|
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
For example, to change the threshold to 250 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailXMPPSessionsChangesTooHigh:
              if: >-
                abs(delta(contrail_xmpp_session_count[2m])) >= 250
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
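The "absolute difference of the first and last points" used by ContrailXMPPSessionsChangesTooHigh can be illustrated with a short Python sketch. It is a simplification: the Prometheus delta() function also extrapolates toward the boundaries of the time frame, which is ignored here. The function name and the sample values are hypothetical.

```python
def abs_delta(samples):
    """Approximate abs(delta(metric[2m])) as |last - first| over the window.

    samples: gauge values scraped within the window, oldest first.
    """
    if len(samples) < 2:
        return None  # delta() yields no result with fewer than two samples
    return abs(samples[-1] - samples[0])


THRESHOLD = 100  # default threshold of ContrailXMPPSessionsChangesTooHigh

# Hypothetical session counts scraped within one two-minute window.
steady_drop = [400, 350, 290]  # 110 sessions lost and not restored
full_flap = [400, 0, 400]      # all sessions dropped, then reconnected

for name, window in (("steady_drop", steady_drop), ("full_flap", full_flap)):
    value = abs_delta(window)
    print(name, value, "alert raises:", value >= THRESHOLD)
```

Under this simplification, sessions that drop and fully recover within the same window produce a difference of 0, so the expression reflects the net change over the time frame rather than the number of individual state transitions.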
ContrailVrouterXMPPSessionsZero
Severity |
Warning |
Summary |
There are no OpenContrail vRouter XMPP sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
min(contrail_vrouter_xmpp) by (host) == 0 |
Description |
Raises when a node has no OpenContrail vRouter XMPP sessions. The
host label in the raised alert contains the host name of the
affected node. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailVrouterXMPPSessionsTooHigh
Severity |
Warning |
Summary |
There are more than 10 open OpenContrail vRouter XMPP sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
min(contrail_vrouter_xmpp) by (host) >= 10 |
Description |
Raises when the number of OpenContrail vRouter XMPP sessions reaches 10
on one node. The host label in the raised alert contains the host
name of the affected node.
Warning
For production environments, configure the alert after
deployment.
|
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > Infrastructure > Control Nodes and
select the affected node to inspect the analytics data of the
OpenContrail controller nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
For example, to change the threshold to 20 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailVrouterXMPPSessionsTooHigh:
              if: >-
                min(contrail_vrouter_xmpp) by (host) >= 20
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
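Several raise conditions in this section have the form min(<metric>) by (host) >= <threshold>. The following Python sketch shows how such an expression groups series by the host label and compares the lowest value in each group with the threshold. The helper names, the host names, and the extra instance label are assumptions made for illustration.

```python
from collections import defaultdict


def min_by_host(samples):
    """Mirror min(metric) by (host): the lowest value among each host's series.

    samples: iterable of (labels, value) pairs, where labels is a dict
    that contains a "host" key.
    """
    grouped = defaultdict(list)
    for labels, value in samples:
        grouped[labels["host"]].append(value)
    return {host: min(values) for host, values in grouped.items()}


def firing_hosts(samples, threshold=10):
    """Return the hosts for which min(...) by (host) >= threshold holds."""
    return sorted(h for h, v in min_by_host(samples).items() if v >= threshold)


# Hypothetical samples of contrail_vrouter_xmpp.
samples = [
    ({"host": "cmp001", "instance": "a"}, 12),
    ({"host": "cmp001", "instance": "b"}, 9),
    ({"host": "cmp002", "instance": "a"}, 11),
]

print(firing_hosts(samples))                # default threshold of 10
print(firing_hosts(samples, threshold=20))  # threshold of 20 from the tuning example
```

Because the aggregation takes the minimum, a host is reported only when every series that carries its host label is at or above the threshold.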
ContrailVrouterXMPPSessionsChangesTooHigh
Severity |
Warning |
Summary |
The OpenContrail vRouter XMPP sessions on the {{ $labels.host }} node
have changed more than 5 times. |
Raise condition |
abs(delta(contrail_vrouter_xmpp[2m])) >= 5 |
Description |
Raises when the number of OpenContrail vRouter XMPP session changes
reaches 5 on one node, calculated as an absolute difference of the first
and last points in a two-minute time frame. The host label in the
raised alert contains the host name of the affected node.
Warning
For production environments, configure the alert after
deployment.
|
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > DNS Nodes and select the affected
node to inspect the analytics data of the OpenContrail controller
nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
For example, to change the threshold to 10 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailVrouterXMPPSessionsChangesTooHigh:
              if: >-
                abs(delta(contrail_vrouter_xmpp[2m])) >= 10
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
ContrailVrouterDNSXMPPSessionsZero
Severity |
Warning |
Summary |
There are no OpenContrail vRouter DNS-XMPP sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
min(contrail_vrouter_dns_xmpp) by (host) == 0 |
Description |
Raises when one node has no OpenContrail DNS-XMPP sessions. The host
label in the raised alert contains the host name of the affected node. |
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > DNS Nodes and select the affected
node to inspect the analytics data of the OpenContrail controller
nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
Not required |
ContrailVrouterDNSXMPPSessionsTooHigh
Severity |
Warning |
Summary |
More than 10 OpenContrail vRouter DNS-XMPP sessions are open on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
min(contrail_vrouter_dns_xmpp) by (host) >= 10 |
Description |
Raises when the number of OpenContrail DNS-XMPP sessions reaches 10 on
one node. The host label in the raised alert contains the host name
of the affected node.
Warning
For production environments, configure the alert after
deployment.
|
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > DNS Nodes and select the affected
node to inspect the analytics data of the OpenContrail controller
nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
For example, to change the threshold to 20 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailVrouterDNSXMPPSessionsTooHigh:
              if: >-
                min(contrail_vrouter_dns_xmpp) by (host) >= 20
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
ContrailVrouterDNSXMPPSessionsChangesTooHigh
Severity |
Warning |
Summary |
The OpenContrail vRouter DNS-XMPP sessions on the {{ $labels.host }}
node have changed more than 5 times. |
Raise condition |
abs(delta(contrail_vrouter_dns_xmpp[2m])) >= 5 |
Description |
Raises when the number of OpenContrail DNS-XMPP session changes reaches
5 on one node, calculated as an absolute difference of the first and
last points in a two-minute time frame.
Warning
For production environments, configure the alert after
deployment.
|
Troubleshooting |
- Log in to the OpenContrail web UI using the credentials from
/etc/contrail/contrail-webui-userauth.js on the network nodes.
- Navigate to Monitor > DNS Nodes and select the affected
node to inspect the analytics data of the OpenContrail controller
nodes.
- In Introspect, inspect the introspection data filtered
by request type. Select the
xmpp_server module.
- Verify the BGP routers configuration in
Configure > Infrastructure > BGP Routers.
|
Tuning |
For example, to change the threshold to 10 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailVrouterDNSXMPPSessionsChangesTooHigh:
              if: >-
                abs(delta(contrail_vrouter_dns_xmpp[2m])) >= 10
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
ContrailVrouterLLSSessionsTooHigh
Severity |
Warning |
Summary |
There are more than 10 open OpenContrail vRouter LLS sessions on the
{{ $labels.host }} node for 2 minutes. |
Raise condition |
min(contrail_vrouter_lls) by (host) >= 10 |
Description |
Raises when the number of OpenContrail vRouter LocalLinkService
sessions reaches 10 on one node.
Warning
For production environments, configure the alert after
deployment.
|
Tuning |
For example, to change the threshold to 20 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailVrouterLLSSessionsTooHigh:
              if: >-
                min(contrail_vrouter_lls) by (host) >= 20
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
ContrailVrouterLLSSessionsChangesTooHigh
Severity |
Warning |
Summary |
The OpenContrail vRouter LLS sessions on the {{ $labels.host }} node
have changed more than 5 times. |
Raise condition |
abs(delta(contrail_vrouter_lls[2m])) >= 5 |
Description |
Raises when the number of OpenContrail vRouter LLS session changes
reaches 5 on one node, calculated as an absolute difference of the first
and last points in a two-minute time frame.
Warning
For production environments, configure the alert after
deployment.
|
Tuning |
For example, to change the threshold to 10 sessions:
- On the cluster level of the Reclass model, create a common file for
  all alert customizations. Skip this step to use an existing defined
  file.
  - Create a file for alert customizations:
      touch cluster/<cluster_name>/stacklight/custom/alerts.yml
  - Define the new file in
    cluster/<cluster_name>/stacklight/server.yml:
      classes:
      - cluster.<cluster_name>.stacklight.custom.alerts
      ...
- In the defined alert customizations file, modify the alert threshold
  by overriding the if parameter:
    parameters:
      prometheus:
        server:
          alert:
            ContrailVrouterLLSSessionsChangesTooHigh:
              if: >-
                abs(delta(contrail_vrouter_lls[2m])) >= 10
- From the Salt Master node, apply the changes:
    salt 'I@prometheus:server' state.sls prometheus.server
- Verify the updated alert definition in the Prometheus web UI.
|
ContrailGlobalVrouterConfigCheckDisabled
Available since 2019.2.4
Severity |
Critical |
Summary |
The OpenContrail global vRouter configuration check is disabled. |
Raise condition |
absent(contrail_global_vrouter_config_exit_code) == 1 |
Description |
Raises when Prometheus has no metric with the
contrail_global_vrouter_config_exit_code name. |
Troubleshooting |
Inspect the Telegraf logs on the ntw nodes. |
Tuning |
Not required |
ContrailGlobalVrouterConfigCheckFailed
Available since 2019.2.4
Severity |
Critical |
Summary |
The OpenContrail global vRouter configuration check failed on
the {{ $labels.host }} node. |
Raise condition |
contrail_global_vrouter_config_exit_code != 0 |
Description |
Raises when the OpenContrail Virtual Network Controller (VNC) API
returns 0 or more than 1 global-vrouter-configs. |
Troubleshooting |
Inspect the output of the contrail-status command on any
ntw node. |
Tuning |
Not required |