Nova resources

This section describes the alerts for Nova resource consumption.

Warning

The following alerts were removed in the 2019.2.4 maintenance update. For existing MCP deployments, disable these alerts as described in Manage alerts.


NovaHypervisorVCPUsFullMinor

Removed since the 2019.2.4 maintenance update

Severity

Minor

Summary

{{ $value }} VCPUs on the {{ $labels.hostname }} node (>= {{ cpu_minor_threshold * 100 }}%) are used.

Raise condition

label_replace(system_load15, "hostname", "$1", "host", "(.*)") > on (hostname) openstack_nova_vcpus * 0.85

Description

Raises when the 15-minute load average of a hypervisor node exceeds 85% of its available VCPUs, according to the VCPU data from Nova API and the load average from /proc/loadavg on that node. For details, see Hypervisors. The hostname label in the raised alert contains the name of the affected node.
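The raise condition matches two series by node name: label_replace copies the host label of system_load15 into a hostname label, and on (hostname) pairs each load sample with the VCPU count of the same node. The following Python sketch mimics that matching on plain dictionaries. It is an illustration only, not the StackLight implementation; the function name and the sample node names and values are hypothetical.

```python
def vcpu_alerts(load15_by_host, vcpus_by_hostname, threshold=0.85):
    """Return {hostname: load15} for nodes whose load exceeds threshold * VCPUs."""
    firing = {}
    for host, load in load15_by_host.items():
        # Equivalent of matching on (hostname): look up the VCPU count
        # reported for the same node name.
        vcpus = vcpus_by_hostname.get(host)
        if vcpus is None:
            # No VCPU series for this node (for example, a non-compute node):
            # the comparison yields no result, so the alert cannot fire.
            continue
        if load > vcpus * threshold:
            firing[host] = load
    return firing


# Hypothetical sample data: 15-minute load per node and VCPUs per hypervisor.
sample_load = {"cmp001": 30.0, "cmp002": 12.5, "ctl01": 4.0}
sample_vcpus = {"cmp001": 32, "cmp002": 32}

print(vcpu_alerts(sample_load, sample_vcpus))  # {'cmp001': 30.0}
```

With the default multiplier, cmp001 is reported because 30.0 is greater than 32 * 0.85 = 27.2. Passing threshold=0.95 reproduces the Major variant of the same check.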

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaHypervisorVCPUsFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }} VCPUs on the {{ $labels.hostname }} node (>= {{ cpu_major_threshold * 100 }}%) are used.

Raise condition

label_replace(system_load15, "hostname", "$1", "host", "(.*)") > on (hostname) openstack_nova_vcpus * 0.95

Description

Raises when the 15-minute load average of a hypervisor node exceeds 95% of its available VCPUs, according to the VCPU data from Nova API and the load average from /proc/loadavg on that node. For details, see Hypervisors. The hostname label in the raised alert contains the name of the affected node.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaHypervisorMemoryFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }}MB of RAM on the {{ $labels.hostname }} node (>= {{ ram_major_threshold * 100 }}%) is used.

Raise condition

openstack_nova_used_ram > openstack_nova_ram * 0.85

Description

Raises when the hypervisor allocates more than 85% of the available RAM, according to the data from Nova API. For details, see Hypervisors. The hostname label in the raised alert contains the name of the affected node.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaHypervisorMemoryFullCritical

Removed since the 2019.2.4 maintenance update

Severity

Critical

Summary

{{ $value }}MB of RAM on the {{ $labels.hostname }} node (>= {{ ram_critical_threshold * 100 }}%) is used.

Raise condition

openstack_nova_used_ram > openstack_nova_ram * 0.95

Description

Raises when the hypervisor allocates more than 95% of the available RAM, according to the data from Nova API. For details, see Hypervisors. The hostname label in the raised alert contains the name of the affected node.
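The Major and Critical memory alerts use the same comparison and differ only in the multiplier (0.85 and 0.95). As a rough illustration, the Python sketch below classifies one hypervisor against both multipliers. The function name and the sample values are hypothetical, and the sketch reports only the highest matching severity, whereas both alert rules are evaluated independently.

```python
def ram_severity(used_mb, total_mb, major=0.85, critical=0.95):
    """Return the highest severity whose condition used > total * multiplier holds."""
    if used_mb > total_mb * critical:
        return "critical"
    if used_mb > total_mb * major:
        return "major"
    return None


# Hypothetical hypervisor with 100000 MB of RAM.
print(ram_severity(80000, 100000))  # None
print(ram_severity(90000, 100000))  # major
print(ram_severity(96000, 100000))  # critical
```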

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaHypervisorDiskFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }}GB of disk space on the {{ $labels.hostname }} node (>= {{ disk_major_threshold * 100 }}%) is used.

Raise condition

openstack_nova_used_disk > openstack_nova_disk * 0.85

Description

Raises when the hypervisor allocates more than 85% of the available disk space, according to the data from Nova API. For details, see Hypervisors. The hostname label in the raised alert contains the name of the affected node.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaHypervisorDiskFullCritical

Removed since the 2019.2.4 maintenance update

Severity

Critical

Summary

{{ $value }}GB of disk space on the {{ $labels.hostname }} node (>= {{ disk_critical_threshold * 100 }}%) is used.

Raise condition

openstack_nova_used_disk > openstack_nova_disk * 0.95

Description

Raises when the hypervisor allocates more than 95% of the available disk space, according to the data from Nova API. For details, see Hypervisors. The hostname label in the raised alert contains the name of the affected node.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaAggregateMemoryFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }}MB of RAM on the {{ $labels.aggregate }} aggregate is used (at least {{ ram_major_threshold * 100 }}%).

Raise condition

openstack_nova_aggregate_used_ram > openstack_nova_aggregate_ram * 0.85

Description

Raises when the RAM allocation over all hypervisors within a host aggregate is more than 85% of the total available RAM, according to the data from Nova API. For details, see Hypervisors and Host aggregates. The aggregate label in the raised alert contains the name of the affected host aggregate.
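To make the aggregate-level comparison concrete, the Python sketch below sums used and total RAM over the members of each host aggregate and applies the 0.85 multiplier. It assumes that the aggregate metrics are plain sums over member hypervisors, which the description suggests but does not spell out. All names and values are hypothetical.

```python
def aggregate_ram_alerts(aggregates, hypervisors, threshold=0.85):
    """Return {aggregate: used_mb} for aggregates above threshold * total RAM."""
    firing = {}
    for name, members in aggregates.items():
        # Assumption: aggregate usage and capacity are sums over the members.
        used = sum(hypervisors[h]["used_ram_mb"] for h in members)
        total = sum(hypervisors[h]["ram_mb"] for h in members)
        if used > total * threshold:
            firing[name] = used
    return firing


# Hypothetical hypervisors and host aggregates.
sample_hypervisors = {
    "cmp001": {"used_ram_mb": 60000, "ram_mb": 64000},
    "cmp002": {"used_ram_mb": 52000, "ram_mb": 64000},
    "cmp003": {"used_ram_mb": 10000, "ram_mb": 64000},
}
sample_aggregates = {
    "fast": ["cmp001", "cmp002"],
    "general": ["cmp002", "cmp003"],
}

print(aggregate_ram_alerts(sample_aggregates, sample_hypervisors))  # {'fast': 112000}
```

A hypervisor can belong to several aggregates, so a single heavily loaded node may push more than one aggregate over the threshold.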

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the list of host aggregate members using the openstack aggregate list and openstack aggregate show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaAggregateMemoryFullCritical

Removed since the 2019.2.4 maintenance update

Severity

Critical

Summary

{{ $value }}MB of RAM on the {{ $labels.aggregate }} aggregate (>= {{ ram_critical_threshold * 100 }}%) is used.

Raise condition

openstack_nova_aggregate_used_ram > openstack_nova_aggregate_ram * 0.95

Description

Raises when the RAM allocation over all hypervisors within a host aggregate is more than 95% of the total available RAM, according to the data from Nova API. For details, see Hypervisors and Host aggregates. The aggregate label in the raised alert contains the name of the affected host aggregate.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the list of host aggregate members using the openstack aggregate list and openstack aggregate show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaAggregateDiskFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }}GB of disk space on the {{ $labels.aggregate }} aggregate (>= {{ disk_major_threshold * 100 }}%) is used.

Raise condition

openstack_nova_aggregate_used_disk > openstack_nova_aggregate_disk * 0.85

Description

Raises when the disk space allocation over all hypervisors within a host aggregate is more than 85% of the total available disk space, according to the data from Nova API. For details, see Hypervisors and Host aggregates. The aggregate label in the raised alert contains the name of the affected host aggregate.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the list of host aggregate members using the openstack aggregate list and openstack aggregate show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaAggregateDiskFullCritical

Removed since the 2019.2.4 maintenance update

Severity

Critical

Summary

{{ $value }}GB of disk space on the {{ $labels.aggregate }} aggregate (>= {{ disk_critical_threshold * 100 }}%) is used.

Raise condition

openstack_nova_aggregate_used_disk > openstack_nova_aggregate_disk * 0.95

Description

Raises when the disk space allocation over all hypervisors within a host aggregate is more than 95% of the total available disk space, according to the data from Nova API. For details, see Hypervisors and Host aggregates. The aggregate label in the raised alert contains the name of the affected host aggregate.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the list of host aggregate members using the openstack aggregate list and openstack aggregate show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaTotalVCPUsFullMinor

Removed since the 2019.2.4 maintenance update

Severity

Minor

Summary

{{ $value }} VCPUs in the cloud (>= {{ cpu_minor_threshold * 100 }}%) are used.

Raise condition

sum(label_replace(system_load15, "hostname", "$1", "host", "(.*)") and on (hostname) openstack_nova_vcpus) > max(sum(openstack_nova_vcpus) by (instance)) * 0.85

Description

Raises when the summed 15-minute load average over all hypervisors exceeds 85% of the total available VCPUs, according to the data from Nova API and /proc/loadavg on each node. For details, see Hypervisors.
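The cloud-wide condition has two parts. On the left, and on (hostname) keeps only the load samples of nodes that also report a VCPU count, and sum adds them up. On the right, VCPU counts are summed per instance label and the maximum of those sums is taken, presumably so that hypervisors reported by more than one scrape target are not counted twice. The Python sketch below follows that reading; it is an interpretation rather than the StackLight implementation, and all names and values are hypothetical.

```python
def total_vcpu_alert(load15_by_host, vcpu_samples, threshold=0.85):
    """vcpu_samples is a list of (instance, hostname, vcpus) tuples."""
    if not vcpu_samples:
        return False  # no VCPU data: the expression yields no result

    # Left side: sum the load of nodes that have a matching VCPU series.
    hypervisors = {hostname for _, hostname, _ in vcpu_samples}
    total_load = sum(
        load for host, load in load15_by_host.items() if host in hypervisors
    )

    # Right side: sum VCPUs per instance label, then take the maximum.
    per_instance = {}
    for instance, _, vcpus in vcpu_samples:
        per_instance[instance] = per_instance.get(instance, 0) + vcpus
    capacity = max(per_instance.values())

    return total_load > capacity * threshold


# Hypothetical data: two compute nodes reported by two scrape targets.
sample_load = {"cmp001": 30.0, "cmp002": 28.0, "ctl01": 4.0}
sample_vcpus = [
    ("agent-1", "cmp001", 32),
    ("agent-1", "cmp002", 32),
    ("agent-2", "cmp001", 32),
    ("agent-2", "cmp002", 32),
]

print(total_vcpu_alert(sample_load, sample_vcpus))  # True
```

Here the summed load of the two hypervisors is 58.0 and the capacity is 64 VCPUs, so the check passes the 0.85 multiplier (54.4) but not the 0.95 multiplier (60.8).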

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaTotalVCPUsFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }} VCPUs in the cloud (>= {{ cpu_major_threshold * 100 }}%) are used.

Raise condition

sum(label_replace(system_load15, "hostname", "$1", "host", "(.*)") and on (hostname) openstack_nova_vcpus) > max(sum(openstack_nova_vcpus) by (instance)) * 0.95

Description

Raises when the summed 15-minute load average over all hypervisors exceeds 95% of the total available VCPUs, according to the data from Nova API and /proc/loadavg on each node. For details, see Hypervisors.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaTotalMemoryFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }}MB of RAM in the cloud (>= {{ ram_major_threshold * 100 }}%) is used.

Raise condition

openstack_nova_total_used_ram > openstack_nova_total_ram * 0.85

Description

Raises when the RAM allocation over all hypervisors is more than 85% of the total available RAM, according to the data from Nova API. For details, see Hypervisors.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaTotalMemoryFullCritical

Removed since the 2019.2.4 maintenance update

Severity

Critical

Summary

{{ $value }}MB of RAM in the cloud (>= {{ ram_critical_threshold * 100 }}%) is used.

Raise condition

openstack_nova_total_used_ram > openstack_nova_total_ram * 0.95

Description

Raises when the RAM allocation over all hypervisors is more than 95% of the total available RAM, according to the data from Nova API. For details, see Hypervisors.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaTotalDiskFullMajor

Removed since the 2019.2.4 maintenance update

Severity

Major

Summary

{{ $value }}GB of disk space in the cloud (>= {{ disk_major_threshold * 100 }}%) is used.

Raise condition

openstack_nova_total_used_disk > openstack_nova_total_disk * 0.85

Description

Raises when the disk space allocation over all hypervisors is more than 85% of the total disk space, according to the data from Nova API. For details, see Hypervisors.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.

NovaTotalDiskFullCritical

Removed since the 2019.2.4 maintenance update

Severity

Critical

Summary

{{ $value }}GB of disk space in the cloud (>= {{ disk_critical_threshold * 100 }}%) is used.

Raise condition

openstack_nova_total_used_disk > openstack_nova_total_disk * 0.95

Description

Raises when the disk space allocation over all hypervisors is more than 95% of the total disk space, according to the data from Nova API. For details, see Hypervisors.

Troubleshooting

  • Verify the hypervisor capacity using the openstack hypervisor list or openstack hypervisor show commands.

  • Verify the status of the monitoring_remote_agent service by running docker service ls on a mon node.

  • Inspect the monitoring_remote_agent service logs by running docker service logs monitoring_remote_agent on a mon node.

Tuning

Disable the alert as described in Manage alerts.