Enabling introspective instance monitor
Available since MOSK 25.1 TechPreview
The introspective instance monitor in the Instance High Availability service improves the reliability of the cloud environment by monitoring virtual machines for failure events, including operating system crashes, kernel panics, and unresponsive states. When it detects such an event, the monitor initiates automated recovery actions, such as rebooting the affected instance. This reduces downtime and helps maintain high availability of the OpenStack environment.
As a cloud operator, you can enable and configure instance introspection through the spec:features:masakari:monitors:introspective definition in the OpenStackDeployment custom resource. The supported options include:
enabled
(boolean) Enables or disables the introspective instance monitor. Default: false.
guest_monitoring_interval
(integer) Defines the time interval (in seconds) between checks of the guest virtual machine status. Default: 10.
guest_monitoring_timeout
(integer) Sets the timeout (in seconds) for detecting a non-responsive guest virtual machine before marking it as failed. Default: 2.
guest_monitoring_failure_threshold
(integer) Defines the number of consecutive failures required before a notification is sent or a recovery action is initiated. Default: 3.
Example configuration:
spec:
  features:
    masakari:
      monitors:
        introspective:
          enabled: true
          guest_monitoring_interval: 10
          guest_monitoring_timeout: 2
          guest_monitoring_failure_threshold: 3
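As a further illustration, the following sketch relaxes the same options for guests that can be briefly unresponsive under load. The values are illustrative assumptions, not tested recommendations. Assuming the monitor probes once per interval and counts only consecutive failures, a failed guest would be acted upon after roughly guest_monitoring_interval × guest_monitoring_failure_threshold seconds, which is about 150 seconds with these values.

```yaml
spec:
  features:
    masakari:
      monitors:
        introspective:
          enabled: true
          # Illustrative values only: probe less often than the default of 10 seconds
          guest_monitoring_interval: 30
          # Give a busy guest more time to answer than the default of 2 seconds
          guest_monitoring_timeout: 5
          # Require more consecutive failures than the default of 3 before acting
          guest_monitoring_failure_threshold: 5
```

Larger values trade slower failure detection for fewer recoveries triggered by short, harmless stalls.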
The introspective instance monitor requires the QEMU Guest Agent to be installed in the guest virtual machine. The agent enables communication between the host and the guest operating system, which the monitor uses to assess the health of the virtual machine. Without the QEMU Guest Agent, the monitor cannot accurately assess the state of the virtual machine, which may prevent it from initiating the necessary recovery actions. To start monitoring, refer to Configure the introspective instance monitor.
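One possible way to get the agent into a guest is to install it at first boot through cloud-init user data. The following is a minimal sketch, assuming that the guest image supports cloud-init and that its distribution ships a package and a systemd unit named qemu-guest-agent; this document states neither. Depending on the environment, the image may also need the hw_qemu_guest_agent=yes property so that the hypervisor exposes the agent channel to the guest. Treat this as an assumption to verify against the referenced procedure.

```yaml
#cloud-config
# Hypothetical user data supplied when the instance is created
package_update: true
packages:
  # Package name assumed; it may differ on some distributions
  - qemu-guest-agent
runcmd:
  # Start the agent immediately instead of waiting for a reboot
  - [systemctl, start, qemu-guest-agent]
```

Baking the agent into the image avoids the dependency on network access at first boot and may be preferable for production images.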