Use the Instance HA service
The OpenStack Instance HA service (Masakari) provides automated recovery for Nova instances and compute hosts to minimize downtime.
If an instance process crashes or hangs, the Instance HA service detects the failure and automatically restarts the instance on the same compute host.
If an entire physical server fails, the Instance HA service triggers an evacuation of instances from the failed compute host. This process automatically rebuilds the affected instances on a healthy compute host within the same failover segment. Because this is a recovery process rather than a live migration, note the following impact on your data:
Memory state: all data held in the instance RAM is lost.
Disk state:
If the instance uses Cinder volumes or shared storage, such as Ceph, the data is preserved and the instance boots from its last written state.
If the instance uses local ephemeral storage, all data on the disk is lost because the instance is rebuilt from its base image.
To verify that the Instance HA service is active and available in your OpenStack cloud, execute the following command and verify that the response is successful:
openstack catalog show instance-ha
Example of a positive system response:
+-----------+----------------------------------------------------------------------+
| Field     | Value                                                                |
+-----------+----------------------------------------------------------------------+
| endpoints | CustomRegion                                                         |
|           |   public: https://masakari.it.just.works/v1                          |
|           | CustomRegion                                                         |
|           |   internal: http://masakari-api.openstack.svc.cluster.local:15868/v1 |
|           | CustomRegion                                                         |
|           |   admin: http://masakari-api.openstack.svc.cluster.local:15868/v1    |
|           |                                                                      |
| id        | ed359ae64a2847f89c82c38177eb8392                                     |
| name      | masakari                                                             |
| type      | instance-ha                                                          |
+-----------+----------------------------------------------------------------------+
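To run this check from a script, you can rely on the exit status of the command instead of reading the table. The following is a minimal sketch, not an official tool: the function name check_instance_ha is hypothetical, and it assumes that your cloud credentials are already loaded and that the openstack client exits with a non-zero status when the instance-ha service type is missing from the catalog.

```shell
# Hypothetical helper: reports whether the Instance HA service is registered
# in the service catalog. Assumes cloud credentials are already loaded.
check_instance_ha() {
    # Assumption: a non-zero exit status means the service type was not found.
    if openstack catalog show instance-ha >/dev/null 2>&1; then
        echo "Instance HA service is available"
    else
        echo "Instance HA service not found in the catalog" >&2
        return 1
    fi
}
```

For example, call check_instance_ha at the start of a provisioning script and stop if it returns a non-zero status.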
Note
The Instance HA service is primarily managed by cloud administrators. As a non-admin user, you cannot interact with the Masakari API directly to create failover segments or manage hosts.
Enable High Availability for an instance
Create an instance. For example, to create a minimal CirrOS instance:
openstack server create --image Cirros-6.0 --flavor m1.tiny --network DemoNetwork DemoInstance01
Enable High Availability (HA) for the instance:
Note
Although the Instance HA service may be enabled globally in your cloud, it typically operates on an opt-in basis to prioritize critical workloads.
By default, the service only monitors and recovers instances marked with a specific metadata key (typically HA_Enabled). Depending on how the ha_enabled_instance_metadata_key setting is configured in your cloud, you may need different keys for host failures versus instance failures.
To ensure your instance is protected by the HA engine, apply the metadata property. For example, using the default HA_Enabled metadata key:
openstack server set --property HA_Enabled=True DemoInstance01
Confirm that the metadata property has been successfully applied to the instance:
openstack server show DemoInstance01 -c properties
The output should list HA_Enabled='True' under the properties field.
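The set and confirm steps can be combined into one reusable function. This is a sketch under stated assumptions rather than a supported tool: the function name enable_instance_ha is hypothetical, the key defaults to HA_Enabled as described above, and the match pattern assumes that the properties field is rendered either as a JSON object or as a key='value' string, depending on the client version.

```shell
# Hypothetical helper: sets the HA metadata key on a server and confirms it.
# Usage: enable_instance_ha SERVER [KEY]
enable_instance_ha() {
    server=$1
    key=${2:-HA_Enabled}   # default key name; your cloud may use another one
    openstack server set --property "${key}=True" "$server" || return 1
    # Read the properties back and look for the key set to True.
    # Assumption: output contains "KEY": "True" or KEY='True'.
    if openstack server show "$server" -c properties -f json \
        | grep -Eq "${key}[\"']?[:=] *[\"']True[\"']"; then
        echo "${key}=True is set on ${server}"
    else
        echo "${key}=True was not found on ${server}" >&2
        return 1
    fi
}
```

For example, enable_instance_ha DemoInstance01 applies the default key. Pass a second argument if your cloud administrator has configured a different key name.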
Note
If HA_Enabled=True does not trigger a recovery, contact your
cloud administrator to verify if a custom metadata key name has been
configured in the Instance HA service settings. For details, refer to
Configure high availability with Masakari for cloud administrators.
See also
Verify the Instance HA service (for cloud administrators only)