Resource oversubscription
The Compute service (OpenStack Nova) enables you to spawn instances that collectively consume more resources than are physically available on a compute node. This capability is called resource oversubscription, also known as overcommit, and is controlled through allocation ratios.
Resources available for oversubscription on a compute node include the number of CPUs, amount of RAM, and amount of available disk space. When making a scheduling decision, the scheduler of the Compute service takes into account the actual amount of resources multiplied by the allocation ratio. Thereby, the service allocates resources based on the assumption that not all instances will be using their full allocation of resources at the same time.
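The scheduling arithmetic described above can be illustrated with a minimal Python sketch. The function name and the sample node sizes are hypothetical. The `reserved` term is an assumption borrowed from the Placement inventory model, which subtracts reserved resources before applying the ratio; it is not mentioned in this section.

```python
def schedulable_capacity(total: float, allocation_ratio: float, reserved: float = 0) -> int:
    """Return how much of a resource the scheduler may hand out on one node.

    Assumed formula: (total - reserved) * allocation_ratio, truncated to an integer.
    """
    if allocation_ratio <= 0:
        raise ValueError("allocation_ratio must be positive")
    return int((total - reserved) * allocation_ratio)


# Hypothetical compute node: 32 physical CPU threads and 256 GB of RAM.
print(schedulable_capacity(32, 8.0))   # CPU ratio 8.0 -> 256 schedulable vCPUs
print(schedulable_capacity(256, 1.0))  # RAM ratio 1.0 -> 256 GB, no oversubscription
```

With a ratio of 1.0 the scheduler never places more than the physical amount, which is why a RAM ratio of 1.0 means no RAM oversubscription.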
Oversubscription enables you to increase the density of workloads and compute resource utilization and, thus, achieve a better return on investment (ROI) on compute hardware. Oversubscription can also help you avoid creating too many fine-grained flavors, a problem commonly known as flavor explosion.
Configuring initial resource oversubscription
There are two ways to control the oversubscription values for compute nodes:
The legacy approach uses the {cpu,disk,ram}_allocation_ratio configuration options offered by the Compute service. A drawback of this method is that the Compute service must be restarted to apply the new configuration, which risks interrupting cloud user operations, for example, causing instance build failures.
The modern and recommended approach, adopted in MOSK 23.1, uses the initial_{cpu,disk,ram}_allocation_ratio configuration options, which apply only during the initial provisioning of a compute node, either during the initial deployment of the cluster or when new compute nodes are added later. Any further changes can be made dynamically through the OpenStack Placement service API without restarting the service.
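As a rough illustration of the dynamic path, the sketch below prepares the request body for changing one allocation ratio. It assumes the inventory document shape used by the Placement API for `GET` and `PUT` on `/resource_providers/{uuid}/inventories`: a `resource_provider_generation` field plus an `inventories` map keyed by resource class. The helper name and all sample values are hypothetical, and sending the request (endpoint, authentication, microversion) is deliberately left out.

```python
import copy


def amend_allocation_ratio(inventories_doc: dict, resource_class: str, new_ratio: float) -> dict:
    """Build a PUT body with one allocation ratio changed.

    The input document is not modified. The generation is carried over unchanged
    so that the service can detect concurrent updates.
    """
    if new_ratio <= 0:
        raise ValueError("allocation ratio must be positive")
    if resource_class not in inventories_doc["inventories"]:
        raise KeyError(f"no inventory for resource class {resource_class!r}")
    body = copy.deepcopy(inventories_doc)
    body["inventories"][resource_class]["allocation_ratio"] = float(new_ratio)
    return body


# Hypothetical inventory document, as it might be returned by a GET request.
current = {
    "resource_provider_generation": 7,
    "inventories": {
        "VCPU": {"total": 32, "reserved": 0, "min_unit": 1,
                 "max_unit": 32, "step_size": 1, "allocation_ratio": 8.0},
        "MEMORY_MB": {"total": 262144, "reserved": 512, "min_unit": 1,
                      "max_unit": 262144, "step_size": 1, "allocation_ratio": 1.0},
    },
}

updated = amend_allocation_ratio(current, "VCPU", 4.0)
print(updated["inventories"]["VCPU"]["allocation_ratio"])
```

Only the selected resource class changes. The rest of the inventory is sent back as is because, under the assumed API semantics, the PUT replaces the whole inventory set of the resource provider.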
There is no definitive method for selecting optimal oversubscription values. As a cloud operator, you should continuously monitor your workloads, ideally have a comprehensive understanding of their nature, and experimentally determine the maximum values that do not impact performance. This approach ensures maximum workload density and cloud resource utilization.
To configure the initial compute resource oversubscription in MOSK, specify the spec:features:nova:allocation_ratios parameter in the OpenStackDeployment custom resource as explained in the table below.
Parameter |
|
|---|---|
Configuration |
Configure initial oversubscription of CPU, disk space, and RAM resources on compute nodes. By default, the following values are applied:
Warning: Mirantis strongly advises against oversubscribing RAM by any amount. See Preventing resource overconsumption for details.
Changing the resource oversubscription configuration through the
|
Usage |
Configuration example:
kind: OpenStackDeployment
spec:
  features:
    nova:
      allocation_ratios:
        cpu: 8
        disk: 1.6
        ram: 1.0
Configuration example of setting different oversubscription values for specific nodes:
spec:
  nodes:
    compute-type::hi-perf:
      features:
        nova:
          allocation_ratios:
            cpu: 2.0
            disk: 1.0
In the example configuration above, the compute nodes labeled with
|
Preventing resource overconsumption
When using oversubscription, it is important to conduct thorough cloud management and monitoring to avoid system overloading and performance degradation. If many or all instances on a compute node start using all allocated resources at once and, thereby, overconsume physical resources, failure scenarios depend on the resource being exhausted.
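One simple monitoring signal is the worst-case exposure of a node, that is, how far the physical resource would be exceeded if every instance used its full allocation at once. The following Python sketch is an illustration only; the function name and the instance sizes are hypothetical.

```python
from typing import Iterable


def worst_case_exposure(physical: float, allocated: Iterable[float]) -> float:
    """Return total allocated resources divided by the physical amount.

    A value above 1.0 means that the instances can overconsume the node
    if they all use their full allocation at the same time.
    """
    if physical <= 0:
        raise ValueError("physical amount must be positive")
    return sum(allocated) / physical


# Hypothetical node with 32 CPU threads hosting instances of 8, 8, 16, and 32 vCPUs.
print(worst_case_exposure(32, [8, 8, 16, 32]))  # 2.0
# The same node with 64 GB of RAM hosting instances of 16, 16, and 32 GB.
print(worst_case_exposure(64, [16, 16, 32]))    # 1.0
```

An exposure at or below 1.0 rules out overconsumption of that resource, while higher values mark the nodes where actual usage deserves closer monitoring.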
Some workload types are not suitable for running in an oversubscribed environment, especially those with high-performance, latency-sensitive, or real-time requirements. Such workloads are better suited for compute nodes with dedicated CPUs, which ensure that only processes of a single instance run on each CPU core.