OpenStack known issues¶
This section lists the OpenStack known issues with workarounds for the Mirantis OpenStack for Kubernetes release 22.5.
[30450] High CPU load of MariaDB¶
One of the most common symptoms of the high CPU load of MariaDB is slow API responses. To troubleshoot the issue, verify the CPU consumption of MariaDB using the General > Kubernetes Pods Grafana dashboard or through the CLI as follows:
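As a quick preliminary check from the CLI, you can also view the current resource usage of the MariaDB pods with kubectl top. This is only a sketch: it assumes the metrics-server is available in the cluster and that the MariaDB pods carry the application=mariadb label; adjust the selector to your deployment.
# Current CPU and memory usage of the MariaDB pods
# (the label selector is an assumption; adjust it to your deployment)
kubectl -n openstack top pod -l application=mariadb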
Obtain the resource consumption details for the MariaDB server:
kubectl -n openstack exec -it mariadb-server-0 -- bash
mysql@mariadb-server-0:/$ top
Example of system response:
top - 19:16:29 up 278 days, 20:56,  0 users,  load average: 16.62, 16.54, 16.39
Tasks:   8 total,   1 running,   7 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.3 us,  2.8 sy,  0.0 ni, 89.6 id,  0.0 wa,  0.0 hi,  1.3 si,  0.0 st
MiB Mem : 515709.3 total, 375731.7 free, 111383.8 used,  28593.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used. 399307.2 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  275 mysql     20   0   76.3g  18.8g   1.0g S 786.4   3.7  22656,15 mysqld
Determine which exact query is in progress. This is usually the one in the Sending data state:
mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show processlist;" | grep -v Sleep
Example of system response:
Id        User                Host                  db         Command  Time  State         Info                                                                                                   Progress
60067757  placementgF9D11u29  10.233.195.246:40746  placement  Query    10    Sending data  SELECT a.id, a.resource_class_id, a.used, a.updated_at, a.created_at, c.id AS consumer_id, c.generat  0.000
Obtain more information about the query and the tables it uses. Replace <QUERY> with the full query text obtained in the previous step:
mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "ANALYZE FORMAT=JSON <QUERY>;"
Example of system response:
"table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": [ "uniq_consumers0uuid", "consumers_project_id_user_id_uuid_idx", "consumers_project_id_uuid_idx" ], "key": "uniq_consumers0uuid", "key_length": "110", "used_key_parts": ["uuid"], "ref": ["placement.a.consumer_id"], "r_loops": 838200, "rows": 1, "r_rows": 1, "r_table_time_ms": 62602.5453, "r_other_time_ms": 369.5835268, "filtered": 100, "r_filtered": 0.005249344, "attached_condition": "c.user_id = u.`id` and c.project_id = p.`id`" }
If you observe a huge difference between the filtered and r_filtered columns for the query, as in the example of system response above, analyze the performance of the tables by running the ANALYZE TABLE <TABLE_NAME>; and ANALYZE TABLE <TABLE_NAME> PERSISTENT FOR ALL; commands:
mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD
MariaDB > ANALYZE TABLE placement.allocations;
MariaDB > ANALYZE TABLE placement.allocations PERSISTENT FOR ALL;
MariaDB > ANALYZE TABLE placement.consumers;
MariaDB > ANALYZE TABLE placement.consumers PERSISTENT FOR ALL;
MariaDB > ANALYZE TABLE placement.users;
MariaDB > ANALYZE TABLE placement.users PERSISTENT FOR ALL;
MariaDB > ANALYZE TABLE placement.projects;
MariaDB > ANALYZE TABLE placement.projects PERSISTENT FOR ALL;
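After the tables are analyzed, verify that the load has decreased. The commands below are a sketch of such a re-check: they display the current CPU usage of the MariaDB pod (requires the metrics-server) and repeat the processlist inspection to confirm the query no longer lingers in the Sending data state.
# Re-check the CPU usage of the MariaDB server pod (requires metrics-server)
kubectl -n openstack top pod mariadb-server-0
# Confirm that the query no longer lingers in the Sending data state
kubectl -n openstack exec -it mariadb-server-0 -- \
  bash -c 'mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show processlist;"' | grep -v Sleep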
[29501] Cinder periodic database cleanup resets the state of volumes¶
Due to an issue in the database auto-cleanup job for the Block Storage service
(OpenStack Cinder), the state of volumes that are attached to instances gets
reset every time the job runs. The instances can still write and read block
storage data; however, volume objects appear in the OpenStack API as
not attached, which causes confusion.
The workaround is to temporarily disable the job until the issue is fixed and execute the script below to restore the affected instances.
To disable the job, update the OpenStackDeployment
custom resource as
follows:
kind: OpenStackDeployment
spec:
  features:
    database:
      cleanup:
        cinder:
          enabled: false
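As a sketch of applying and verifying the change, you can edit the OpenStackDeployment object directly and confirm that the Cinder database cleanup CronJob is no longer scheduled. The object name osh-dev, the osdpl short name, and the CronJob naming below are assumptions; use the names from your environment.
# Add the snippet above to the OpenStackDeployment object
# (osh-dev is an example name; use the object name from your environment)
kubectl -n openstack edit osdpl osh-dev
# Verify that no Cinder database cleanup CronJob remains scheduled
# (the exact CronJob name may differ)
kubectl -n openstack get cronjob | grep -i cinder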
To restore the affected instances:
Obtain one of the Nova API pods:
nova_api_pod=$(kubectl -n openstack get pod -l application=nova,component=os-api --no-headers | head -n1 | awk '{print $1}')
Download the restore_volume_attachments.py script to your local environment.
Note
The provided script does not fix the Cinder database clean-up job and is only intended to restore the functionality of the affected instances. Therefore, leave the job disabled.
Copy the script to the Nova API pod:
kubectl -n openstack cp restore_volume_attachments.py $nova_api_pod:/tmp/
Run the script in the dry-run mode to only list affected instances and volumes:
kubectl -n openstack exec -ti $nova_api_pod -- python /tmp/restore_volume_attachments.py --dry-run
Run the script to restore the volume attachments:
kubectl -n openstack exec -ti $nova_api_pod -- python /tmp/restore_volume_attachments.py
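To confirm that the attachments have been restored, list the affected volumes through the OpenStack CLI and verify that their status is in-use again. The keystone-client deployment used below is an assumption; any host with the OpenStack client and admin credentials configured works as well.
# Verify that the previously affected volumes are reported as attached (in-use)
kubectl -n openstack exec -it deployment/keystone-client -- \
  openstack volume list --long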
[25124] MPLSoGRE encapsulation has limited throughput¶
Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE) provides limited throughput when sending data between VMs, up to 38 Mbps, as per Mirantis tests.
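To quantify the limitation in your environment and to verify the effect of the workaround afterwards, you can measure VM-to-VM throughput with a generic tool such as iperf3. This is only a sketch: it assumes iperf3 is installed in the guest images and that the security groups allow the test traffic (TCP port 5201 by default).
# On the first VM, start an iperf3 server
iperf3 -s
# On the second VM, run a 30-second throughput test against the first VM
# (replace <SERVER_VM_IP> with the address of the first VM)
iperf3 -c <SERVER_VM_IP> -t 30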
As a workaround, switch the encapsulation type to VXLAN in the
OpenStackDeployment
custom resource:
spec:
  services:
    networking:
      neutron:
        values:
          conf:
            bagpipe_bgp:
              dataplane_driver_ipvpn:
                mpls_over_gre: "False"
                vxlan_encap: "True"
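A sketch of applying the override without editing the object interactively, assuming the OpenStackDeployment object is named osh-dev and that the osdpl short name is registered in your cluster:
# Save the override shown above to a file and merge it into the
# OpenStackDeployment object (osh-dev is an example name)
cat > vxlan-encap.yaml <<'EOF'
spec:
  services:
    networking:
      neutron:
        values:
          conf:
            bagpipe_bgp:
              dataplane_driver_ipvpn:
                mpls_over_gre: "False"
                vxlan_encap: "True"
EOF
kubectl -n openstack patch osdpl osh-dev --type merge --patch-file vxlan-encap.yaml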