OpenStack known issues¶
This section lists the OpenStack known issues with workarounds for the Mirantis OpenStack for Kubernetes release 22.5.
[30450] High CPU load of MariaDB¶
One of the most common symptoms of the high CPU load of MariaDB is slow API responses. To troubleshoot the issue, verify the CPU consumption of MariaDB using the General > Kubernetes Pods Grafana dashboard or through the CLI as follows:
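As a quick preliminary check from the CLI, you can also view the current resource usage of the MariaDB pods with kubectl top. This is only a sketch: it assumes the metrics-server is available in the cluster and that the MariaDB pods carry the application=mariadb label; adjust the selector to your deployment.
# Current CPU and memory usage of the MariaDB pods
# (the label selector is an assumption; adjust it to your deployment)
kubectl -n openstack top pod -l application=mariadb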
Obtain the resource consumption details for the MariaDB server:
kubectl -n openstack exec -it mariadb-server-0 -- bash
mysql@mariadb-server-0:/$ top
Example of system response:
top - 19:16:29 up 278 days, 20:56,  0 users,  load average: 16.62, 16.54, 16.39
Tasks:   8 total,   1 running,   7 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.3 us,  2.8 sy,  0.0 ni, 89.6 id,  0.0 wa,  0.0 hi,  1.3 si,  0.0 st
MiB Mem : 515709.3 total, 375731.7 free, 111383.8 used,  28593.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used. 399307.2 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  275 mysql     20   0   76.3g  18.8g   1.0g S 786.4   3.7  22656,15 mysqld
Determine which exact query is in progress. This is usually the one in the Sending data state:
mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show processlist;" | grep -v Sleep
Example of system response:
Id        User                Host                  db         Command  Time  State         Info                                                                                                   Progress
60067757  placementgF9D11u29  10.233.195.246:40746  placement  Query    10    Sending data  SELECT a.id, a.resource_class_id, a.used, a.updated_at, a.created_at, c.id AS consumer_id, c.generat  0.000
Obtain more information about the query and the tables it uses. Replace <QUERY> with the full query text obtained in the previous step:
mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "ANALYZE FORMAT=JSON <QUERY>;"
Example of system response:
"table": { "table_name": "c", "access_type": "eq_ref", "possible_keys": [ "uniq_consumers0uuid", "consumers_project_id_user_id_uuid_idx", "consumers_project_id_uuid_idx" ], "key": "uniq_consumers0uuid", "key_length": "110", "used_key_parts": ["uuid"], "ref": ["placement.a.consumer_id"], "r_loops": 838200, "rows": 1, "r_rows": 1, "r_table_time_ms": 62602.5453, "r_other_time_ms": 369.5835268, "filtered": 100, "r_filtered": 0.005249344, "attached_condition": "c.user_id = u.`id` and c.project_id = p.`id`" }
If you observe a huge difference between the filtered and r_filtered columns for the query, as in the example of system response above, analyze the performance of the tables by running the ANALYZE TABLE <TABLE_NAME>; and ANALYZE TABLE <TABLE_NAME> PERSISTENT FOR ALL; commands:
mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD
MariaDB > ANALYZE TABLE placement.allocations;
MariaDB > ANALYZE TABLE placement.allocations PERSISTENT FOR ALL;
MariaDB > ANALYZE TABLE placement.consumers;
MariaDB > ANALYZE TABLE placement.consumers PERSISTENT FOR ALL;
MariaDB > ANALYZE TABLE placement.users;
MariaDB > ANALYZE TABLE placement.users PERSISTENT FOR ALL;
MariaDB > ANALYZE TABLE placement.projects;
MariaDB > ANALYZE TABLE placement.projects PERSISTENT FOR ALL;
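After the tables are analyzed, verify that the load has decreased. The commands below are a sketch of such a re-check: they display the current CPU usage of the MariaDB pod (requires the metrics-server) and repeat the processlist inspection to confirm the query no longer lingers in the Sending data state.
# Re-check the CPU usage of the MariaDB server pod (requires metrics-server)
kubectl -n openstack top pod mariadb-server-0
# Confirm that the query no longer lingers in the Sending data state
kubectl -n openstack exec -it mariadb-server-0 -- \
  bash -c 'mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show processlist;"' | grep -v Sleep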
[29501] Cinder periodic database cleanup resets the state of volumes¶
Due to an issue in the database auto-cleanup job for the Block Storage service
(OpenStack Cinder), the state of volumes that are attached to instances gets
reset every time the job runs. The instances can still write and read block
storage data; however, volume objects appear in the OpenStack API as
not attached, which causes confusion.
The workaround is to temporarily disable the job until the issue is fixed and execute the script below to restore the affected instances.
To disable the job, update the OpenStackDeployment
custom resource as
follows:
kind: OpenStackDeployment
spec:
  features:
    database:
      cleanup:
        cinder:
          enabled: false
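As a sketch of applying and verifying the change, you can edit the OpenStackDeployment object directly and confirm that the Cinder database cleanup CronJob is no longer scheduled. The object name osh-dev, the osdpl short name, and the CronJob naming below are assumptions; use the names from your environment.
# Add the snippet above to the OpenStackDeployment object
# (osh-dev is an example name; use the object name from your environment)
kubectl -n openstack edit osdpl osh-dev
# Verify that no Cinder database cleanup CronJob remains scheduled
# (the exact CronJob name may differ)
kubectl -n openstack get cronjob | grep -i cinder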
To restore the affected instances:
Obtain one of the Nova API pods:
nova_api_pod=$(kubectl -n openstack get pod -l application=nova,component=os-api --no-headers | head -n1 | awk '{print $1}')
Download the restore_volume_attachments.py script to your local environment.
Note
The provided script does not fix the Cinder database clean-up job and is only intended to restore the functionality of the affected instances. Therefore, leave the job disabled.
Copy the script to the Nova API pod:
kubectl -n openstack cp restore_volume_attachments.py $nova_api_pod:/tmp/
Run the script in the dry-run mode to only list affected instances and volumes:
kubectl -n openstack exec -ti $nova_api_pod -- python /tmp/restore_volume_attachments.py --dry-run
Run the script to restore the volume attachments:
kubectl -n openstack exec -ti $nova_api_pod -- python /tmp/restore_volume_attachments.py
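To confirm that the attachments have been restored, list the affected volumes through the OpenStack CLI and verify that their status is in-use again. The keystone-client deployment used below is an assumption; any host with the OpenStack client and admin credentials configured works as well.
# Verify that the previously affected volumes are reported as attached (in-use)
kubectl -n openstack exec -it deployment/keystone-client -- \
  openstack volume list --long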
[25124] MPLSoGRE encapsulation has limited throughput¶
Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE) provides limited throughput when sending data between VMs, up to 38 Mbps, as per Mirantis tests.
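To quantify the limitation in your environment and to verify the effect of the workaround afterwards, you can measure VM-to-VM throughput with a generic tool such as iperf3. This is only a sketch: it assumes iperf3 is installed in the guest images and that the security groups allow the test traffic (TCP port 5201 by default).
# On the first VM, start an iperf3 server
iperf3 -s
# On the second VM, run a 30-second throughput test against the first VM
# (replace <SERVER_VM_IP> with the address of the first VM)
iperf3 -c <SERVER_VM_IP> -t 30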
As a workaround, switch the encapsulation type to VXLAN in the
OpenStackDeployment
custom resource:
spec:
  services:
    networking:
      neutron:
        values:
          conf:
            bagpipe_bgp:
              dataplane_driver_ipvpn:
                mpls_over_gre: "False"
                vxlan_encap: "True"
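A sketch of applying the override without editing the object interactively, assuming the OpenStackDeployment object is named osh-dev and that the osdpl short name is registered in your cluster:
# Save the override shown above to a file and merge it into the
# OpenStackDeployment object (osh-dev is an example name)
cat > vxlan-encap.yaml <<'EOF'
spec:
  services:
    networking:
      neutron:
        values:
          conf:
            bagpipe_bgp:
              dataplane_driver_ipvpn:
                mpls_over_gre: "False"
                vxlan_encap: "True"
EOF
kubectl -n openstack patch osdpl osh-dev --type merge --patch-file vxlan-encap.yaml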