OpenStack known issues

This section lists the OpenStack known issues with workarounds for the Mirantis OpenStack for Kubernetes release 22.5.

[30450] High CPU load of MariaDB

Fixed in 23.1

One of the most common symptoms of high CPU load on MariaDB is slow API responses. To troubleshoot the issue, verify the CPU consumption of MariaDB using the General > Kubernetes Pods Grafana dashboard or through the CLI as follows:

  1. Obtain the resource consumption details for the MariaDB server:

    kubectl -n openstack exec -it mariadb-server-0 -- bash
    mysql@mariadb-server-0:/$ top
    

    Example of system response:

    top - 19:16:29 up 278 days, 20:56,  0 users,  load average: 16.62, 16.54, 16.39
    Tasks:   8 total,   1 running,   7 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  6.3 us,  2.8 sy,  0.0 ni, 89.6 id,  0.0 wa,  0.0 hi,  1.3 si,  0.0 st
    MiB Mem : 515709.3 total, 375731.7 free, 111383.8 used,  28593.7 buff/cache
    MiB Swap:      0.0 total,      0.0 free,      0.0 used. 399307.2 avail Mem
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
        275 mysql     20   0   76.3g  18.8g   1.0g S 786.4   3.7  22656,15 mysqld
    
  2. Determine which query is currently running and causing the load. It is usually the one in the Sending data state:

    mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show processlist;" | grep -v Sleep
    

    Example of system response:

    Id      User    Host    db      Command Time    State   Info    Progress
    60067757   placementgF9D11u29   10.233.195.246:40746   placement   Query   10   Sending data   SELECT a.id, a.resource_class_id, a.used, a.updated_at, a.created_at, c.id AS consumer_id, c.generat    0.000
    
  3. Obtain more information about the query and the tables it uses by running ANALYZE against the full query text from the Info column of the previous output:

    mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "ANALYZE FORMAT=JSON <QUERY>;"
    

    Example of system response:

    "table": {
       "table_name": "c",
       "access_type": "eq_ref",
       "possible_keys": [
         "uniq_consumers0uuid",
         "consumers_project_id_user_id_uuid_idx",
         "consumers_project_id_uuid_idx"
       ],
       "key": "uniq_consumers0uuid",
       "key_length": "110",
       "used_key_parts": ["uuid"],
       "ref": ["placement.a.consumer_id"],
       "r_loops": 838200,
       "rows": 1,
       "r_rows": 1,
       "r_table_time_ms": 62602.5453,
       "r_other_time_ms": 369.5835268,
       "filtered": 100,
       "r_filtered": 0.005249344,
       "attached_condition": "c.user_id = u.`id` and c.project_id = p.`id`"
    }
    
  4. If you observe a significant difference between the filtered and r_filtered columns for the query, as in the example of system response above, analyze the tables involved by running the ANALYZE TABLE <TABLE_NAME>; and ANALYZE TABLE <TABLE_NAME> PERSISTENT FOR ALL; statements (a non-interactive sketch follows this procedure):

    mysql@mariadb-server-0:/$ mysql -u root -p$MYSQL_DBADMIN_PASSWORD
    MariaDB > ANALYZE TABLE placement.allocations;
    MariaDB > ANALYZE TABLE placement.allocations PERSISTENT FOR ALL;
    MariaDB > ANALYZE TABLE placement.consumers;
    MariaDB > ANALYZE TABLE placement.consumers PERSISTENT FOR ALL;
    MariaDB > ANALYZE TABLE placement.users;
    MariaDB > ANALYZE TABLE placement.users PERSISTENT FOR ALL;
    MariaDB > ANALYZE TABLE placement.projects;
    MariaDB > ANALYZE TABLE placement.projects PERSISTENT FOR ALL;
    
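A minimal non-interactive sketch of the same procedure, run from outside the pod; repeat the pattern for the consumers, users, and projects tables. The CPU re-check at the end assumes that the Kubernetes Metrics API (metrics-server or an equivalent provider) is available in the cluster:

  # Analyze the placement.allocations table without entering the MariaDB shell
  kubectl -n openstack exec mariadb-server-0 -- bash -c \
    'mysql -u root -p"$MYSQL_DBADMIN_PASSWORD" -e "ANALYZE TABLE placement.allocations; ANALYZE TABLE placement.allocations PERSISTENT FOR ALL;"'

  # Re-check the CPU consumption of the MariaDB pod
  kubectl -n openstack top pod mariadb-server-0 --containers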

[29501] Cinder periodic database cleanup resets the state of volumes

Fixed in 23.1

Due to an issue in the database auto-cleanup job for the Block Storage service (OpenStack Cinder), the state of volumes that are attached to instances gets reset every time the job runs. The instances can still read and write block storage data; however, the volume objects appear in the OpenStack API as not attached, which causes confusion.

As a workaround, temporarily disable the job until the issue is fixed and run the script below to restore the affected instances.

To disable the job, update the OpenStackDeployment custom resource as follows:

  kind: OpenStackDeployment
  spec:
    features:
      database:
        cleanup:
          cinder:
            enabled: false
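For example, the setting can be applied non-interactively with a merge patch. The command below is a sketch: <OSDPL_NAME> is a placeholder for the name of your OpenStackDeployment object in the openstack namespace:

  kubectl -n openstack patch openstackdeployment <OSDPL_NAME> --type merge \
    -p '{"spec":{"features":{"database":{"cleanup":{"cinder":{"enabled":false}}}}}}'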

To restore the affected instances:

  1. Obtain the name of one of the Nova API pods:

    nova_api_pod=$(kubectl -n openstack get pod -l application=nova,component=os-api --no-headers | head -n1 | awk '{print $1}')
    
  2. Download the restore_volume_attachments.py script to your local environment.

    Note

    The provided script does not fix the Cinder database clean-up job and is only intended to restore the functionality of the affected instances. Therefore, leave the job disabled.

  3. Copy the script to the Nova API pod:

    kubectl -n openstack cp restore_volume_attachments.py $nova_api_pod:/tmp/
    
  4. Run the script in dry-run mode to only list the affected instances and volumes:

    kubectl -n openstack exec -ti $nova_api_pod -- python /tmp/restore_volume_attachments.py --dry-run
    
  5. Run the script to restore the volume attachments:

    kubectl -n openstack exec -ti $nova_api_pod -- python /tmp/restore_volume_attachments.py
    
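After the script finishes, you can optionally verify that the affected volumes are reported as attached again, for example, by listing the volumes that are in use. This check assumes a configured OpenStack CLI client:

  openstack volume list --status in-use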

[25124] MPLSoGRE encapsulation has limited throughput

Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE) provides limited throughput when sending data between VMs, up to 38 Mbps, according to Mirantis tests.

As a workaround, switch the encapsulation type to VXLAN in the OpenStackDeployment custom resource:

spec:
  services:
    networking:
      neutron:
        values:
          conf:
            bagpipe_bgp:
              dataplane_driver_ipvpn:
                mpls_over_gre: "False"
                vxlan_encap: "True"
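After the change is applied, you can re-measure VM-to-VM throughput, for example, with iperf3. The commands below are a sketch and assume that iperf3 is installed in both guest VMs; 192.0.2.10 is a placeholder for the address of the first VM:

  # On the first VM, start the iperf3 server
  iperf3 -s

  # On the second VM, run the client against the first VM for 30 seconds
  iperf3 -c 192.0.2.10 -t 30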

[25594] Security groups shared through RBAC cannot be used to create instances

Fixed in MOSK 22.5 for Yoga

It is not possible to create an instance that uses a security group shared through role-based access control (RBAC) by specifying only the network ID when calling Nova. In this case, before creating a port in the given network, Nova verifies that the specified security group exists in Neutron. However, Nova requests only the security groups filtered by project_id, so it does not get the shared security group back from the Neutron API. For details, see the OpenStack known issue #1942615.

Note

The bug affects only OpenStack Victoria and is fixed for OpenStack Yoga in MOSK 22.5.

Workaround:

  1. Create a port in Neutron:

    openstack port create --network <NET> --security-group <SG_ID> shared-sg-port
    
  2. Pass the created port to Nova:

    openstack server create --image <IMAGE> --flavor <FLAVOR> --port shared-sg-port vm-with-shared-sg
    

Note

If security groups shared through RBAC are used, apply them to ports only, not to instances directly.
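
For reference, a security group is typically shared with another project through a Neutron RBAC policy similar to the following sketch, where the target project and security group identifiers are placeholders:

  openstack network rbac create --target-project <TARGET_PROJECT_ID> \
    --action access_as_shared --type security_group <SG_ID>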