OpenStack known issues

This section lists the OpenStack known issues with workarounds for the Mirantis OpenStack for Kubernetes release 23.2.

[25124] MPLSoGRE encapsulation has limited throughput

Fixed in 23.3

Multiprotocol Label Switching over Generic Routing Encapsulation (MPLSoGRE) provides limited throughput of up to 38 Mbps when sending data between VMs, as per Mirantis tests.

As a workaround, switch the encapsulation type to VXLAN in the OpenStackDeployment custom resource:

spec:
  services:
    networking:
      neutron:
        values:
          conf:
            bagpipe_bgp:
              dataplane_driver_ipvpn:
                mpls_over_gre: "False"
                vxlan_encap: "True"
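
To apply the change, you can edit the OpenStackDeployment object in place and add the snippet above under spec:services:networking:neutron:values:conf. A minimal sketch, assuming the object is named osh-dev, resides in the openstack namespace, and is registered with the osdpl short name; substitute the name used in your deployment:

kubectl -n openstack edit osdpl osh-dev

The OpenStack controller then picks up the change and rolls out the updated Neutron configuration.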

[31186,34132] Pods get stuck during MariaDB operations

Due to an upstream MariaDB issue, during MariaDB operations on a management cluster, Pods may get stuck in continuous restarts with the following example error:

[ERROR] WSREP: Corrupt buffer header: \
addr: 0x7faec6f8e518, \
seqno: 3185219421952815104, \
size: 909455917, \
ctx: 0x557094f65038, \
flags: 11577. store: 49, \
type: 49

Workaround:

  1. Create a backup of the /var/lib/mysql directory on the mariadb-server Pod.

  2. Verify that other replicas are up and ready.

  3. Remove the galera.cache file for the affected mariadb-server Pod.

  4. Remove the affected mariadb-server Pod or wait until it is automatically restarted.

After Kubernetes restarts the Pod, the Pod clones the database in 1-2 minutes and restores the quorum.
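
For illustration, the commands below sketch steps 1 through 4 of the procedure above. The Pod name mariadb-server-0, the openstack namespace, and the galera.cache location under /var/lib/mysql are assumptions; adjust them to match the affected replica in your cluster:

# Step 1: back up the data directory of the affected replica (assumed name and namespace)
kubectl cp openstack/mariadb-server-0:/var/lib/mysql ./mariadb-server-0-mysql-backup

# Step 2: verify that the other replicas are up and ready
kubectl -n openstack get pods | grep mariadb-server

# Step 3: remove the galera.cache file on the affected replica
kubectl -n openstack exec mariadb-server-0 -- rm -f /var/lib/mysql/galera.cache

# Step 4: delete the affected Pod so that Kubernetes recreates it
kubectl -n openstack delete pod mariadb-server-0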

[34897] Machines are not available after Victoria to Wallaby update

Fixed in 23.3

After the update of OpenStack from Victoria to Wallaby, the machines on the nodes with DPDK become unavailable.

Workaround:

  1. Search for the nodes that have OVS ports with the tag 4095:

    for i in $(kubectl -n openstack get pods | grep openvswitch-vswitchd | awk '{print $1}'); do
      kubectl -n openstack exec -it -c openvswitch-vswitchd $i -- ovs-vsctl show | grep -q "tag: 4095" && echo $i
    done
    
  2. Restart neutron-ovs-agent on the affected nodes, for example, as shown below.
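
To restart the agent, you can delete its Pod on the affected node and let Kubernetes recreate it. A minimal sketch with placeholders, assuming the agent Pod names contain neutron-ovs-agent:

# Identify the neutron-ovs-agent Pod that runs on the affected node
kubectl -n openstack get pods -o wide | grep neutron-ovs-agent | grep <affected-node>

# Delete the Pod so that the agent restarts
kubectl -n openstack delete pod <neutron-ovs-agent-pod>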

[42386] A load balancer service does not obtain the external IP address

Due to an upstream MetalLB issue, a load balancer service may not obtain the external IP address.

The issue occurs when two services share the same external IP address and have the same externalTrafficPolicy value. Initially, both services have the external IP address assigned and are accessible. After modifying the externalTrafficPolicy value for both services from Cluster to Local, the service that was changed first loses its external IP address, while the service that was changed second keeps the external IP assigned as expected.
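
To confirm that two services share the external IP address and traffic policy, you can list the load balancer services together with these fields. A minimal sketch using standard kubectl output formatting:

# List services with their external IP and traffic policy; non-LoadBalancer entries show <none>
kubectl get svc -A -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,EXTERNAL-IP:.status.loadBalancer.ingress[0].ip,POLICY:.spec.externalTrafficPolicy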

To work around the issue, make a dummy change to the service object whose external IP is stuck in <pending>:

  1. Identify the service that is stuck:

    kubectl get svc -A | grep pending
    

    Example of system response:

    stacklight  iam-proxy-prometheus  LoadBalancer  10.233.28.196  <pending>  443:30430/TCP
    
  2. Add an arbitrary label to the service that is stuck. For example:

    kubectl label svc -n stacklight iam-proxy-prometheus reconcile=1
    

    Example of system response:

    service/iam-proxy-prometheus labeled
    
  3. Verify that the external IP was allocated to the service:

    kubectl get svc -n stacklight iam-proxy-prometheus
    

    Example of system response:

    NAME                  TYPE          CLUSTER-IP     EXTERNAL-IP  PORT(S)        AGE
    iam-proxy-prometheus  LoadBalancer  10.233.28.196  10.0.34.108  443:30430/TCP  12d