Known issues

Known issues in MKE 3.6.3 that have available workarounds include:

[MKE-9501] Issues with MKE clusters with more than 120 nodes
[FIELD-5928] MKE cluster agent fails during upgrade
[MKE-9358] cgroup v2 (unsupported) is enabled in RHEL 9.0 by default
[MKE-10017] ucp-pause containers use incorrect version after upgrade rollback
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

[MKE-9501] Issues with MKE clusters with more than 120 nodes

A limitation in MKE 3.6.3 can cause issues in clusters that deploy more than 120 nodes.

If you plan to run a cluster with more than 120 nodes, Mirantis strongly recommends that you upgrade to MKE 3.6.4. If, however, you must run one of the affected MKE versions with more than 120 nodes, contact Mirantis support to obtain a workaround.
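
To check whether a cluster falls under this limitation, you can count its nodes from a manager node. The following is a minimal sketch, not an official Mirantis check: the `exceeds_node_limit` helper name is hypothetical, and the sketch assumes that every MKE node is listed by `docker node ls`.

```shell
# Hypothetical helper: succeed when the node count is above the 120-node limit.
exceeds_node_limit() {
  [ "$1" -gt 120 ]
}

# Usage on a manager node:
#   count="$(docker node ls -q | wc -l)"
#   exceeds_node_limit "$count" && echo "More than 120 nodes: upgrade to MKE 3.6.4"
```

The threshold is kept in one small function so that the check can be reused in pre-upgrade scripts.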

[FIELD-5928] MKE cluster agent fails during upgrade

During an upgrade to MKE 3.6.3, the MKE cluster agent fails if a custom NGINX Ingress Controller is deployed in the ingress-nginx namespace.

Workaround:

Remove the custom NGINX Ingress Controller deployment from the ingress-nginx namespace prior to upgrade.

Install the custom NGINX Ingress Controller in a different namespace:

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx-custom --create-namespace

[MKE-9358] cgroup v2 (unsupported) is enabled in RHEL 9.0 by default

MKE does not support cgroup v2 on Linux platforms. Because RHEL 9.0 enables cgroup v2 by default, MKE cannot be used on RHEL 9.0 in its default configuration.

As a workaround, RHEL 9.0 users must disable cgroup v2.
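
One common way to do this on RHEL 9 is to boot with the `systemd.unified_cgroup_hierarchy=0` kernel argument. The following is a minimal sketch under that assumption, not an official Mirantis procedure: the `cgroup_version` helper name is hypothetical, and the use of `grubby` assumes a stock RHEL installation.

```shell
# Hypothetical helper: map the filesystem type mounted at /sys/fs/cgroup
# to a cgroup version (cgroup2fs means v2, tmpfs means v1).
cgroup_version() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Check the running system:
#   cgroup_version "$(stat -fc %T /sys/fs/cgroup/)"
#
# If the result is v2, switch to cgroup v1 and reboot
# (assumes that grubby is installed):
#   sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"
#   sudo reboot
```

After the reboot, running the check again should report v1 if the kernel argument took effect.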

[MKE-10017] ucp-pause containers use incorrect version after upgrade rollback

After rolling back to MKE 3.6.3 during an upgrade to any later version, some MKE nodes may run ucp-pause containers built from the upgraded version of the MKE images.

Workaround:

Perform the following steps on each Linux node where the ucp-pause containers are built from the upgraded MKE image version.

Check whether the ucp-pause containers are using the MKE image version to which you tried to upgrade:

docker ps -a | grep ucp-pause

Example output:

01a80dd175de   mirantiseng/ucp-pause:3.7.0   "/pause"   17 minutes ago   Up 16 minutes   k8s_POD_ucp-node-feature-discovery-9bwsj_node-feature-discovery_0a601160-ecf7-412f-bff8-e421a4f1712d_0
498371f35994   mirantiseng/ucp-pause:3.7.0   "/pause"   20 minutes ago   Up 18 minutes   k8s_POD_coredns-7fb76597fc-k2q2k_kube-system_83fee771-dc1d-4e34-ae45-f0ab9dee5942_0
a94cfcfb18f6   mirantiseng/ucp-pause:3.7.0   "/pause"   22 minutes ago   Up 21 minutes   k8s_POD_calico-kube-controllers-58c64b9976-mg5dn_kube-system_0b80ed92-be02-40de-827e-6a6b6e7f27da_0
0a2cf203f77c   mirantiseng/ucp-pause:3.7.0   "/pause"   22 minutes ago   Up 21 minutes   k8s_POD_calico-node-f2xhl_kube-system_3c4a27c5-b832-417d-bc30-b6a7ca8f7627_0

If the ucp-pause containers are using the correct image version, proceed to the next node.

Copy the cri-dockerd-mke.service configuration file from the /tmp directory to /usr/lib/systemd/system:
```
sudo cp /tmp/cri-dockerd-mke.service /usr/lib/systemd/system
```
Restart kubelet to load the most recent configuration file:
```
docker rm -f ucp-kubelet
```

Delete all ucp-pause containers that are on the node:

docker rm -f <pause-container-id-1> <pause-container-id-n>

Verify that the ucp-pause containers are using the correct MKE image version:

docker ps -a | grep ucp-pause

Example output:

236b3dfb1bf6   mirantiseng/ucp-pause:3.6.4   "/pause"   12 seconds ago   Up 11 seconds   k8s_POD_calico-node-dp7hd_kube-system_d59d9004-5a59-46f8-8281-3c917c62fe20_0
56994306b181   mirantiseng/ucp-pause:3.6.4   "/pause"   12 seconds ago   Up 11 seconds   k8s_POD_calico-kube-controllers-64844db68f-br9dh_kube-system_5ea39708-231a-45f5-aa7c-f7b842131941_0
e62ae3c2a871   mirantiseng/ucp-pause:3.6.4   "/pause"   12 seconds ago   Up 11 seconds   k8s_POD_ucp-node-feature-discovery-rdrb7_node-feature-discovery_848cda05-74ec-4db2-825f-05afa53b2502_0
d51eba420f34   mirantiseng/ucp-pause:3.6.4   "/pause"   12 seconds ago   Up 11 seconds   k8s_POD_coredns-78c7f4f4c7-lljzc_kube-system_92936b7c-6a7c-4eb5-a83f-22514acac636_0
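
The check and delete steps above can be scripted per node. The following is a minimal sketch, not part of the official workaround: the `stale_pause_ids` helper name is hypothetical, and the sketch assumes that the image tag of each ucp-pause container matches the MKE version that created it.

```shell
# Hypothetical helper: read "ID IMAGE" pairs on stdin and print the IDs of
# ucp-pause containers whose image tag differs from the expected MKE version.
stale_pause_ids() {
  expected="$1"
  awk -v want="$expected" '
    $2 ~ /\/ucp-pause:/ {
      n = split($2, parts, ":")       # the tag is the last colon-separated field
      if (parts[n] != want) print $1
    }'
}

# Usage on a node (assumes that the rollback target is 3.6.3):
#   docker ps -a --format '{{.ID}} {{.Image}}' | stale_pause_ids 3.6.3
```

An empty result means that no ucp-pause container on the node uses an unexpected image version. Otherwise, the printed IDs are the ones to pass to `docker rm -f` after the configuration file is copied and kubelet is restarted.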

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

Windows Server Core with Containers images are not compatible with GCP, and their use prevents kubelet from starting.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

Create a new VPC and set the MTU value to 1500.
Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.
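
The second option can be sketched with the gcloud CLI. This is a minimal example under stated assumptions, not an official procedure: the `mtu_aligned` helper name is hypothetical, and `my-vpc` is a placeholder for the name of your VPC network. Consult the GCP documentation referenced above for how an MTU change affects running instances.

```shell
# Hypothetical helper: succeed only when the VPC and Docker MTU values match.
mtu_aligned() {
  [ "$1" -eq "$2" ]
}

# Usage (my-vpc is a placeholder network name):
#   vpc_mtu="$(gcloud compute networks describe my-vpc --format='value(mtu)')"
#   mtu_aligned "$vpc_mtu" 1500 || gcloud compute networks update my-vpc --mtu=1500
```

Comparing the values first avoids updating a network that already uses the Docker default of 1500.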