Known issues#
This section describes the MKE 4k known issues and their available workarounds.
Post-install kubelet parameter modifications require a k0s restart#
Modifications made to the kubelet parameters in the mke4.yaml configuration
file after the initial MKE 4k installation require a restart of k0s on every
cluster node. To do this:
- Wait for a short time, roughly 60 seconds after the application of the
  mkectl apply command, to give the pods time to enter their Running state.
- Run the systemctl restart k0scontroller command on all manager nodes and
  the systemctl restart k0sworker command on all worker nodes.
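As a sketch, the restart can be scripted across the cluster. The host names and the role-to-service mapping below (k0scontroller on managers, k0sworker on workers) are assumptions to adapt to your own inventory, and echo keeps the loop a dry run:

```shell
# Map a node role to the k0s service to restart. The service names are
# assumptions based on standard k0s deployments; verify them on your nodes.
k0s_service_for_role() {
  case "$1" in
    manager) echo "k0scontroller" ;;
    worker)  echo "k0sworker" ;;
  esac
}

# Dry run: print one restart command per node. Replace echo with ssh (or
# your automation tool) to execute for real; the host names are placeholders.
for node in manager1 manager2 manager3 worker1 worker2; do
  role=${node%%[0-9]*}   # crude role extraction, for this sketch only
  echo "ssh $node systemctl restart $(k0s_service_for_role "$role")"
done
```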
Upgrade may fail on clusters with two manager nodes#
MKE 3 upgrades to MKE 4k may fail on clusters that have only two manager nodes.
Info
Mirantis does not sanction upgrading MKE 3 clusters that have an even number of manager nodes. In general, having an even number of manager nodes is avoided in clustering systems due to quorum and availability factors.
Calico IPVS mode is not supported#
Calico IPVS mode is not yet supported for MKE 4k. As such, upgrading from an MKE 3 cluster that uses this networking mode results in an error:
FATA[0640] Upgrade failed due to error: failed to run step [Upgrade Tasks]:
unable to install BOP: unable to apply MKE4 config: failed to wait for pods:
failed to wait for pods: failed to list pods: client rate limiter Wait returned
an error: context deadline exceeded
Upgrade to MKE 4k fails if kubeconfig file is present in source MKE 3.x#
Upgrade to MKE 4k fails if the ~/.mke/mke.kubeconf file is present in the
source MKE 3.x system.
Workaround:
Make a backup of the old ~/.mke/mke.kubeconf file and then delete it.
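The backup and the deletion can be done in one step with mv. The sketch below runs against a temporary directory standing in for your home directory, so it is safe to execute anywhere:

```shell
# Use a temp dir in place of $HOME so the sketch does not touch real files.
home=$(mktemp -d)
mkdir -p "$home/.mke"
echo stale-kubeconfig > "$home/.mke/mke.kubeconf"   # stand-in for the old file

# mv both backs the file up and removes it from its original path.
mv "$home/.mke/mke.kubeconf" "$home/.mke/mke.kubeconf.bak"
```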
reset command must be run with --force flag#
You must run the reset command with the --force flag; without the flag, the
command always returns an error, as the following example shows:
mkectl reset -f mke4.yaml
Example output:
time="2025-09-08T19:35:44-04:00" level=info msg="==> Running phase: Disconnect from hosts"
Error: reset requires --force
Addition of extra scopes with mkectl login causes CLI to authenticate twice#
When you create the kubeconfig with the mkectl login command and add extra
scopes using the --oidc-extra-scopes flag, the MKE 4k CLI attempts to
authenticate twice: once during the generation of the configuration and again
on each cluster interaction with the generated kubeconfig.
Workaround:
When adding extra scopes to the --oidc-extra-scopes flag, make sure to also
add the offline_access scope. For example:
--oidc-extra-scopes=groups,offline_access
mkectl config get command generates log lines that malform YAML output#
The mkectl config get command output contains log records at the
beginning of the generated output that invalidate the resulting YAML
configuration file.
Workaround:
Exclude unwanted logs by running the mkectl config get command with a higher
log level:
mkectl config get -l fatal
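If you have already captured output that contains log records, you can also strip them after the fact. The filter below assumes the log lines all begin with time=, as in the examples in this section; the sample input is a placeholder, not the real mke4.yaml schema:

```shell
# Sample captured output: one log record followed by YAML (placeholder content).
printf '%s\n' \
  'time="2025-09-08T19:35:44-04:00" level=info msg="rendering config"' \
  'apiVersion: example/v1' \
  'kind: PlaceholderConfig' > raw-output.txt

# Drop every line that starts with time= to leave clean YAML behind.
grep -v '^time=' raw-output.txt > mke4.yaml
```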
Pod logs do not display when MKE 4k is uninstalled and then reinstalled on the same nodes#
If you uninstall MKE 4k and later reinstall it on the same nodes, the installation succeeds, but the pods from the previous installation that run on manager nodes do not display logs and present the following CA error:
tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes-ca"
Workaround:
- Following the uninstallation of MKE 4k, reboot the manager nodes before you
  reinstall the software.
- Run the following command on each manager node:
  rm -rf /var/lib/kubelet/
System addons fail to upgrade from MKE 4k 4.1.1 to MKE 4k 4.1.2#
When you upgrade an MKE 4k 4.1.1 cluster that is configured with an external authentication provider (OIDC, SAML, LDAP) to MKE 4k 4.1.2, mkectl reports overall upgrade success despite the upgrade failure of the following system addons: Dex, NGINX Ingress Controller, and MKE 4k Dashboard.
The mke-operator logs present the following error message regarding a missing secret:
failed to create or update ClusterDeployment: failed to prepare MKE ClusterDeployment: failed to prepare service authentication: unable to retrieve the Dex deployment secret: Secret "authentication-credentials" not found
The root cause is that the naming convention for authentication secrets changed between MKE 4k releases, from protocol-specific names (for example, ldap-bind-password) to a universal name (authentication-credentials). The mke-operator fails early in the reconcile loop because it attempts to locate the authentication-credentials secret, which does not yet exist, and this failure prevents the upgrade of the system addons.
Workaround:
Manually create the secret by copying the data from the old secret to the new expected secret name. This allows the operator to locate the required credentials and proceed with the upgrade loop.
- Identify the existing authentication secret, for example, ldap-bind-password
  for LDAP configurations.
- Copy the content of the secret into a new Secret named
  authentication-credentials within the same namespace:
  kubectl get secret ldap-bind-password -o json -n mke | \
    jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.selfLink, .metadata.ownerReferences) | .metadata.name = "authentication-credentials"' | \
    kubectl apply -f -
- Verify the new secret:
  kubectl get secret authentication-credentials
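The jq rename in the copy step above can be previewed offline against a sample Secret before you apply it to the cluster. The JSON below is illustrative, standing in for real `kubectl get secret ... -o json` output:

```shell
# Sample Secret JSON (placeholder data, base64 value is arbitrary).
cat > old-secret.json <<'EOF'
{
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": "ldap-bind-password",
    "namespace": "mke",
    "uid": "1234",
    "resourceVersion": "99"
  },
  "data": { "password": "cGFzc3dvcmQ=" }
}
EOF

# Same filter as the workaround: drop server-managed fields and rename the
# Secret so the operator can find it.
jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp,
        .metadata.selfLink, .metadata.ownerReferences)
    | .metadata.name = "authentication-credentials"' old-secret.json > new-secret.json
```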
Child clusters with AWS nodes can hang in unavailable state#
Due to an issue with the AWS cloud provider, MKE 4k child clusters can hang in an unavailable state for operations wherein the control plane nodes are scaled down (recreated or removed, for example).
Workaround method 1:
Restart the CAPI manager pod in the k0rdent namespace. On its recreation, the pod will reconfigure the connection to the child cluster and continue the scale-down process.
Workaround method 2:
Manually change the configuration of the Network Load Balancer (NLB) target groups. To do this:
- Locate the NLB of the child cluster in the AWS console.
- Set the target group to the listener port 6443.
- Enable Target deregistration management in the Attributes tab.
With this method, be aware that the target group can be reconciled by Cluster API Provider AWS (CAPA), which will result in the reversion of the configuration.
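Workaround method 1 can be sketched as a single kubectl command. The Deployment name capi-controller-manager is an assumption (confirm it with kubectl -n k0rdent get deploy), and echo keeps the sketch a dry run:

```shell
# Print the restart command rather than executing it; drop the echo (and
# verify the Deployment name first) to run it for real.
restart_capi_manager() {
  echo "kubectl -n k0rdent rollout restart deployment $1"
}

restart_capi_manager capi-controller-manager
```

Restarting the owning Deployment recreates the pod, which is what triggers the reconnection to the child cluster.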
Airgapped v4.1.2 to airgapped v4.1.3 upgrades fail#
Upgrades from airgapped MKE 4k 4.1.2 systems to airgapped MKE 4k 4.1.3 hang at the
Waiting for Management object to be ready with new release (timeout: 20m0s)
step. If the upgrade stops at this point, the Management object reports
"waiting for capi" conditions, and the CoreProvider object has the config map
not found condition, you have encountered known issue
kubernetes-sigs/cluster-api-operator#966.
Solution:
- Download and re-upload the cluster-api-provider-k0sproject-k0smotron-components
  image, replacing REGISTRY with the airgap registry hostname:
  skopeo copy -a --insecure-policy docker://registry.mirantis.com/k0rdent-enterprise/capi/cluster-api-provider-k0sproject-k0smotron-components:v1.6.0 oci-archive:cluster-api-provider-k0sproject-k0smotron-components_v1.6.0.tar
  skopeo copy -a --insecure-policy oci-archive:cluster-api-provider-k0sproject-k0smotron-components_v1.6.0.tar docker://${REGISTRY}/mke/capi/cluster-api-provider-k0sproject-k0smotron-components:v1.6.0
- Delete the capi-operator pod to restart the reconcile process.
Use of --keep-ingress-nginx flag during upgrade requires two correlated flags#
To apply the --keep-ingress-nginx flag during an upgrade from a previous MKE
4k version to MKE 4k 4.1.3, you must also apply both the
--ingress-nginx-alt-http-node-port flag and the
--ingress-nginx-alt-https-node-port flag; otherwise, the target cluster will
not be fully functional.
Long upgrade times from MKE 4k 4.1.2 to version 4.1.4#
The upgrade from MKE 4k 4.1.2 to version 4.1.4 may appear to stall while waiting for the Management resource to become ready, as this step can take a significant amount of time.
Example output:
DBG checking Management Ready condition
DBG k0rdent installation is not Ready, components status:
DBG capi: true
DBG kcm: true
DBG cluster-api-provider-aws: true
DBG cluster-api-provider-gcp: true
DBG cluster-api-provider-k0sproject-k0smotron: true
DBG cluster-api-provider-vsphere: true
DBG cluster-api-provider-azure: InfrastructureProvider is not yet ready: condition Ready is in status False
DBG mke-operator: true
DBG projectsveltos: true
DBG cluster-api-provider-docker: true
DBG cluster-api-provider-infoblox: true
DBG cluster-api-provider-ipam: true
DBG cluster-api-provider-openstack: true
No action is required. The upgrade will automatically proceed after 25–35 minutes.
Workaround:
To avoid the long wait, you can use kubectl to disable CAPI providers prior to running the upgrade.
Warning
Do not disable CAPI providers if you have child clusters as this will result in the loss of those child clusters.
kubectl patch management kcm \
--type='json' \
-p="$(kubectl get management kcm -o json | \
jq -c '[{"op":"replace","path":"/spec/providers","value":(.spec.providers | map(select(.name == "projectsveltos" or .name == "mke-operator")))}]')"
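The jq filter embedded in the patch can be previewed offline against a sample Management spec, to confirm which providers it keeps before touching the cluster. The sample below is illustrative:

```shell
# Sample Management spec with three providers (placeholder data).
cat > kcm-sample.json <<'EOF'
{
  "spec": {
    "providers": [
      { "name": "cluster-api-provider-aws" },
      { "name": "projectsveltos" },
      { "name": "mke-operator" }
    ]
  }
}
EOF

# Same filter as the patch: keep only projectsveltos and mke-operator,
# emitted as a JSON-patch replace operation.
jq -c '[{"op":"replace","path":"/spec/providers","value":(.spec.providers | map(select(.name == "projectsveltos" or .name == "mke-operator")))}]' kcm-sample.json
```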
Use of Calico CNI with BPF dataplane requires customer intervention#
Due to known Calico CNI issues, MKE 4k 4.1.4 customers using Calico CNI with
the BPF dataplane must disable bpfConnectTimeLoadBalancing in the
defaultFelixConfiguration section of the Calico CNI configuration file:
defaultFelixConfiguration:
enabled: true
bpfConnectTimeLoadBalancing: Disabled
....
For fresh MKE 4k installations, this must be done at the time the cluster is created. To upgrade existing MKE 4k 4.1.2 clusters with BPF dataplane to version 4.1.4, this must be done prior to the upgrade.
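Before installing or upgrading, you can sanity-check that the setting is present in your configuration file. The file name and surrounding structure below are placeholders for wherever defaultFelixConfiguration lives in your configuration:

```shell
# Placeholder snippet standing in for the relevant part of your config file.
cat > calico-snippet.yaml <<'EOF'
defaultFelixConfiguration:
  enabled: true
  bpfConnectTimeLoadBalancing: Disabled
EOF

# Succeed only if the setting is present and set to Disabled.
grep -q '^ *bpfConnectTimeLoadBalancing: Disabled$' calico-snippet.yaml \
  && echo "bpfConnectTimeLoadBalancing is Disabled"
```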