Known issues#

This page describes the known MKE 4k issues that have available workarounds.

Post-install kubelet parameter modifications require a k0s restart#

Modifications made to the kubelet parameters in the mke4.yaml configuration file after the initial MKE 4k installation require a restart of k0s on every cluster node. To do this:

  1. Wait roughly 60 seconds after running the mkectl apply command, to give the pods time to reach the Running state.

  2. Run the systemctl restart k0scontroller command on all manager nodes and the systemctl restart k0sworker command on all worker nodes.
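
The restart steps above can be sketched as a dry run that prints the command for each host. The host names are placeholders, and the k0sworker service name on worker nodes is an assumption based on standard k0s installations:

```shell
# Dry-run sketch: print the k0s restart command for every node.
# Host names are placeholders; replace echo with an ssh invocation
# (or run the commands directly on each node) to execute for real.
mgr_cmd="sudo systemctl restart k0scontroller"
wkr_cmd="sudo systemctl restart k0sworker"

for h in manager-1 manager-2 manager-3; do
  echo "$h: $mgr_cmd"
done
for h in worker-1 worker-2; do
  echo "$h: $wkr_cmd"
done
```

Adjust the host lists to match your cluster inventory before executing anything.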

Upgrade may fail on clusters with two manager nodes#

MKE 3 upgrades to MKE 4k may fail on clusters that have only two manager nodes.

Info

Mirantis does not sanction upgrading MKE 3 clusters that have an even number of manager nodes. In general, having an even number of manager nodes is avoided in clustering systems due to quorum and availability factors.

Calico IPVS mode is not supported#

Calico IPVS mode is not yet supported for MKE 4k. As such, upgrading from an MKE 3 cluster that uses that networking mode results in an error:

FATA[0640] Upgrade failed due to error: failed to run step [Upgrade Tasks]:
unable to install BOP: unable to apply MKE4 config: failed to wait for pods:
failed to wait for pods: failed to list pods: client rate limiter Wait returned
an error: context deadline exceeded

Upgrade to MKE 4k fails if kubeconfig file is present in source MKE 3.x#

Upgrade to MKE 4k fails if the ~/.mke/mke.kubeconf file is present in the source MKE 3.x system.

Workaround:

Make a backup of the old ~/.mke/mke.kubeconf file and then delete it.
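The backup-then-delete workaround can be wrapped in a small guarded shell function. The default path follows the text; the .bak suffix is an arbitrary choice:

```shell
# Back up then delete the stale MKE 3.x kubeconfig so the upgrade no
# longer finds it at the default path.
backup_and_remove() {
  f="$1"
  if [ -f "$f" ]; then
    cp "$f" "$f.bak"   # keep a backup copy
    rm "$f"            # remove the original
  fi
}

backup_and_remove "$HOME/.mke/mke.kubeconf"
```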

reset command must be run with --force flag#

You must run the reset command with the --force flag; without it, the command always returns an error:

mkectl reset -f mke4.yaml

Example output:

time="2025-09-08T19:35:44-04:00" level=info msg="==> Running phase: Disconnect from hosts"
Error: reset requires --force
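
Appending --force to the document's own example yields a command that satisfies the requirement; it is shown here as an echo so it can be copied without accidentally running a reset:

```shell
# Corrected invocation: the reset from the example above, plus --force.
reset_cmd="mkectl reset -f mke4.yaml --force"
echo "$reset_cmd"   # copy and run on the host where mkectl is installed
```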

Addition of extra scopes with mkectl login causes CLI to authenticate twice#

When you create the kubeconfig with the mkectl login command and add extra scopes using the --oidc-extra-scopes flag, the MKE 4k CLI attempts to authenticate twice, both during the generation of the configuration and on each cluster interaction with the generated kubeconfig.

Workaround:

When adding extra scopes to the --oidc-extra-scopes flag, make sure to also add the offline_access scope. For example:

--oidc-extra-scopes=groups,offline_access

mkectl config get command generates log lines that malform YAML output#

The mkectl config get command output contains log records at the beginning of the generated output that invalidate the resulting YAML configuration file.

Workaround:

Exclude the unwanted log lines by running the mkectl config get command with a higher log level:

mkectl config get -l fatal

Pod logs do not display when MKE 4k is uninstalled and then reinstalled on same nodes#

If you uninstall MKE 4k and later try to reinstall it on the same nodes, the installation will succeed but the pods that run on manager nodes from the previous installation will not display logs and will present the following CA error:

tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes-ca")

Workaround:

  1. Following the uninstallation of MKE 4k, reboot the manager nodes before you reinstall the software.

  2. Run the following command on each manager node:

    rm -rf /var/lib/kubelet/
    

System addons fail to upgrade from MKE 4k 4.1.1 to MKE 4k 4.1.2#

When you upgrade an MKE 4k 4.1.1 cluster that is configured with an external authentication provider (OIDC, SAML, LDAP) to MKE 4k 4.1.2, mkectl reports overall upgrade success despite the upgrade failure of the following system addons: Dex, NGINX Ingress Controller, MKE 4k Dashboard.

The mke-operator logs present the following error message, regarding a missing secret:

failed to create or update ClusterDeployment: failed to prepare MKE ClusterDeployment: failed to prepare service authentication: unable to retrieve the Dex deployment secret: Secret "authentication-credentials" not found

The root cause is that the naming convention for authentication secrets changed between MKE 4k releases, from protocol-specific names (for example, ldap-bind-password) to a universal name (authentication-credentials). The mke-operator fails early in the reconcile loop when it attempts to locate the authentication-credentials secret, which does not yet exist; this prevents the system addons from upgrading.

Workaround:

Manually create the secret by copying the data from the old secret to the new expected secret name. This allows the operator to locate the required credentials and proceed with the upgrade loop.

  1. Identify the existing authentication secret. For example, the ldap-bind-password for LDAP configurations.

  2. Copy the content of the secret into a new Secret named authentication-credentials within the same namespace.

    kubectl get secret ldap-bind-password -o json -n mke | \
    jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.selfLink, .metadata.ownerReferences) | .metadata.name = "authentication-credentials"' | \
    kubectl apply -f -
    
  3. Verify the new secret.

    kubectl get secret authentication-credentials
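
If jq is available locally, the rename filter from step 2 can be sanity-checked against a stub Secret before any cluster data is piped through it; the stub below is purely illustrative:

```shell
# Run a minimal stub Secret through the same style of jq filter used in
# step 2 and confirm that metadata.name becomes authentication-credentials.
stub='{"metadata":{"name":"ldap-bind-password","uid":"abc","resourceVersion":"1"},"data":{"password":"c2VjcmV0"}}'
renamed=$(printf '%s' "$stub" | jq -r \
  'del(.metadata.resourceVersion, .metadata.uid)
   | .metadata.name = "authentication-credentials"
   | .metadata.name')
echo "$renamed"
```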
    

Child clusters with AWS nodes can hang in unavailable state#

Due to an issue with the AWS cloud provider, MKE 4k child clusters can hang in an unavailable state for operations wherein the control plane nodes are scaled down (recreated or removed, for example).

Workaround method 1:

Restart the CAPI manager pod in the k0rdent namespace. On its recreation, the pod will reconfigure the connection to the child cluster and continue the scale-down process.
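
Restarting the pod amounts to deleting it so that its Deployment recreates it. In the sketch below, the namespace comes from the text, but the label selector is a hypothetical placeholder; confirm the actual pod name with kubectl get pods -n k0rdent first:

```shell
# Sketch: delete the CAPI manager pod in the k0rdent namespace so its
# Deployment recreates it. The -l selector is an assumption -- verify
# the real pod labels before running.
restart_capi="kubectl -n k0rdent delete pod -l control-plane=controller-manager"
echo "$restart_capi"   # run only after confirming the selector
```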

Workaround method 2:

Manually change the configuration of the Network Load Balancer (NLB) target groups. To do so:

  1. Locate the NLB of the child cluster in the AWS console.

  2. Set the target group to the listener port 6443.

  3. Enable Target deregistration management in the Attributes tab.

With this method, be aware that the target group can be reconciled by Cluster API Provider AWS (CAPA), which will result in the reversion of the configuration.

Airgapped v4.1.2 to airgapped v4.1.3 upgrades fail#

Upgrades from airgapped MKE 4k 4.1.2 systems to airgapped MKE 4k 4.1.3 hang at the Waiting for Management object to be ready with new release (timeout: 20m0s) step. If the upgrade stops at this point, the Management object reports "waiting for capi" conditions, and the CoreProvider object has the config map not found condition, you have encountered known issue kubernetes-sigs/cluster-api-operator#966.

Solution:

  1. Download and re-upload the cluster-api-provider-k0sproject-k0smotron-components image, replacing REGISTRY with the airgap registry hostname:

    skopeo copy -a --insecure-policy docker://registry.mirantis.com/k0rdent-enterprise/capi/cluster-api-provider-k0sproject-k0smotron-components:v1.6.0 oci-archive:cluster-api-provider-k0sproject-k0smotron-components_v1.6.0.tar
    skopeo copy -a --insecure-policy oci-archive:cluster-api-provider-k0sproject-k0smotron-components_v1.6.0.tar docker://${REGISTRY}/mke/capi/cluster-api-provider-k0sproject-k0smotron-components:v1.6.0
    
  2. Delete the capi-operator pod to restart the reconcile process.

Use of --keep-ingress-nginx flag during upgrade requires two correlated flags#

To apply the --keep-ingress-nginx flag during an upgrade from a previous MKE 4k version to MKE 4k 4.1.3, you must also apply both the --ingress-nginx-alt-http-node-port flag and the --ingress-nginx-alt-https-node-port flag; otherwise, the target cluster will not be fully functional.
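
A sketch of a single upgrade invocation carrying all three correlated flags follows. The subcommand and the node-port values are assumptions, so substitute whatever your upgrade runbook and environment call for:

```shell
# Sketch only: the flag trio from the text on one invocation.
# The subcommand and port numbers are placeholders, not prescriptive values.
upgrade_cmd="mkectl upgrade \
  --keep-ingress-nginx \
  --ingress-nginx-alt-http-node-port=33000 \
  --ingress-nginx-alt-https-node-port=33001"
echo "$upgrade_cmd"
```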

Long upgrade times from MKE 4k 4.1.2 to version 4.1.4#

The upgrade from MKE 4k 4.1.2 to version 4.1.4 may appear to stall while waiting for the Management resource to become ready, as this step can take a significant amount of time.

Example output:

DBG checking Management Ready condition
DBG k0rdent installation is not Ready, components status:
DBG capi: true
DBG kcm: true
DBG cluster-api-provider-aws: true
DBG cluster-api-provider-gcp: true
DBG cluster-api-provider-k0sproject-k0smotron: true
DBG cluster-api-provider-vsphere: true
DBG cluster-api-provider-azure: InfrastructureProvider is not yet ready: condition Ready is in status False
DBG mke-operator: true
DBG projectsveltos: true
DBG cluster-api-provider-docker: true
DBG cluster-api-provider-infoblox: true
DBG cluster-api-provider-ipam: true
DBG cluster-api-provider-openstack: true

No action is required. The upgrade will automatically proceed after 25–35 minutes.

Workaround:

To avoid the long wait, you can use kubectl to disable CAPI providers prior to running the upgrade.

Warning

Do not disable CAPI providers if you have child clusters as this will result in the loss of those child clusters.

kubectl patch management kcm \
  --type='json' \
  -p="$(kubectl get management kcm -o json | \
    jq -c '[{"op":"replace","path":"/spec/providers","value":(.spec.providers | map(select(.name == "projectsveltos" or .name == "mke-operator")))}]')"
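
The jq filter embedded in the patch keeps only the projectsveltos and mke-operator providers. If jq is available, its effect can be previewed on a stub provider list before touching the cluster; the stub is illustrative only:

```shell
# Preview: apply the same map(select(...)) filter to a stub provider list
# and print which provider names survive.
stub='{"spec":{"providers":[{"name":"cluster-api-provider-aws"},{"name":"projectsveltos"},{"name":"mke-operator"}]}}'
kept=$(printf '%s' "$stub" | jq -c \
  '.spec.providers
   | map(select(.name == "projectsveltos" or .name == "mke-operator"))
   | map(.name)')
echo "$kept"
```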

Use of Calico CNI with BPF dataplane requires customer intervention#

Due to known Calico CNI issues, MKE 4k 4.1.4 customers using Calico CNI with the BPF dataplane must disable bpfConnectTimeLoadBalancing in the defaultFelixConfiguration section of the Calico CNI configuration file:

   defaultFelixConfiguration:
     enabled: true
     bpfConnectTimeLoadBalancing: Disabled
   ....

For fresh MKE 4k installations, this must be done at the time the cluster is created. To upgrade existing MKE 4k 4.1.2 clusters with BPF dataplane to version 4.1.4, this must be done prior to the upgrade.