Due to upgrade issues with the Envoy gateway and offline installation environments, upgrading to MKE 4k 4.1.3 is not recommended. These issues are fixed in the 4.1.4 release. For version 4.1.3, Mirantis supports only fresh installations.

Known issues#

The MKE 4k known issues with available workarounds are described herein.

Post-install kubelet parameter modifications require a k0s restart#

Modifications made to the kubelet parameters in the mke4.yaml configuration file after the initial MKE 4k installation require a restart of k0s on every cluster node. To do this:

  1. Wait roughly 60 seconds after running the mkectl apply command, to give the pods time to reach the Running state.

  2. Run the systemctl restart k0scontroller command on all manager nodes and the systemctl restart k0sworker command on all worker nodes.
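
The restart in step 2 can be scripted; the following is a minimal sketch, assuming SSH access to the nodes. MANAGER_NODES and WORKER_NODES are hypothetical placeholders, and in k0s the controller service unit is named k0scontroller while the worker service unit is named k0sworker. DRY_RUN defaults to 1 here, so the sketch only prints the commands it would run:

```shell
# Sketch of step 2: restart k0s on every cluster node over SSH.
# MANAGER_NODES and WORKER_NODES are hypothetical placeholders.
# DRY_RUN=1 (the default here) prints commands instead of executing them.
MANAGER_NODES="${MANAGER_NODES:-manager1 manager2 manager3}"
WORKER_NODES="${WORKER_NODES:-worker1 worker2}"
DRY_RUN="${DRY_RUN:-1}"

restart_service() {
  # $1 = node hostname, $2 = systemd unit to restart
  if [ "$DRY_RUN" = "1" ]; then
    echo "ssh $1 sudo systemctl restart $2"
  else
    ssh "$1" sudo systemctl restart "$2"
  fi
}

for node in $MANAGER_NODES; do restart_service "$node" k0scontroller; done
for node in $WORKER_NODES; do restart_service "$node" k0sworker; done
```

Set DRY_RUN=0 once the printed commands look correct for your environment.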

Upgrade may fail on clusters with two manager nodes#

MKE 3 upgrades to MKE 4k may fail on clusters that have only two manager nodes.

Info

Mirantis does not sanction upgrading MKE 3 clusters that have an even number of manager nodes. In general, having an even number of manager nodes is avoided in clustering systems due to quorum and availability factors.

Calico IPVS mode is not supported#

Calico IPVS mode is not yet supported in MKE 4k. As such, upgrading from an MKE 3 cluster that uses this networking mode results in an error:

FATA[0640] Upgrade failed due to error: failed to run step [Upgrade Tasks]:
unable to install BOP: unable to apply MKE4 config: failed to wait for pods:
failed to wait for pods: failed to list pods: client rate limiter Wait returned
an error: context deadline exceeded

Upgrade to MKE 4k fails if kubeconfig file is present in source MKE 3.x#

Upgrade to MKE 4k fails if the ~/.mke/mke.kubeconf file is present in the source MKE 3.x system.

Workaround:

Make a backup of the old ~/.mke/mke.kubeconf file and then delete it.
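
As a sketch, a single mv performs both the backup and the deletion, because it removes the file from its original path. The backup_kubeconf helper name is hypothetical:

```shell
# Sketch of the workaround: mv backs up the stale kubeconfig and removes it
# from its original path in one step.
backup_kubeconf() {
  conf="${1:-$HOME/.mke/mke.kubeconf}"
  [ -f "$conf" ] || return 0          # nothing to do if the file is absent
  mv "$conf" "${conf}.bak"
}

backup_kubeconf
```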

reset command must be run with --force flag#

You must run the reset command with the --force flag, as the command always returns an error without it:

mkectl reset --force -f mke4.yaml

Running the command without the flag fails as follows:

time="2025-09-08T19:35:44-04:00" level=info msg="==> Running phase: Disconnect from hosts"
Error: reset requires --force

Addition of extra scopes with mkectl login causes CLI to authenticate twice#

When you create the kubeconfig with the mkectl login command and add extra scopes using the --oidc-extra-scopes flag, the MKE 4k CLI attempts to authenticate twice: once while generating the configuration, and again on each cluster interaction that uses the generated kubeconfig.

Workaround:

When adding extra scopes to the --oidc-extra-scopes flag, make sure to also add the offline_access scope. For example:

--oidc-extra-scopes=groups,offline_access

mkectl config get command generates log lines that malform YAML output#

The mkectl config get command output contains log records at the beginning of the generated output that invalidate the resulting YAML configuration file.

Workaround:

Exclude the unwanted log lines by running the mkectl config get command with a higher log level:

mkectl config get -l fatal
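
An alternative sketch is to filter the log records out of the stream. This assumes log lines begin with time=, matching the log format shown elsewhere on this page; the strip_logs helper is hypothetical:

```shell
# Hypothetical helper: drop log records, which begin with `time=` in the
# mkectl log format shown elsewhere on this page.
strip_logs() {
  grep -v '^time='
}

# Against a real cluster this would be:
#   mkectl config get | strip_logs > mke4.yaml
# The filter itself can be exercised on any stream (the sample lines below
# are illustrative placeholders):
printf 'time="2025-09-08T19:35:44-04:00" level=info msg="sample log line"\nsampleKey: sampleValue\n' | strip_logs
# prints: sampleKey: sampleValue
```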

Pod logs do not display when MKE 4k is uninstalled and then reinstalled on same nodes#

If you uninstall MKE 4k and later reinstall it on the same nodes, the installation succeeds, but the pods that run on the manager nodes do not display logs, instead presenting the following CA error:

tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes-ca"

Workaround:

  1. Following the uninstallation of MKE 4k, reboot the manager nodes before you reinstall the software.

  2. Run the following command on each manager node:

    rm -rf /var/lib/kubelet/
    

System addons fail to upgrade from MKE 4k 4.1.1 to MKE 4k 4.1.2#

When you upgrade an MKE 4k 4.1.1 cluster that is configured with an external authentication provider (OIDC, SAML, LDAP) to MKE 4k 4.1.2, mkectl reports overall upgrade success despite the failed upgrade of the following system addons: Dex, NGINX Ingress Controller, and MKE 4k Dashboard.

The mke-operator logs present the following error message, regarding a missing secret:

failed to create or update ClusterDeployment: failed to prepare MKE ClusterDeployment: failed to prepare service authentication: unable to retrieve the Dex deployment secret: Secret "authentication-credentials" not found

The root cause is that the naming convention for authentication secrets changed between MKE 4k releases, from protocol-specific names (for example, ldap-bind-password) to a universal name (authentication-credentials). The mke-operator fails early in the reconcile loop when it attempts to locate the authentication-credentials secret, which does not yet exist, and this failure prevents the system addons from upgrading.

Workaround:

Manually create the secret by copying the data from the old secret to the new expected secret name. This allows the operator to locate the required credentials and proceed with the upgrade loop.

  1. Identify the existing authentication secret. For example, the ldap-bind-password for LDAP configurations.

  2. Copy the content of the secret into a new Secret named authentication-credentials within the same namespace.

    kubectl get secret ldap-bind-password -o json -n mke | \
    jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.selfLink, .metadata.ownerReferences) | .metadata.name = "authentication-credentials"' | \
    kubectl apply -f -
    
  3. Verify the new secret.

    kubectl get secret authentication-credentials
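
The jq expression in step 2 above can be exercised offline before touching the cluster; the following is a minimal sketch, assuming jq is installed. The sample Secret JSON is made up for illustration; on a real cluster the input comes from the kubectl get secret command in step 2:

```shell
# Sketch: the jq rename from step 2, applied to a made-up sample Secret.
cat > /tmp/old-secret.json <<'EOF'
{
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": "ldap-bind-password",
    "namespace": "mke",
    "uid": "00000000-0000-0000-0000-000000000000",
    "resourceVersion": "1"
  },
  "data": { "password": "c2VjcmV0" }
}
EOF

# Drop server-managed metadata and rename the secret.
jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.selfLink, .metadata.ownerReferences) | .metadata.name = "authentication-credentials"' \
  /tmp/old-secret.json > /tmp/new-secret.json
```

The resulting /tmp/new-secret.json keeps the data and namespace but carries the new name, which is what kubectl apply needs in the real workaround.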
    

Child clusters with AWS nodes can hang in unavailable state#

Due to an issue with the AWS cloud provider, MKE 4k child clusters can hang in an unavailable state for operations wherein the control plane nodes are scaled down (recreated or removed, for example).

Workaround method 1:

Restart the CAPI manager pod in the k0rdent namespace. On its recreation, the pod will reconfigure the connection to the child cluster and continue the scale-down process.

Workaround method 2:

Manually change the configuration of the Network Load Balancer (NLB) target groups. To do this:

  1. Locate the NLB of the child cluster in the AWS console.

  2. Set the target group to the listener port 6443.

  3. Enable Target deregistration management in the Attributes tab.

With this method, be aware that the target group can be reconciled by Cluster API Provider AWS (CAPA), which will result in the reversion of the configuration.

Airgapped v4.1.2 to airgapped v4.1.3 upgrades fail#

Upgrades from airgapped MKE 4k 4.1.2 systems to airgapped MKE 4k 4.1.3 hang at the Waiting for Management object to be ready with new release (timeout: 20m0s) step. If the upgrade stops at this point, the Management object has "waiting for capi" conditions, and the CoreProvider object has the config map not found condition, you have encountered known issue kubernetes-sigs/cluster-api-operator#966.

Solution:

  1. Download and re-upload the cluster-api-provider-k0sproject-k0smotron-components image (replace REGISTRY with the airgap registry hostname):

    skopeo copy -a --insecure-policy docker://registry.mirantis.com/k0rdent-enterprise/capi/cluster-api-provider-k0sproject-k0smotron-components:v1.6.0 oci-archive:cluster-api-provider-k0sproject-k0smotron-components_v1.6.0.tar
    skopeo copy -a --insecure-policy oci-archive:cluster-api-provider-k0sproject-k0smotron-components_v1.6.0.tar docker://${REGISTRY}/k0rdent-enterprise/capi/cluster-api-provider-k0sproject-k0smotron-components:v1.6.0
    
  2. Delete the capi-operator pod to restart the reconcile process.

Use of --keep-ingress-nginx flag during upgrade requires two correlated flags#

To apply the --keep-ingress-nginx flag during an upgrade from a previous MKE 4k version to MKE 4k 4.1.3, you must also apply both the --ingress-nginx-alt-http-node-port and --ingress-nginx-alt-https-node-port flags; otherwise, the target cluster will not be fully functional.