Troubleshooting
Steps to Debug KubeVirt Cluster Deployments
- Check the ClusterDeployment status condition on the management or regional cluster:
  kubectl -n $CLUSTER_NAMESPACE get clusterdeployment.k0rdent.mirantis.com $CLUSTER_NAME -o=jsonpath='{.status.conditions[?(@.type=="Ready")]}' | jq
- Check the KubevirtCluster status condition on the management or regional cluster:
  kubectl -n $CLUSTER_NAMESPACE get kubevirtcluster $CLUSTER_NAME -o=jsonpath='{.status.conditions[?(@.type=="Ready")]}' | jq
- Check the VM and VMI status on the KubeVirt Infrastructure Cluster:
  kubectl --kubeconfig $KUBEVIRT_INFRA_KUBECONFIG_PATH -n $CLUSTER_NAMESPACE get vm -l cluster.x-k8s.io/cluster-name=$CLUSTER_NAME -o=yaml
- Check the virt-handler pods on the KubeVirt Infrastructure Cluster, and inspect their logs with kubectl logs if a pod is unhealthy:
  kubectl --kubeconfig $KUBEVIRT_INFRA_KUBECONFIG_PATH -n kubevirt get pods -l kubevirt.io=virt-handler
- Sometimes you need to access the VM created on the KubeVirt Infrastructure Cluster to check system or k0s logs. Open the VM console:
  virtctl console -n $CLUSTER_NAMESPACE $VM_NAME --kubeconfig $KUBEVIRT_INFRA_KUBECONFIG_PATH
  Alternatively, port-forward the SSH port of the virtual machine and connect directly over SSH:
  virtctl port-forward vmi/$VM_NAME -n $CLUSTER_NAMESPACE --kubeconfig $KUBEVIRT_INFRA_KUBECONFIG_PATH $LOCAL_PORT:22
  Then SSH into the VM:
  ssh -p $LOCAL_PORT capk@127.0.0.1 -i $SSH_PRIVATE_KEY_PATH
  Warning
  The SSH key pair is generated by Cluster API Provider KubeVirt during the provisioning process. You can retrieve the private key from the secret created in the management or regional cluster:
  kubectl get secret -n $CLUSTER_NAMESPACE ${CLUSTER_NAME}-ssh-keys -o=jsonpath='{.data.key}' | base64 -d
- Check logs on the VM:
  For k0s logs:
  sudo journalctl -u k0sworker
  For container logs, see the /var/log/containers directory. For more information, see k0s troubleshooting.
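The first checks above can be wrapped in small shell helpers so they are easy to repeat while a deployment converges. The following is a minimal sketch, not part of the product tooling: the function names `ready_condition` and `virt_handler_logs` are hypothetical, and it assumes that `CLUSTER_NAMESPACE`, `CLUSTER_NAME`, and `KUBEVIRT_INFRA_KUBECONFIG_PATH` are already exported as in the steps above.

```shell
# Hypothetical helpers; the names are illustrative only.

# Print "<status> <reason>" of the Ready condition for a resource kind,
# for example: ready_condition kubevirtcluster
ready_condition() {
  kubectl -n "$CLUSTER_NAMESPACE" get "$1" "$CLUSTER_NAME" \
    -o=jsonpath='{range .status.conditions[?(@.type=="Ready")]}{.status}{" "}{.reason}{"\n"}{end}'
}

# Print the last N log lines (default 50) of every virt-handler pod
# on the KubeVirt Infrastructure Cluster, prefixed with the pod name.
virt_handler_logs() {
  kubectl --kubeconfig "$KUBEVIRT_INFRA_KUBECONFIG_PATH" -n kubevirt \
    logs -l kubevirt.io=virt-handler --tail="${1:-50}" --prefix
}
```

For example, `ready_condition clusterdeployment.k0rdent.mirantis.com` followed by `ready_condition kubevirtcluster` shows at a glance which of the two objects is not yet Ready.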
Known Issues
The KubeVirtCluster deployment fails on proto: integer overflow error
When deploying the KubeVirt cluster, if the namespace where the ClusterDeployment has been created does not exist
on the KubeVirt Infrastructure Cluster, the following misleading error may appear in the
cluster-api-provider-kubevirt logs:
E0126 13:41:19.622971 1 controller.go:474] "Reconciler error" err="failed to create load balancer: proto: integer overflow"
controller="kubevirtcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="KubevirtCluster"
KubevirtCluster="kcm-system/my-kubevirt-clusterdeployment1" namespace="kcm-system" name="my-kubevirt-clusterdeployment1"
reconcileID="2074dadd-23e8-4d64-bf87-b7da2905f347"
The ClusterDeployment Ready condition will be:
kubectl -n kcm-system get clusterdeployment my-kubevirt-clusterdeployment1 -o jsonpath='{.status.conditions[?(@.type=="Ready")]}'
{
"lastTransitionTime": "2026-01-26T13:37:06Z",
"message": "* InfrastructureReady:
* LoadBalancerAvailable: proto: integer overflow (0/1 conditions met)
* ControlPlaneInitialized: Control plane not yet initialized
* ControlPlaneAvailable: K0sControlPlane status.initialization.controlPlaneInitialized is false
* WorkersAvailable:
* MachineDeployment my-kubevirt-clusterdeployment1-md: 0 available replicas, at least 1 required (spec.strategy.rollout.maxUnavailable is 0, spec.replicas is 1)
* RemoteConnectionProbe: Remote connection not established yet",
"reason": "Failed",
"status": "False",
"type": "Ready"
}
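To tell this issue apart from other load balancer failures, it can help to test the Ready condition message for the overflow string. The sketch below is an illustration only: `check_lb_overflow` is a hypothetical helper name, and it reads the same Ready condition shown above.

```shell
# Hypothetical helper: report whether the Ready condition of a ClusterDeployment
# carries the misleading "proto: integer overflow" message.
check_lb_overflow() {
  namespace="$1"
  name="$2"
  message="$(kubectl -n "$namespace" get clusterdeployment "$name" \
    -o=jsonpath='{.status.conditions[?(@.type=="Ready")].message}')"
  case "$message" in
    *"proto: integer overflow"*)
      echo "overflow error present; verify that namespace '$namespace' exists on the KubeVirt Infrastructure Cluster"
      ;;
    *)
      echo "no overflow error in the Ready condition"
      ;;
  esac
}
```

Usage with the names from the example above: `check_lb_overflow kcm-system my-kubevirt-clusterdeployment1`.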
Workaround
The most likely cause is that the namespace containing the ClusterDeployment does not exist on the KubeVirt
Infrastructure Cluster. Create the namespace there before creating the ClusterDeployment object:
kubectl --kubeconfig <kubevirt-infra-cluster-kubeconfig> create namespace <cld-namespace>
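Because `kubectl create namespace` fails when the namespace already exists, an automated pipeline may prefer a check-then-create variant. A minimal sketch, assuming a hypothetical helper name `ensure_namespace` that takes the infrastructure cluster kubeconfig path and the namespace as arguments:

```shell
# Hypothetical helper: create the namespace on the KubeVirt Infrastructure
# Cluster only if it does not exist yet.
ensure_namespace() {
  kubeconfig="$1"
  namespace="$2"
  if kubectl --kubeconfig "$kubeconfig" get namespace "$namespace" >/dev/null 2>&1; then
    echo "namespace $namespace already exists"
  else
    kubectl --kubeconfig "$kubeconfig" create namespace "$namespace"
  fi
}
```

Note that the existence check treats any `kubectl get` failure (including connectivity or permission errors) as "missing", so a failed create afterwards still surfaces the underlying problem.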
bridge-marker CrashLoopBackOff when Ceph RADOS Gateway uses port 8081
On bare-metal worker nodes where Mirantis k0rdent Virtualization and Ceph are deployed together, the CNAO-managed
bridge-marker DaemonSet may fail to start. The bridge-marker pod runs with hostNetwork: true and binds its
health HTTP server to port 8081. If Ceph RADOS Gateway (RGW) is also listening on port 8081 on the same node,
bridge-marker cannot bind to the port and exits with "address already in use". The container terminates with
exit code 2, probes never succeed, and the pods remain in CrashLoopBackOff:
kubectl -n kubevirt get pods -l app=bridge-marker
NAME READY STATUS RESTARTS AGE
bridge-marker-6pcbm 0/1 CrashLoopBackOff 262 21h
bridge-marker-ck89b 0/1 CrashLoopBackOff 261 21h
bridge-marker-fmpgf 0/1 CrashLoopBackOff 261 21h
The bridge-marker container exposes port 8081 on the host:
kubectl -n kubevirt describe pod -l app=bridge-marker
Containers:
  bridge-marker:
    Port:           8081/TCP
    Host Port:      8081/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
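To confirm that another process already holds the port, you can list the TCP listeners on an affected node. This is a sketch under assumptions: it must run on the node itself (for example over SSH), it assumes the `ss` utility from iproute2 is installed there, and `listeners_on_port` is a hypothetical helper name. Process details from `ss -p` are typically shown only when the command runs as root.

```shell
# Hypothetical helper: print the TCP listeners bound to the given port
# (default 8081) on the current node.
listeners_on_port() {
  port="${1:-8081}"
  # -H: no header, -l: listening sockets, -t: TCP, -n: numeric, -p: owning process.
  # Field 4 is the local address:port; match only an exact port suffix.
  ss -H -ltnp | awk -v p=":$port" '$4 ~ (p "$") { print }'
}
```

If `listeners_on_port 8081` prints a listener that does not belong to bridge-marker, the port conflict described above is the likely cause of the crash loop.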
Workaround
Configure the Ceph RADOS Gateway to use port 8082 instead of 8081 in the CephDeployment object. Edit the
CephDeployment in the ceph-lcm-mirantis namespace and set spec.objectStorage.rgw.gateway.port to 8082:
kubectl edit cephdeployment -n ceph-lcm-mirantis
apiVersion: lcm.mirantis.com/v1alpha1
kind: CephDeployment
metadata:
  name: rook-ceph
  namespace: ceph-lcm-mirantis
spec:
  objectStorage:
    rgw:
      gateway:
        port: 8082
        securePort: 8443
After the Ceph controller applies the updated configuration, verify that RGW is listening on port 8082 and that
bridge-marker pods become Ready:
kubectl -n kubevirt get pods -l app=bridge-marker
kubectl get cephdeployment -n ceph-lcm-mirantis
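The verification can also be scripted. The sketch below is an assumption-laden illustration: `verify_rgw_port_fix` is a hypothetical helper, it uses the CephDeployment name `rook-ceph` from the example above, and it checks only the configured port in the CephDeployment spec, not that RGW is actually listening on it.

```shell
# Hypothetical helper: confirm the configured RGW port, then wait for the
# bridge-marker pods to become Ready.
# Usage: verify_rgw_port_fix [expected_port] [timeout]
verify_rgw_port_fix() {
  expected="${1:-8082}"
  timeout="${2:-300s}"
  actual="$(kubectl -n ceph-lcm-mirantis get cephdeployment rook-ceph \
    -o=jsonpath='{.spec.objectStorage.rgw.gateway.port}')"
  if [ "$actual" != "$expected" ]; then
    echo "RGW gateway port is '$actual', expected '$expected'" >&2
    return 1
  fi
  kubectl -n kubevirt wait pod -l app=bridge-marker \
    --for=condition=Ready --timeout="$timeout"
}
```

Pods in CrashLoopBackOff restart with an increasing back-off delay, so a generous timeout is a reasonable choice here.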
Sidecar and Root feature gates are not converted from v1beta1 to v1 in HCO 1.18.2-mira
Starting with HCO 1.18.2-mira, the HyperConverged API moves from hco.kubevirt.io/v1beta1 to
hco.kubevirt.io/v1. The conversion webhook migrates most fields automatically, but custom Sidecar and
Root feature gate settings from v1beta1 are not preserved.
In v1beta1, these settings were configured as boolean fields under spec.featureGates (for example,
sidecar: true or root: true). In v1, feature gates use a list of objects with name and optional state
fields. After upgrade, clusters that previously enabled or disabled sidecar or root behavior may revert to
defaults unless the settings are reapplied manually.
Before upgrading, record the existing v1beta1 feature gate values:
kubectl -n kubevirt get hyperconverged kubevirt-hyperconverged -o jsonpath='{.spec.featureGates}{"\n"}' | jq
After upgrading to 1.18.2-mira, verify the converted HyperConverged CR:
kubectl -n kubevirt get hyperconverged kubevirt-hyperconverged -o jsonpath='{.spec.featureGates}' | jq
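If the output of the two commands above is saved to files, the comparison can be automated with jq. The following is a minimal sketch, assuming the v1beta1 output is a JSON object of boolean fields and the v1 output is a JSON list of objects with a name field, as described above. The helper name `missing_gates` and the file names are hypothetical. A gate that is present in v1 with a non-default state still counts as present, so review the state fields separately.

```shell
# Hypothetical helper: print every gate that was true in the saved v1beta1
# snapshot but has no entry in the v1 feature gate list.
# Usage: missing_gates before.json after.json
missing_gates() {
  jq -rn --slurpfile before "$1" --slurpfile after "$2" '
    (($after[0] // []) | map(.name)) as $present
    | ($before[0] // {})
    | to_entries[]
    | select(.value == true)
    | .key as $k
    | select($present | index($k) | not)
    | $k'
}
```

For example, redirect the pre-upgrade command to `before.json` and the post-upgrade command to `after.json`, then run `missing_gates before.json after.json`. Any name printed needs to be reapplied manually.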
Workaround
If sidecar or root settings were lost during conversion, reapply them in the v1 HyperConverged CR:
apiVersion: hco.kubevirt.io/v1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: kubevirt
spec:
  featureGates:
    - name: downwardMetrics
    - name: sidecar
    - name: root
  virtualization:
    platform: k0s
  deployment:
    nodePlacements:
      infra:
        nodeSelector:
          kubernetes.io/os: linux
Only include sidecar and root entries if they were previously enabled in your environment.
For more information on the v1 HyperConverged API format, see
Configuration.
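As an alternative to editing the full CR, the feature gate list could be reapplied with a merge patch. This is only a sketch under assumptions: `feature_gates_patch` is a hypothetical helper, it emits entries with a name field only (no state field), and it assumes that `kubectl patch` addresses the v1 version of the resource after the upgrade. A merge patch replaces the whole `spec.featureGates` list, so pass every gate that should remain set.

```shell
# Hypothetical helper: build a merge patch that sets spec.featureGates to the
# given gate names. Refuses to run without arguments, because an empty list
# would clear all gates.
feature_gates_patch() {
  if [ "$#" -eq 0 ]; then
    echo "usage: feature_gates_patch GATE [GATE...]" >&2
    return 1
  fi
  entries=""
  for gate in "$@"; do
    entries="${entries:+$entries,}{\"name\":\"$gate\"}"
  done
  printf '{"spec":{"featureGates":[%s]}}\n' "$entries"
}

# Example (not executed here):
# kubectl -n kubevirt patch hyperconverged kubevirt-hyperconverged --type=merge \
#   -p "$(feature_gates_patch downwardMetrics sidecar root)"
```

Review the generated JSON before applying it, and include sidecar or root only if they were enabled before the upgrade.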