AWS-based managed cluster requiring PVs fails to be deployed
Note
The issue below affects only Kubernetes 1.18 deployments and is fixed in the Cluster release 7.0.0, which introduces support for Kubernetes 1.20.
On a management cluster with multiple AWS-based managed clusters, some clusters may fail to complete the deployments that require persistent volumes (PVs), for example, Prometheus. Some of the affected pods may get stuck in the Pending state with the pod has unbound immediate PersistentVolumeClaims and node(s) had volume node affinity conflict errors.
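To identify the affected pods, you can list the pods stuck in the Pending state and inspect the scheduling events of a suspect pod. The following commands are a minimal sketch; the stacklight namespace and the <pod_name> placeholder are examples, substitute them with the values from your deployment:

kubectl get pods --all-namespaces --field-selector=status.phase=Pending

kubectl -n stacklight describe pod <pod_name>

The Events section of the describe output contains the scheduling errors quoted above.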
Warning
The issue resolution below applies to HA deployments, where data can be rebuilt from replicas. If you have a non-HA deployment, back up any existing data before proceeding, since all data will be lost while applying the issue resolution.
To apply the issue resolution:
Obtain the persistent volume claims related to the storage mounts of the affected pods:

kubectl get pod/<pod_name1> pod/<pod_name2> \
  -o jsonpath='{.spec.volumes[?(@.persistentVolumeClaim)].persistentVolumeClaim.claimName}'
Note
In the above command and in the subsequent step, substitute the parameters enclosed in angle brackets with the corresponding values.
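For example, for two Prometheus server pods in the stacklight namespace (the pod names below are illustrative and may differ in your cluster):

kubectl -n stacklight get pod/prometheus-server-0 pod/prometheus-server-1 \
  -o jsonpath='{.spec.volumes[?(@.persistentVolumeClaim)].persistentVolumeClaim.claimName}'

The command prints the PersistentVolumeClaim names to use in the next step.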
Delete the affected Pods and PersistentVolumeClaims to reschedule them. For example, for StackLight:

kubectl -n stacklight delete \
  pod/<pod_name1> pod/<pod_name2> ... pvc/<pvc_name1> pvc/<pvc_name2> ...