OpenSearchPVCMismatch alert is raised due to an OpenSearch PVC size mismatch

Caution

The issue resolution below applies since Container Cloud 2.22.0 to existing clusters with insufficient resources. For releases earlier than Container Cloud 2.22.0, use the workaround described in the StackLight known issue 27732-1. New clusters deployed on top of Container Cloud 2.22.0 are not affected.

During deployment of a Container Cloud cluster of any type, the custom OpenSearch setting elasticsearch.persistentVolumeClaimSize can be overwritten by logging.persistentVolumeClaimSize, in which case the size is set to the default 30Gi. This mismatch raises the OpenSearchPVCMismatch alert. Since elasticsearch.persistentVolumeClaim is immutable, you cannot update the value by editing the Cluster object.
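
The mismatch can also be checked by hand by comparing the size configured for OpenSearch with the capacity that the PVC reports. The following is a minimal sketch and not part of the official procedure. The to_bytes and sizes_match helpers are hypothetical names introduced for illustration, only integer quantities with binary suffixes (Ki, Mi, Gi, Ti) are handled, and the exact comparison that the alert performs is not described in this section.

```shell
# Hypothetical helper: convert an integer Kubernetes quantity such as 30Gi to bytes.
to_bytes() {
  q="$1"
  num="${q%%[!0-9]*}"   # leading digits
  unit="${q#"$num"}"    # whatever follows the digits
  case "$num" in
    "") echo "not an integer quantity: $q" >&2; return 1 ;;
  esac
  case "$unit" in
    "") echo "$num" ;;
    Ki) echo $(( $num * 1024 )) ;;
    Mi) echo $(( $num * 1024 * 1024 )) ;;
    Gi) echo $(( $num * 1024 * 1024 * 1024 )) ;;
    Ti) echo $(( $num * 1024 * 1024 * 1024 * 1024 )) ;;
    *)  echo "unsupported unit: $unit" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed only if both quantities denote the same number of bytes.
sizes_match() {
  a=$(to_bytes "$1") || return 2
  b=$(to_bytes "$2") || return 2
  [ "$a" -eq "$b" ]
}

# Possible usage (assumes kubectl and jq are available and the PVC name
# matches the one used later in this procedure):
#   configured=$(kubectl get helmbundle stacklight-bundle -n stacklight -o json | \
#     jq -r '.spec.releases[] | select(.name == "opensearch") |
#     .values.volumeClaimTemplate.resources.requests.storage')
#   actual=$(kubectl -n stacklight get pvc opensearch-master-opensearch-master-0 \
#     -o jsonpath='{.status.capacity.storage}')
#   sizes_match "$configured" "$actual" || echo "PVC size mismatch"
```

Converting to bytes avoids false mismatches when the same size is written with different suffixes, for example 1Gi and 1024Mi.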

Note

This issue does not affect cluster operability if the current volume capacity is enough for the cluster needs.

To apply the issue resolution, select from the following use cases:

StackLight with an expandable StorageClass for OpenSearch PVCs
  1. Obtain the configured OpenSearch PVC size and verify that the StorageClass provisioner has enough space to satisfy it:

    kubectl get helmbundle stacklight-bundle -n stacklight -o json | jq '.spec.releases[] |
     select(.name == "opensearch") | .values.volumeClaimTemplate.resources.requests.storage'
    

    The system response contains the value of the elasticsearch.persistentVolumeClaimSize parameter.

  2. Scale down the opensearch-master StatefulSet with dependent resources to 0 and disable the elasticsearch-curator CronJob:

    kubectl -n stacklight scale --replicas 0 deployment opensearch-dashboards \
    && kubectl -n stacklight get pods -l app=opensearch-dashboards | awk '{if (NR!=1) {print $1}}' | \
    xargs -r kubectl -n stacklight wait --for=delete --timeout=10m pod
    
    kubectl -n stacklight scale --replicas 0 deployment metricbeat \
    && kubectl -n stacklight get pods -l app=metricbeat | awk '{if (NR!=1) {print $1}}' | \
    xargs -r kubectl -n stacklight wait --for=delete --timeout=10m pod
    
    kubectl -n stacklight patch cronjobs elasticsearch-curator -p '{"spec": {"suspend": true}}'
    
    kubectl -n stacklight scale --replicas 0 statefulset opensearch-master \
    && kubectl -n stacklight get pods -l app=opensearch-master | awk '{if (NR!=1) {print $1}}' | \
    xargs -r kubectl -n stacklight wait --for=delete --timeout=30m pod
    
  3. Patch the PVC with the correct value for elasticsearch.persistentVolumeClaimSize:

    pvc_size=$(kubectl -n stacklight get statefulset -l 'app=opensearch-master' \
    -o json | jq -r '.items[] | select(.spec.volumeClaimTemplates[].metadata.name // "" |
     startswith("opensearch-master")).spec.volumeClaimTemplates[].spec.resources.requests.storage')
    
    kubectl -n stacklight patch pvc opensearch-master-opensearch-master-0 \
    -p '{ "spec": { "resources": { "requests": { "storage": "'"${pvc_size}"'" }}}}'
    
  4. Scale up the opensearch-master StatefulSet with dependent resources and enable the elasticsearch-curator CronJob:

    replicas=$(kubectl get helmbundle stacklight-bundle -n stacklight \
    -o json | jq '.spec.releases[] | select(.name == "opensearch") | .values.replicas')
    
    kubectl -n stacklight scale --replicas ${replicas} statefulset opensearch-master \
    && kubectl -n stacklight wait --for=condition=Ready --timeout=30m pod -l app=opensearch-master
    
    kubectl -n stacklight scale --replicas 1 deployment opensearch-dashboards \
    && kubectl -n stacklight wait --for=condition=Ready --timeout=10m pod -l app=opensearch-dashboards
    
    kubectl -n stacklight scale --replicas 1 deployment metricbeat \
    && kubectl -n stacklight wait --for=condition=Ready --timeout=10m pod -l app=metricbeat
    
    kubectl -n stacklight patch cronjobs elasticsearch-curator -p '{"spec": {"suspend": false}}'
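
The scale-down step of this use case repeats one pattern three times: scale a workload to 0, then wait until its pods are deleted. As an illustration only, the pattern could be factored into a shell function. The scale_down_and_wait name is hypothetical, the sketch assumes that the pods carry the label app=<workload name> (which holds for the three workloads in the commands above), and it waits for the pods one by one instead of passing them to a single kubectl wait call through xargs.

```shell
# Hypothetical helper: scale a workload to 0 and wait until its pods are gone.
# Arguments: <kind> <name> <timeout>, for example: deployment metricbeat 10m
scale_down_and_wait() {
  kind="$1"; name="$2"; timeout="$3"
  kubectl -n stacklight scale --replicas 0 "$kind" "$name" || return 1
  # List the remaining pods as pod/<name> and wait for each one to be deleted.
  # If no pods are left, the loop body never runs.
  kubectl -n stacklight get pods -l "app=$name" -o name |
  while read -r pod; do
    kubectl -n stacklight wait --for=delete --timeout="$timeout" "$pod"
  done
}

# Possible usage, mirroring the scale-down step:
#   scale_down_and_wait deployment opensearch-dashboards 10m
#   scale_down_and_wait deployment metricbeat 10m
#   kubectl -n stacklight patch cronjobs elasticsearch-curator -p '{"spec": {"suspend": true}}'
#   scale_down_and_wait statefulset opensearch-master 30m
```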
    
StackLight with a non-expandable StorageClass for OpenSearch PVCs

If StackLight is operating in HA mode, the local volume provisioner (LVP) has a non-expandable StorageClass used for OpenSearch PVC provisioning. Thus, the nodes that host the affected PVs have insufficient disk space.

If StackLight is operating in non-HA mode, the default non-expandable storage provisioner is used.

Warning

After applying this issue resolution, the existing OpenSearch data will be lost. If data loss is acceptable, proceed with the steps below.

  1. Move the existing log data to a new PV if required.

  2. Obtain the configured OpenSearch PVC size and verify that the provisioner has enough space to satisfy it:

    kubectl get helmbundle stacklight-bundle -n stacklight -o json | jq '.spec.releases[] |
     select(.name == "opensearch") | .values.volumeClaimTemplate.resources.requests.storage'
    

    The system response contains the value of the elasticsearch.persistentVolumeClaimSize parameter.

    To satisfy the required size:

    • For LVP, increase the disk size

    • For non-LVP, make sure that the default StorageClass provisioner has enough space

  3. Scale down the opensearch-master StatefulSet with dependent resources to 0 and disable the elasticsearch-curator CronJob:

    kubectl -n stacklight scale --replicas 0 deployment opensearch-dashboards \
    && kubectl -n stacklight get pods -l app=opensearch-dashboards | awk '{if (NR!=1) {print $1}}' | \
    xargs -r kubectl -n stacklight wait --for=delete --timeout=10m pod
    
    kubectl -n stacklight scale --replicas 0 deployment metricbeat \
    && kubectl -n stacklight get pods -l app=metricbeat | awk '{if (NR!=1) {print $1}}' | \
    xargs -r kubectl -n stacklight wait --for=delete --timeout=10m pod
    
    kubectl -n stacklight patch cronjobs elasticsearch-curator -p '{"spec": {"suspend": true}}'
    
    kubectl -n stacklight scale --replicas 0 statefulset opensearch-master \
    && kubectl -n stacklight get pods -l app=opensearch-master | awk '{if (NR!=1) {print $1}}' | \
    xargs -r kubectl -n stacklight wait --for=delete --timeout=30m pod
    
  4. Delete existing PVCs:

    kubectl delete pvc -l 'app=opensearch-master' -n stacklight
    

    Warning

    This command removes all existing log data stored on the PVCs.

  5. Scale up the opensearch-master StatefulSet with dependent resources and enable the elasticsearch-curator CronJob:

    replicas=$(kubectl get helmbundle stacklight-bundle -n stacklight \
    -o json | jq '.spec.releases[] | select(.name == "opensearch") | .values.replicas')
    
    kubectl -n stacklight scale --replicas ${replicas} statefulset opensearch-master \
    && kubectl -n stacklight wait --for=condition=Ready --timeout=30m pod -l app=opensearch-master
    
    kubectl -n stacklight scale --replicas 1 deployment opensearch-dashboards \
    && kubectl -n stacklight wait --for=condition=Ready --timeout=10m pod -l app=opensearch-dashboards
    
    kubectl -n stacklight scale --replicas 1 deployment metricbeat \
    && kubectl -n stacklight wait --for=condition=Ready --timeout=10m pod -l app=metricbeat
    
    kubectl -n stacklight patch cronjobs elasticsearch-curator -p '{"spec": {"suspend": false}}'
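
Because the PVC deletion step of this use case is destructive, some operators may prefer to put a guard in front of it. The following is a minimal sketch under stated assumptions: the confirm and delete_opensearch_pvcs function names are hypothetical, and the preview relies on the client-side dry run of kubectl delete, which prints the objects that would be deleted without removing them.

```shell
# Hypothetical helper: ask for an explicit "yes" on standard input.
confirm() {
  printf '%s [type yes to continue]: ' "$1" >&2
  read -r answer
  [ "$answer" = "yes" ]
}

# Hypothetical wrapper around the PVC deletion command of this procedure.
delete_opensearch_pvcs() {
  # Preview only: a client-side dry run does not delete anything.
  kubectl delete pvc -l 'app=opensearch-master' -n stacklight --dry-run=client
  confirm "Delete the PVCs listed above and lose all OpenSearch data?" || return 1
  kubectl delete pvc -l 'app=opensearch-master' -n stacklight
}

# Possible usage:
#   delete_opensearch_pvcs || echo "Deletion skipped"
```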
    

Tip

To verify whether a StorageClass is expandable:

kubectl get pvc -l 'app=opensearch-master' -n stacklight \
-Ao jsonpath='{range .items[*]}{.spec.storageClassName}{"\n"}{end}' | \
xargs -I{} bash -c "echo -n 'StorageClass: {}, expandable: ' \
&& kubectl get storageclass {} -Ao jsonpath='{.allowVolumeExpansion}' && echo ''"

Example of a system response for an expandable StorageClass:

StorageClass: csi-sc-cinderplugin, expandable: true

Example of a system response for a non-expandable StorageClass:

StorageClass: stacklight-elasticsearch-data, expandable:
StorageClass: stacklight-elasticsearch-data, expandable:
StorageClass: stacklight-elasticsearch-data, expandable:
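
Note that for a non-expandable StorageClass the value after "expandable:" is empty, because the allowVolumeExpansion field is not set. As a small illustration, the value can be mapped to one of the two use cases of this procedure. The select_use_case function name is hypothetical, and the sketch treats anything other than true as non-expandable.

```shell
# Hypothetical helper: map the allowVolumeExpansion value to a use case.
# An empty value means that the field is unset, that is, non-expandable.
select_use_case() {
  case "$1" in
    true) echo "expandable" ;;
    *)    echo "non-expandable" ;;
  esac
}

# Possible usage (assumes the PVC name used earlier in this procedure):
#   sc=$(kubectl -n stacklight get pvc opensearch-master-opensearch-master-0 \
#     -o jsonpath='{.spec.storageClassName}')
#   select_use_case "$(kubectl get storageclass "$sc" -o jsonpath='{.allowVolumeExpansion}')"
```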