Troubleshoot Kubernetes resources alerts

This section describes the investigation and troubleshooting steps for the Kubernetes resources alerts.


KubeCPUOvercommitPods

Root cause

The sum of the CPU requests of Kubernetes Pods exceeds either the average capacity of the cluster without one node or 80% of the total CPU capacity of all nodes, whichever is higher. This commonly occurs in a cluster with too many resources deployed.
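
The threshold described above can be illustrated with a short calculation. The following is a minimal sketch, not the actual alert expression: the helper name overcommit_threshold is hypothetical, and it assumes that "the average capacity of the cluster without one node" means the total capacity scaled by (N - 1) / N, where N is the number of nodes.

```shell
# Hypothetical helper, not part of any product tooling.
# Assumption: capacity "without one node" = total * (nodes - 1) / nodes.
# Arguments: number of nodes, total capacity (for example, CPU cores).
overcommit_threshold() {
  nodes="$1"
  total="$2"
  awk -v n="$nodes" -v t="$total" 'BEGIN {
    without_one = t * (n - 1) / n   # capacity left if one average node is lost
    eighty = t * 0.8                # 80% of the total capacity
    # The higher of the two values is the level that requests must exceed.
    printf "%.2f\n", (without_one > eighty ? without_one : eighty)
  }'
}
```

Under this assumption, a cluster of 3 nodes with 12 cores in total yields 9.60 (the 80% value wins), while a cluster of 10 nodes with 40 cores yields 36.00 (the value without one node wins). The same arithmetic applies to RAM if the total is given in memory units.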

Investigation

Select one of the following options to verify the CPU requests on the nodes:

  • Inspect the Allocated resources section in the output of the following command:

    kubectl describe nodes
    
  • Inspect the Cluster CPU Capacity panel of the Kubernetes Cluster Grafana dashboard.
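
On a cluster with many nodes, the output of the command above can be long. The following is a minimal sketch of a filter that keeps only the CPU requests line of each node. The function name cpu_requests_per_node is hypothetical, and the sketch assumes the usual layout of the kubectl describe nodes output (a Name: line per node and a cpu row under Allocated resources:), which may differ between kubectl versions.

```shell
# Hypothetical helper: reads `kubectl describe nodes` output on stdin and
# prints one line per node: node name, CPU requests, and the percentage
# of allocatable CPU that the requests represent.
cpu_requests_per_node() {
  awk '
    /^Name:/ { node = $2 }                     # remember the current node
    /^Allocated resources:/ { sect = 1; next } # enter the section of interest
    sect && $1 == "cpu" {                      # the cpu row of that section
      print node, $2, $3
      sect = 0                                 # leave until the next node
    }
  '
}
```

For example, pipe the live output into the filter:

    kubectl describe nodes | cpu_requests_per_node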

Mitigation

Increase the CPU capacity of the nodes or add worker nodes.

KubeMemOvercommitPods

Root cause

The sum of the RAM requests of Kubernetes Pods exceeds either the average capacity of the cluster without one node or 80% of the total RAM capacity of all nodes, whichever is higher. This commonly occurs in a cluster with too many resources deployed.

Investigation

Select one of the following options to verify the RAM requests on the nodes:

  • Inspect the Allocated resources section in the output of the following command:

    kubectl describe nodes
    
  • Inspect the Cluster Mem Capacity panel of the Kubernetes Cluster Grafana dashboard.
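
To spot the nodes that carry most of the memory requests, the output of the command above can be filtered by percentage. The following is a minimal sketch: the function name mem_requests_over is hypothetical, the default of 80 is only an illustrative per-node cutoff and is not the alert threshold, and the sketch assumes the usual layout of the kubectl describe nodes output, which may differ between kubectl versions.

```shell
# Hypothetical helper: reads `kubectl describe nodes` output on stdin and
# prints the nodes whose memory requests exceed the given percentage of
# allocatable memory. The optional argument is the cutoff (default: 80).
mem_requests_over() {
  limit="${1:-80}"
  awk -v limit="$limit" '
    /^Name:/ { node = $2 }                     # remember the current node
    /^Allocated resources:/ { sect = 1; next } # enter the section of interest
    /^Events:/ { sect = 0 }                    # the section has ended
    sect && $1 == "memory" {
      pct = $3
      gsub(/[()%]/, "", pct)                   # "(91%)" becomes "91"
      if (pct + 0 > limit + 0) print node, $2, pct "%"
    }
  '
}
```

For example, to list the nodes with memory requests above 90%:

    kubectl describe nodes | mem_requests_over 90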

Mitigation

Increase the RAM capacity of the nodes or add worker nodes.