Kubernetes applications

This section lists the alerts for Kubernetes applications.

For troubleshooting guidelines, see Troubleshoot Kubernetes applications alerts.


KubePodsCrashLooping

Severity

Warning

Summary

Pod of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in crash loop.

Description

At least one Pod container of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in the {{ $labels.namespace }} namespace was restarted more than once during the last 20 minutes.
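As a rough illustration of the condition described above, the following Python sketch counts restarts from cumulative restart-counter samples and flags a container when more than one restart falls inside the 20-minute window. The function names and the sample format are assumptions made for this example; the actual alert is evaluated by Prometheus, and this sketch ignores counter resets.

```python
from typing import Sequence, Tuple

WINDOW_SECONDS = 20 * 60  # the 20-minute window named in the description


def restarts_in_window(
    samples: Sequence[Tuple[float, int]],
    now: float,
    window: float = WINDOW_SECONDS,
) -> int:
    """Return how many restarts happened inside the window.

    `samples` is a hypothetical list of (unix_timestamp, cumulative_restart_count)
    pairs sorted by time. Counter resets are not handled in this sketch.
    """
    in_window = [count for ts, count in samples if now - window <= ts <= now]
    if len(in_window) < 2:
        return 0
    return in_window[-1] - in_window[0]


def is_crash_looping(samples: Sequence[Tuple[float, int]], now: float) -> bool:
    """True when the container restarted more than once in the window."""
    return restarts_in_window(samples, now) > 1
```

For example, a container whose restart counter goes from 3 to 5 within the window would be flagged, while a single restart would not.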

KubePodsNotReady

Severity

Warning

Summary

Pods of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in non-Ready state.

Description

{{ $labels.created_by_name }} {{ $labels.created_by_kind }} in the {{ $labels.namespace }} namespace has had Pods in a non-Ready state for longer than 15 minutes.

KubePodsRegularLongTermRestarts

Available since 2.16.0

Severity

Warning

Summary

{{ $labels.created_by_name }} {{ $labels.created_by_kind }} Pod restarted regularly.

Description

The Pod of {{ $labels.created_by_name }} {{ $labels.created_by_kind }} in the {{ $labels.namespace }} namespace has a container that was restarted at least once a day during the last 2 days.
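One plausible reading of "at least once a day during the last 2 days" is that each of the two most recent 24-hour windows contains at least one restart. The Python sketch below encodes that reading. The function name and the list of restart timestamps are hypothetical, and the real alert expression may define the windows differently.

```python
from typing import Iterable

DAY_SECONDS = 24 * 60 * 60


def restarted_every_day(restart_times: Iterable[float], now: float, days: int = 2) -> bool:
    """True when every one of the last `days` 24-hour windows has a restart.

    `restart_times` is a hypothetical collection of unix timestamps at which
    a container of the Pod was restarted.
    """
    times = list(restart_times)
    for i in range(days):
        window_end = now - i * DAY_SECONDS
        window_start = window_end - DAY_SECONDS
        # Each window is half-open: (window_start, window_end]
        if not any(window_start < t <= window_end for t in times):
            return False
    return True
```

Under this reading, two restarts that both fall within the most recent day would not satisfy the condition, because the earlier day has no restart.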

KubeDeploymentGenerationMismatch

Severity

Major

Summary

Deployment {{ $labels.deployment }} generation does not match the metadata.

Description

The {{ $labels.namespace }}/{{ $labels.deployment }} Deployment generation does not match the metadata, indicating that the Deployment has failed but has not been rolled back.

KubeDeploymentReplicasMismatch

Severity

Major

Summary

Deployment {{ $labels.deployment }} has the wrong number of replicas.

Description

The {{ $labels.namespace }}/{{ $labels.deployment }} Deployment has not matched the expected number of replicas for longer than 30 minutes.
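Several alerts in this section fire only after a condition has held continuously for some time, here 30 minutes. The Python sketch below illustrates that idea for a replica mismatch: the timer starts when the desired and actual replica counts first differ and resets whenever they match again. The sample format and function names are assumptions for this example and do not reproduce the actual alert rule.

```python
from typing import Sequence, Tuple

HOLD_SECONDS = 30 * 60  # the 30-minute threshold named in the description


def mismatch_duration(samples: Sequence[Tuple[float, int, int]], now: float) -> float:
    """Return how long the mismatch has held continuously up to `now`.

    `samples` is a hypothetical list of (unix_timestamp, desired, actual)
    tuples sorted by time.
    """
    started = None
    for ts, desired, actual in samples:
        if desired != actual:
            if started is None:
                started = ts  # mismatch begins
        else:
            started = None  # counts match again, so the timer resets
    return 0.0 if started is None else now - started


def should_fire(samples: Sequence[Tuple[float, int, int]], now: float) -> bool:
    """True when the mismatch has lasted longer than the hold time."""
    return mismatch_duration(samples, now) > HOLD_SECONDS
```

A brief recovery inside the window resets the timer, so a Deployment that flaps between matching and mismatching counts would not trigger under this sketch.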

KubeDeploymentOutage

Severity

Critical

Summary

Deployment {{ $labels.deployment }} outage.

Description

The {{ $labels.namespace }}/{{ $labels.deployment }} Deployment has {{ $value }} Pod(s) down (less than specified maximum of unavailable Pods) for the last 2 minutes.

KubeStatefulSetReplicasMismatch

Severity

Major

Summary

StatefulSet {{ $labels.statefulset }} has the wrong number of ready replicas.

Description

The {{ $labels.namespace }}/{{ $labels.statefulset }} StatefulSet has not matched the expected number of ready replicas for longer than 30 minutes.

KubeStatefulSetGenerationMismatch

Severity

Critical

Summary

StatefulSet {{ $labels.statefulset }} generation does not match the metadata.

Description

The {{ $labels.namespace }}/{{ $labels.statefulset }} StatefulSet generation does not match the metadata, indicating that the StatefulSet has failed but has not been rolled back.

KubeStatefulSetOutage

Severity

Critical

Summary

StatefulSet {{ $labels.statefulset }} outage.

Description

The {{ $labels.namespace }}/{{ $labels.statefulset }} StatefulSet has had more than one replica not ready for the last 2 minutes.

KubeStatefulSetUpdateNotRolledOut

Severity

Major

Summary

StatefulSet {{ $labels.statefulset }} update has not been rolled out.

Description

The {{ $labels.namespace }}/{{ $labels.statefulset }} StatefulSet update has not been rolled out.

KubeDaemonSetRolloutStuck

Severity

Major

Summary

DaemonSet {{ $labels.daemonset }} is not ready.

Description

{{ $value }} Pods of the {{ $labels.namespace }}/{{ $labels.daemonset }} DaemonSet are scheduled but not ready.

KubeDaemonSetNotScheduled

Severity

Warning

Summary

DaemonSet {{ $labels.daemonset }} has unscheduled Pods.

Description

{{ $value }} Pods of the {{ $labels.namespace }}/{{ $labels.daemonset }} DaemonSet are not scheduled.

KubeDaemonSetMisScheduled

Severity

Warning

Summary

DaemonSet {{ $labels.daemonset }} has misscheduled Pods.

Description

{{ $value }} Pods of the {{ $labels.namespace }}/{{ $labels.daemonset }} DaemonSet are running where they are not supposed to run.

KubeDaemonSetOutage

Severity

Critical

Summary

DaemonSet {{ $labels.daemonset }} outage.

Description

All Pods of the {{ $labels.namespace }}/{{ $labels.daemonset }} DaemonSet have been scheduled but not ready for the last 2 minutes.

KubeCronJobRunning

Severity

Warning

Summary

CronJob {{ $labels.cronjob }} is stuck.

Description

The {{ $labels.namespace }}/{{ $labels.cronjob }} CronJob missed its scheduled time and has been waiting for 15 minutes to start.

KubeJobFailed

Severity

Warning

Summary

Job {{ $labels.created_by_name }} has failed.

Description

{{ $value }} Pod(s) of the {{ $labels.namespace }}/{{ $labels.created_by_name }} Job failed to complete.
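The {{ $labels.<name> }} and {{ $value }} placeholders used throughout this section are filled in from the labels and value of the firing alert. Prometheus does this with Go templates; the Python sketch below is only a simplified stand-in that shows the substitution. The render function and the example label values are hypothetical.

```python
import re
from typing import Mapping

# Matches {{ $labels.<name> }} and {{ $value }} as they appear in this section.
PLACEHOLDER = re.compile(r"\{\{\s*\$(labels\.(\w+)|value)\s*\}\}")


def render(template: str, labels: Mapping[str, str], value: object = "") -> str:
    """Substitute alert placeholders in `template` (simplified illustration)."""

    def substitute(match: "re.Match[str]") -> str:
        if match.group(1) == "value":
            return str(value)
        # Unknown labels render as an empty string in this sketch.
        return str(labels.get(match.group(2), ""))

    return PLACEHOLDER.sub(substitute, template)


# Example with made-up label values:
message = render(
    "{{ $value }} Pod(s) of the {{ $labels.namespace }}/"
    "{{ $labels.created_by_name }} Job failed to complete.",
    {"namespace": "default", "created_by_name": "backup"},
    2,
)
```

With these made-up labels, the KubeJobFailed description would read "2 Pod(s) of the default/backup Job failed to complete."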