cAdvisorTargetsOutage
|
cAdvisorTargetDown
|
CalicoTargetsOutage
|
CalicoTargetDown
|
CephClusterFullCritical
|
CephClusterFullWarning
|
CephClusterHealthCritical
|
CephClusterHealthWarning
|
CephOSDDiskNotResponding
|
CephOSDDown with the same rook_cluster label
Before MCC 2.25.0 (17.0.0 and 16.0.0)
|
CephOSDDiskUnavailable
|
CephOSDDown with the same rook_cluster label
Before MCC 2.25.0 (17.0.0 and 16.0.0)
|
CephOSDNodeDown Since MCC 2.25.0 (17.0.0 and 16.0.0)
|
With the same node label:
CephOSDDiskNotResponding
CephOSDDiskUnavailable
|
CephOSDPgNumTooHighCritical
|
CephOSDPgNumTooHighWarning
|
DockerSwarmServiceReplicasFlapping
|
DockerSwarmServiceReplicasDown with the same service_id,
service_mode, and service_name labels
|
DockerSwarmServiceReplicasOutage
|
DockerSwarmServiceReplicasDown with the same service_id,
service_mode, and service_name labels
|
etcdDbSizeCritical
|
etcdDbSizeMajor with the same job and instance labels
|
etcdHighNumberOfFailedGRPCRequestsCritical
|
etcdHighNumberOfFailedGRPCRequestsWarning with the same
grpc_method, grpc_service, job, and instance labels
|
ExternalEndpointDown
|
ExternalEndpointTCPFailure with the same instance and job
labels
|
FileDescriptorUsageMajor
|
FileDescriptorUsageWarning with the same node label
|
FluentdTargetsOutage
|
FluentdTargetDown
|
KubeAPICertExpirationHigh
|
KubeAPICertExpirationMedium
|
KubeAPIErrorsHighMajor
|
KubeAPIErrorsHighWarning with the same instance label
|
KubeAPIOutage
|
KubeAPIDown
|
KubeAPIResourceErrorsHighMajor
|
KubeAPIResourceErrorsHighWarning with the same instance,
resource, and subresource labels
|
KubeClientCertificateExpirationInOneDay Removed in MCC 2.28.0 (17.3.0 and 16.3.0)
|
KubeClientCertificateExpirationInSevenDays with the same
instance label
|
KubeDaemonSetOutage
|
CalicoTargetsOutage
KubeDaemonSetRolloutStuck with the same daemonset and
namespace labels
FluentdTargetsOutage
NodeExporterTargetsOutage
TelegrafSMARTTargetsOutage
|
KubeDeploymentOutage
|
KubeDeploymentReplicasMismatch with the same deployment and
namespace labels
GrafanaTargetDown
KubeDNSTargetsOutage Removed in MCC 2.25.0 (17.0.0 and 16.0.0)
KubernetesMasterAPITargetsOutage
KubeStateMetricsTargetDown
PrometheusEsExporterTargetDown
PrometheusMsTeamsTargetDown
PrometheusRelayTargetDown
ServiceNowWebhookReceiverTargetDown
SfNotifierTargetDown
TelegrafDockerSwarmTargetDown
TelegrafOpenstackTargetDown
|
KubeJobFailed
|
KubePodsNotReady for created_by_kind=Job and with the same
created_by_name label (removed in Container Cloud 2.25.0, Cluster releases 17.0.0 and 16.0.0)
|
KubeletTargetsOutage
|
KubeletTargetDown
|
KubePersistentVolumeUsageCritical
|
With the same namespace and persistentvolumeclaim labels:
|
KubePodsCrashLooping
|
KubePodsRegularLongTermRestarts with the same created_by_name,
created_by_kind, and namespace labels
|
KubeStatefulSetOutage
|
Alerts with the same namespace and statefulset labels:
AlertmanagerTargetDown Since MCC 2.25.0 (17.0.0 and 16.0.0)
AlertmanagerClusterTargetDown Before MCC 2.25.0 (17.0.0 and 16.0.0)
ElasticsearchExporterTargetDown
FluentdTargetsOutage
OpenSearchClusterStatusCritical
PostgresqlReplicaDown
PostgresqlTargetDown Since MCC 2.25.0 (17.0.0 and 16.0.0)
PostgresqlTargetsOutage Before MCC 2.25.0 (17.0.0 and 16.0.0)
PrometheusEsExporterTargetDown
PrometheusServerTargetDown Since MCC 2.25.0 (17.0.0 and 16.0.0)
PrometheusServerTargetsOutage Before MCC 2.25.0 (17.0.0 and 16.0.0)
|
MCCLicenseExpirationHigh
|
MCCLicenseExpirationMedium
|
MCCSSLCertExpirationHigh
|
MCCSSLCertExpirationMedium with the same namespace and
service_name labels
|
MCCSSLProbesServiceTargetOutage
|
MCCSSLProbesEndpointTargetOutage with the same namespace and
service_name labels
|
MKEAPICertExpirationHigh
|
MKEAPICertExpirationMedium
|
MKEAPIOutage
|
MKEAPIDown
|
MKEMetricsEngineTargetsOutage
|
MKEMetricsEngineTargetDown
|
MKENodeDiskFullCritical
|
MKENodeDiskFullWarning with the same node label
|
NodeDown
|
KubeDaemonSetMisScheduled for the following DaemonSets
(removed in Container Cloud 2.27.0, Cluster releases 17.2.0 and 16.2.0):
KubeDaemonSetRolloutStuck for the calico-node and
ucp-nvidia-device-plugin DaemonSets
For resource=nodes:
Alerts with the same node label:
Since MCC 2.25.0 (Cluster releases 17.0.0 and 16.0.0):
AlertmanagerTargetDown
CephClusterTargetDown
etcdTargetDown
GrafanaTargetDown
HelmControllerTargetDown
KubeAPIDown
MCCCacheTargetDown
MCCControllerTargetDown
MCCProviderTargetDown
MKEAPIDown
PostgresqlTargetDown
PrometheusMsTeamsTargetDown
PrometheusRelayTargetDown
PrometheusServerTargetDown
ServiceNowWebhookReceiverTargetDown
SfNotifierTargetDown
TelegrafDockerSwarmTargetDown
TelemeterClientTargetDown
TelemeterServerFederationTargetDown
TelemeterServerTargetDown
|
NodeExporterTargetsOutage
|
NodeExporterTargetDown
|
OpenSearchClusterStatusCritical
|
OpenSearchClusterStatusWarning and
OpenSearchNumberOfUnassignedShards (removed in Container Cloud 2.27.0,
Cluster releases 17.2.0 and 16.2.0) with the same cluster label
For created_by_name=~"elasticsearch-curator-.":
|
OpenSearchClusterStatusWarning
Since MCC 2.26.0 (17.1.0 and 16.1.0)
|
|
OpenSearchHeapUsageCritical
|
OpenSearchHeapUsageWarning with the same cluster and name
labels
|
OpenSearchStorageUsageCritical
Since MCC 2.26.0 (17.1.0 and 16.1.0)
|
KubePersistentVolumeFullInFourDays and OpenSearchStorageUsageMajor
with the same namespace and persistentvolumeclaim labels
|
OpenSearchStorageUsageMajor
Since MCC 2.26.0 (17.1.0 and 16.1.0)
|
KubePersistentVolumeFullInFourDays with the same namespace
and persistentvolumeclaim labels
|
PostgresqlPatroniClusterUnlocked
|
With the same cluster and namespace labels:
|
PostgresqlReplicaDown
|
|
PrometheusErrorSendingAlertsMajor
|
PrometheusErrorSendingAlertsWarning with the same alertmanager
and pod labels
|
SystemDiskFullMajor
|
SystemDiskFullWarning with the same device, mountpoint, and
node labels
|
SystemDiskInodesFullMajor
|
SystemDiskInodesFullWarning with the same device,
mountpoint, and node labels
|
SystemLoadTooHighCritical
|
SystemLoadTooHighWarning with the same node label
|
SystemMemoryFullMajor
|
SystemMemoryFullWarning with the same node label
|
SSLCertExpirationHigh
|
SSLCertExpirationMedium with the same instance label
|
TelegrafSMARTTargetsOutage
|
TelegrafSMARTTargetDown
|
TelemeterServerTargetDown
|
TelemeterServerFederationTargetDown
|
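
The "with the same ... labels" conditions in the table above can be read as label-equality checks between the inhibiting alert and the inhibited alert. The following is a minimal Python sketch of that matching logic, not the actual Alertmanager or StackLight implementation. The `InhibitRule` class, the `is_inhibited` function, and the label values are hypothetical names chosen for illustration; only the alert names and label keys come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InhibitRule:
    source: str        # alert that, while firing, suppresses the target
    target: str        # alert that gets suppressed
    equal: tuple = ()  # labels whose values must match on both alerts


def is_inhibited(alert, firing, rules):
    """Return True if `alert` is suppressed by any currently firing source alert."""
    for rule in rules:
        if alert["alertname"] != rule.target:
            continue
        for other in firing:
            if other["alertname"] != rule.source:
                continue
            # Every label listed in `equal` must carry the same value on both alerts.
            if all(alert.get(label) == other.get(label) for label in rule.equal):
                return True
    return False


# Three rows from the table, expressed as rules (illustrative subset).
RULES = [
    InhibitRule("etcdDbSizeCritical", "etcdDbSizeMajor", ("job", "instance")),
    InhibitRule("SystemLoadTooHighCritical", "SystemLoadTooHighWarning", ("node",)),
    InhibitRule("KubeletTargetsOutage", "KubeletTargetDown"),
]
```

Under these assumptions, a firing etcdDbSizeCritical suppresses etcdDbSizeMajor only when both alerts share the same job and instance values. A rule with no label condition, such as KubeletTargetsOutage over KubeletTargetDown, suppresses the target regardless of its labels.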