Ceph

This section describes the alerts for the Ceph cluster.


CephClusterHealthMinor

Severity

Minor

Summary

Ceph cluster health is WARNING.

Description

The Ceph cluster is in the WARNING state. For details, run ceph -s.


CephClusterHealthCritical

Severity

Critical

Summary

Ceph cluster health is CRITICAL.

Description

The Ceph cluster is in the CRITICAL state. For details, run ceph -s.
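
Both health alerts above point to ceph -s. For scripted triage, the same status can be read in JSON form. The following is a minimal sketch, assuming the ceph CLI is configured on the host and that the JSON output carries the health string under health.status. The function names are hypothetical. Ceph itself reports HEALTH_OK, HEALTH_WARN, or HEALTH_ERR; how these map to the WARNING and CRITICAL wording of the alerts is an assumption to verify against the alert rules.

```python
import json
import subprocess


def parse_health(status_json: str) -> str:
    """Return the health string (for example, HEALTH_WARN) from `ceph -s` JSON."""
    status = json.loads(status_json)
    # Assumption: the health string is stored under health.status.
    return status["health"]["status"]


def read_cluster_health() -> str:
    """Run `ceph -s --format json` and return the reported health string."""
    result = subprocess.run(
        ["ceph", "-s", "--format", "json"],
        check=True,           # raise if the ceph command fails
        capture_output=True,  # collect stdout instead of printing it
        text=True,
    )
    return parse_health(result.stdout)
```

Calling read_cluster_health() on a node with a working Ceph client returns the health string, while parse_health() can be applied to output that was saved earlier.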


CephMonQuorumAtRisk

Severity

Major

Summary

Ceph cluster quorum is at risk.

Description

The Ceph cluster quorum is low.
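
The alert text does not state the threshold it uses. As an illustration only, the sketch below computes how many more Ceph Monitors can fail before the strict majority required for quorum is lost. It assumes input from ceph quorum_status --format json with the quorum_names and monmap.mons fields; the function name is hypothetical, and this is not the alert rule itself.

```python
import json


def quorum_margin(quorum_status_json: str) -> int:
    """Return how many more Monitors can fail before quorum is lost."""
    data = json.loads(quorum_status_json)
    total = len(data["monmap"]["mons"])    # all Monitors known to the monmap
    in_quorum = len(data["quorum_names"])  # Monitors currently in quorum
    majority = total // 2 + 1              # strict majority needed for quorum
    return in_quorum - majority
```

A margin of 0 means that the loss of one more Monitor breaks quorum.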


CephOSDDownMinor

Severity

Minor

Summary

Ceph OSDs are down.

Description

{{ $value }} of Ceph OSDs in the Ceph cluster are down. For details, run ceph osd tree.
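
To list the affected OSDs without reading the tree by eye, the JSON form of ceph osd tree can be filtered. This is a minimal sketch, assuming the output of ceph osd tree --format json contains a nodes list whose OSD entries carry type and status fields; the function name is hypothetical.

```python
import json
from typing import List


def down_osds(osd_tree_json: str) -> List[str]:
    """Return the names of OSDs that `ceph osd tree` reports as down."""
    tree = json.loads(osd_tree_json)
    return [
        node["name"]
        for node in tree["nodes"]
        # Skip buckets such as root and host; keep only OSD entries.
        if node.get("type") == "osd" and node.get("status") == "down"
    ]
```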


CephOSDDiskNotResponding

Severity

Critical

Summary

Disk is not responding.

Description

The {{ $labels.device }} disk device is not responding on the {{ $labels.host }} host.


CephOSDDiskUnavailable

Severity

Critical

Summary

Disk is not accessible.

Description

The {{ $labels.device }} disk device is not accessible on the {{ $labels.host }} host.


CephClusterFullWarning

Severity

Warning

Summary

Ceph cluster is nearly full.

Description

The Ceph cluster utilization has crossed 85%. Expansion is required.


CephClusterFullCritical

Severity

Critical

Summary

Ceph cluster is full.

Description

The Ceph cluster utilization has crossed 95%. Immediate expansion is required.
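
The two capacity alerts differ only in their threshold. The sketch below shows how the 85% and 95% levels partition cluster utilization. It is an illustration, not the alert rule: the function name is hypothetical, and the strict comparison as well as reporting only the most severe alert are assumptions.

```python
from typing import Optional


def fullness_alert(used_bytes: int, total_bytes: int) -> Optional[str]:
    """Return the name of the capacity alert that applies, or None."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Integer arithmetic avoids floating-point rounding at the thresholds.
    if used_bytes * 100 > 95 * total_bytes:
        return "CephClusterFullCritical"
    if used_bytes * 100 > 85 * total_bytes:
        return "CephClusterFullWarning"
    return None
```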


CephOSDPgNumTooHighWarning

Severity

Warning

Summary

Some Ceph OSDs have more than 200 PGs.

Description

Some Ceph OSDs contain more than 200 Placement Groups. This may have a negative impact on the cluster performance. For details, run ceph pg dump.


CephOSDPgNumTooHighCritical

Severity

Critical

Summary

Some Ceph OSDs have more than 300 PGs.

Description

Some Ceph OSDs contain more than 300 Placement Groups. This may have a negative impact on the cluster performance. For details, run ceph pg dump.
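
To find which OSDs exceed the 200 or 300 PG limits, a per-OSD PG count is needed. The sketch below is a minimal example that reads ceph osd df --format json rather than the ceph pg dump command named above, because that output is assumed to list a pgs count for each OSD under nodes. Verify the field names against your Ceph release; the function name is hypothetical.

```python
import json
from typing import Dict


def osds_over_pg_limit(osd_df_json: str, limit: int) -> Dict[str, int]:
    """Map OSD name to PG count for every OSD holding more than `limit` PGs."""
    data = json.loads(osd_df_json)
    return {
        node["name"]: node["pgs"]
        for node in data["nodes"]
        if node["pgs"] > limit  # 200 for the warning, 300 for the critical alert
    }
```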


CephMonHighNumberOfLeaderChanges

Severity

Warning

Summary

Too many leader changes occur in the Ceph cluster.

Description

{{ $value }} leader changes per minute occur for the {{ $labels.instance }} instance of the {{ $labels.job }} Ceph Monitor.


CephNodeDown

Severity

Critical

Summary

Ceph node {{ $labels.node }} went down.

Description

The {{ $labels.node }} Ceph node is down and requires immediate verification.


CephOSDVersionMismatch

Severity

Warning

Summary

Multiple versions of Ceph OSDs are running.

Description

{{ $value }} different versions of Ceph OSD components are running.


CephMonVersionMismatch

Severity

Warning

Summary

Multiple versions of Ceph Monitors are running.

Description

{{ $value }} different versions of Ceph Monitor components are running.


CephPGInconsistent

Severity

Minor

Summary

Too many inconsistent Ceph PGs.

Description

The Ceph cluster detects inconsistencies in one or more replicas of an object in {{ $value }} Placement Groups.


CephPGUndersized

Severity

Minor

Summary

Too many undersized Ceph PGs.

Description

The Ceph cluster reports that {{ $value }} Placement Groups have fewer copies than the configured pool replication level.
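
The PG counts behind the CephPGInconsistent and CephPGUndersized alerts can be cross-checked against the cluster status. This is a minimal sketch, assuming the JSON output of ceph -s --format json lists PG states under pgmap.pgs_by_state as entries with state_name and count fields; the function name is hypothetical.

```python
import json


def count_pgs_in_state(status_json: str, state: str) -> int:
    """Sum the PGs whose combined state includes `state`, e.g. "undersized"."""
    status = json.loads(status_json)
    total = 0
    for entry in status["pgmap"].get("pgs_by_state", []):
        # A state name joins several states, e.g. "active+undersized+degraded".
        if state in entry["state_name"].split("+"):
            total += entry["count"]
    return total
```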