PostgreSQL and Patroni

This section lists the alerts for the PostgreSQL and Patroni services.


PostgresqlDataPageCorruption

Severity

Critical

Summary

Patroni cluster member is experiencing data page corruption.

Description

The {{ $labels.namespace }}/{{ $labels.pod }} Patroni Pod in the {{ $labels.cluster }} cluster fails to calculate the data page checksum due to a possible hardware fault.

PostgresqlDeadlocksDetected

Severity

Warning

Summary

PostgreSQL transaction deadlocks.

Description

The transactions submitted to the {{ $labels.datname }} database in the {{ $labels.cluster }} Patroni cluster in the {{ $labels.namespace }} namespace are experiencing deadlocks.

PostgresqlInsufficientWorkingMemory

Severity

Warning

Summary

Insufficient memory for PostgreSQL queries.

Description

The query data does not fit into the working memory of the {{ $labels.pod }} Pod in the {{ $labels.cluster }} Patroni cluster in the {{ $labels.namespace }} namespace.

PostgresqlPatroniClusterSplitBrain

Severity

Critical

Summary

Patroni cluster split-brain detected.

Description

The {{ $labels.cluster }} Patroni cluster in the {{ $labels.namespace }} namespace has multiple primaries, which indicates a split-brain.

PostgresqlPatroniClusterUnlocked

Severity

Major

Summary

Patroni cluster primary node is missing.

Description

The {{ $labels.cluster }} Patroni cluster in the {{ $labels.namespace }} namespace is down due to a missing primary node.

PostgresqlReplicaDown

Severity

Warning

Summary

Patroni cluster has replicas with inoperable PostgreSQL.

Description

The {{ $labels.cluster }} Patroni cluster in the {{ $labels.namespace }} namespace has {{ $value }}% of replicas with inoperable PostgreSQL.
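
The percentage reported through {{ $value }} can be pictured with a minimal sketch. The actual alert expression is not shown in this section, so the function name, the input shape, and the pod names below are hypothetical and serve only to illustrate the arithmetic.

```python
def inoperable_replica_percent(replica_states):
    """Return the percentage of replicas whose PostgreSQL is inoperable.

    replica_states is a hypothetical mapping of replica Pod name to True
    when PostgreSQL on that replica is operable, and False otherwise.
    """
    if not replica_states:
        # No replicas known: report 0 rather than divide by zero.
        return 0.0
    down = sum(1 for operable in replica_states.values() if not operable)
    return 100.0 * down / len(replica_states)


# Hypothetical two-replica cluster with one inoperable replica.
states = {"patroni-1": True, "patroni-2": False}
print(f"{inoperable_replica_percent(states):.0f}% of replicas are inoperable")
```

In this sketch, any non-zero result would correspond to a cluster that has at least one replica with inoperable PostgreSQL.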

PostgresqlReplicationNonStreamingReplicas

Severity

Warning

Summary

Patroni cluster has non-streaming replicas.

Description

The {{ $labels.cluster }} Patroni cluster in the {{ $labels.namespace }} namespace has replicas not streaming segments from the primary node.

PostgresqlReplicationPaused

Severity

Major

Summary

Replication has stopped.

Description

Replication has stopped on the {{ $labels.namespace }}/{{ $labels.pod }} replica Pod in the {{ $labels.cluster }} cluster.

PostgresqlReplicationSlowWalApplication

Severity

Warning

Summary

WAL segment application is slow.

Description

Slow replication while applying WAL segments on the {{ $labels.namespace }}/{{ $labels.pod }} replica Pod in the {{ $labels.cluster }} cluster.

PostgresqlReplicationSlowWalDownload

Severity

Warning

Summary

Streaming replication is slow.

Description

Slow replication while downloading WAL segments for the {{ $labels.namespace }}/{{ $labels.pod }} replica Pod in the {{ $labels.cluster }} cluster.

PostgresqlReplicationWalArchiveWriteFailing

Severity

Major

Summary

Patroni cluster WAL segment writes are failing.

Description

The {{ $labels.namespace }}/{{ $labels.pod }} Patroni Pod in the {{ $labels.cluster }} cluster fails to write replication segments.

PostgresqlTargetsOutage

Replaced with PostgresqlTargetDown in MCC 2.25.0 (17.0.0 and 16.0.0)

Severity

Critical

Summary

Patroni cluster Prometheus targets outage.

Description

Prometheus fails to scrape metrics from 2/3 of the {{ $labels.cluster }} Patroni cluster endpoints (more than 1/10 failed scrapes).
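
One possible reading of these thresholds is sketched below: an endpoint counts as failing when more than 1/10 of its scrapes failed, and the outage condition holds when at least 2/3 of the endpoints are failing. This interpretation, the function names, and the sample data are assumptions made for illustration. The actual alert expression is not shown in this section.

```python
from fractions import Fraction


def endpoint_is_failing(scrapes, max_failed_ratio=Fraction(1, 10)):
    """Return True when more than max_failed_ratio of the scrapes failed.

    scrapes is a hypothetical list of booleans, True for a successful scrape.
    """
    if not scrapes:
        return False
    failed = sum(1 for ok in scrapes if not ok)
    return Fraction(failed, len(scrapes)) > max_failed_ratio


def targets_outage(endpoints, min_failing_ratio=Fraction(2, 3)):
    """Return True when at least min_failing_ratio of the endpoints are failing.

    endpoints is a hypothetical mapping of endpoint name to its scrape results.
    """
    if not endpoints:
        return False
    failing = sum(1 for scrapes in endpoints.values() if endpoint_is_failing(scrapes))
    return Fraction(failing, len(endpoints)) >= min_failing_ratio


# Hypothetical three-endpoint cluster: two endpoints fail 2 of 10 scrapes.
sample = {
    "patroni-0": [True] * 10,
    "patroni-1": [False, False] + [True] * 8,
    "patroni-2": [False, False] + [True] * 8,
}
print(targets_outage(sample))
```

Exact fractions are used instead of floats so that boundary cases such as exactly 2 of 3 failing endpoints compare without rounding error.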

PostgresqlTargetDown

Available since MCC 2.25.0 (17.0.0 and 16.0.0) to replace PostgresqlTargetsOutage

Severity

Critical

Summary

Patroni cluster Prometheus target down.

Description

Prometheus fails to scrape metrics from the {{ $labels.pod }} Pod of the {{ $labels.cluster }} cluster on the {{ $labels.node }} node.