NTP

This section describes the alerts for the NTP service.

NtpOffsetTooHigh

Severity	Warning
Summary	The NTP offset on the `{{ $labels.host }}` node is more than 200 milliseconds for 2 minutes.
Raise condition	`ntpq_offset >= 200`
Description	Raises when the NTP offset on a node stays at or above the threshold of 200 milliseconds for 2 minutes. This typically indicates that the host fails to synchronize its time with the NTP server or that the NTP server is malfunctioning. An excessive offset affects metrics collection and queries against the time series database. The `host` label in the raised alert contains the host name of the affected node.
Troubleshooting	Synchronize the time with a properly operating NTP server:
1. Enter the NTP CLI by running `ntpq` on the affected node.
2. List the NTP peers by running `peers` and exit the NTP CLI.
3. Check the offset against a chosen peer by running `ntpdate -q <peer_from_list>`. The `-q` option only queries the server and does not change the clock. To set the date and time, run `ntpdate <peer_from_list>` without `-q`; note that `ntpdate` declines to set the clock while an NTP daemon is running on the same host.
If the issue persists:
1. Enter the NTP CLI by running `ntpq` on the affected node.
2. List the associations by running `as`.
3. Investigate the reason for the server rejection by running `rv <association_id>` with a chosen association ID.
4. Inspect the output for the `flash` code and for the `rootdispersion`, `dispersion`, and `jitter` values. Avoid syncing with servers that have a large dispersion.
Tuning	For example, to change the threshold of the NTP offset to `500`:
1. On the cluster level of the Reclass model, create a common file for all alert customizations. Skip this step to use an existing defined file.
   a. Create a file for alert customizations:
          touch cluster/<cluster_name>/stacklight/custom/alerts.yml
   b. Define the new file in `cluster/<cluster_name>/stacklight/server.yml`:
          classes:
          - cluster.<cluster_name>.stacklight.custom.alerts
          ...
2. In the defined alert customizations file, modify the alert threshold by overriding the `if` parameter:
       parameters:
         prometheus:
           server:
             alert:
               NtpOffsetTooHigh:
                 if: >-
                   ntpq_offset >= 500
3. From the Salt Master node, apply the changes:
       salt 'I@prometheus:server' state.sls prometheus.server
4. Verify the updated alert definition in the Prometheus web UI.
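
The file-creation part of the tuning procedure can be scripted. The following is a minimal sketch, not part of the product tooling: the function name `write_ntp_alert_override` and its three arguments (Reclass model root, cluster name, threshold in milliseconds) are hypothetical. It assumes that no alert customizations file exists yet, because it overwrites `alerts.yml`. Adding the class to `server.yml` and applying the Salt state remain manual steps.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and its arguments are hypothetical.
# Writes the NtpOffsetTooHigh override into the alert customizations file.
# WARNING: overwrites an existing alerts.yml; merge by hand if you already have one.
set -euo pipefail

write_ntp_alert_override() {
  local model_root="$1"    # path to the Reclass model (assumed to contain cluster/)
  local cluster_name="$2"  # value to use for <cluster_name>
  local threshold_ms="$3"  # new offset threshold in milliseconds

  local dir="${model_root}/cluster/${cluster_name}/stacklight/custom"
  mkdir -p "${dir}"

  # Same YAML structure as in the tuning steps, with the threshold substituted.
  cat > "${dir}/alerts.yml" <<EOF
parameters:
  prometheus:
    server:
      alert:
        NtpOffsetTooHigh:
          if: >-
            ntpq_offset >= ${threshold_ms}
EOF

  # Print the path of the written file so the caller can inspect it.
  echo "${dir}/alerts.yml"
}
```

For example, `write_ntp_alert_override "$MODEL_ROOT" "$CLUSTER_NAME" 500` writes the override and prints the path of the file, where both variables are placeholders that you set for your environment.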

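When choosing a peer during troubleshooting, it can help to filter the peer list by offset instead of reading it by eye. The following is a minimal sketch under stated assumptions: the function name `peers_over_offset` is hypothetical, and the input is assumed to be the standard `ntpq -pn` peer table, where the first two lines are a header and a separator and the ninth column is the offset in milliseconds.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical.
# Reads `ntpq -pn` output on stdin and prints "<peer> <offset_ms>" for each
# peer whose absolute offset is at or above the given threshold (milliseconds).
# The peer field is printed as ntpq shows it, including its tally code, if any.
set -euo pipefail

peers_over_offset() {
  local threshold_ms="$1"
  awk -v limit="${threshold_ms}" '
    NR <= 2 { next }            # skip the header line and the ===== separator
    NF >= 10 {
      off = $9 + 0              # column 9 is the offset in milliseconds
      if (off < 0) off = -off   # compare the absolute value
      if (off >= limit) print $1, $9
    }
  '
}
```

A typical invocation on the affected node is `ntpq -pn | peers_over_offset 200`, which mirrors the default threshold of the alert. Empty output means that no listed peer is at or beyond the threshold.
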
updated: 2025-01-10 08:56
