NTP
This section describes the alerts for the NTP service.
NtpOffsetTooHigh
Severity |
Warning |
Summary |
The NTP offset on the {{ $labels.host }} node is more than 200
milliseconds for 2 minutes. |
Raise condition |
ntpq_offset >= 200 |
Description |
Raises when the NTP offset on a node stays at or above the threshold of 200
milliseconds for 2 minutes, typically indicating that the host fails to
synchronize its time with the NTP server or that the NTP server is
malfunctioning. An excessively high offset affects metrics collection and
queries to the time series database. The host label in the raised
alert contains the host name of the affected node. |
Troubleshooting |
Synchronize the time with a properly operating NTP server:
- Enter the NTP CLI by running ntpq on the affected node.
- List the NTP peers by running peers and exit the NTP CLI.
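- Check the time against a peer from the list by running
ntpdate -q <peer_from_list>. The -q option only queries the server and
does not set the clock. To set the date and time, run
ntpdate <peer_from_list> without -q.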
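The peer check above can also be done non-interactively. The following is a minimal sketch, not part of the product tooling: the function name high_offset_peers is hypothetical, and it assumes the usual ten-column layout of the ntpq -pn output, where the ninth column is the offset in milliseconds. It mirrors the signed comparison of the raise condition, so negative offsets are not reported.

```shell
#!/bin/sh
# Hypothetical helper: read `ntpq -pn` output on stdin and print
# "<peer> <offset_ms>" for every peer whose offset is at or above
# the threshold in milliseconds (default: 200, as in the alert).
high_offset_peers() {
  awk -v limit="${1:-200}" '
    NR <= 2 { next }                    # skip the two header lines
    NF >= 10 {
      peer = $1
      sub(/^[*+#.xo-]/, "", peer)       # drop the tally character, if any
      if ($9 + 0 >= limit + 0) print peer, $9
    }
  '
}
```

On the affected node, this could be used as ntpq -pn | high_offset_peers 200. The -n option keeps the peers as numeric addresses, which the tally-character stripping relies on.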
If the issue persists:
- Enter the NTP CLI by running ntpq on the affected node.
- List the associations by running as.
- Investigate the reason for the server rejection by running
rv <association_id> with a chosen association ID.
- Inspect the flash code and the rootdispersion, dispersion, and jitter
values in the output. Avoid syncing with servers that have a large
dispersion.
|
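The association check above can be sketched as a small filter. This is an illustration only: the function name rejected_assoc_ids is hypothetical, and it assumes that the output of ntpq -c associations has two header lines, the association ID in the second column, and the condition in the seventh column.

```shell
#!/bin/sh
# Hypothetical helper: read `ntpq -c associations` output on stdin and
# print the association ID of every peer whose condition is "reject".
rejected_assoc_ids() {
  # NR > 2 skips the header and separator lines;
  # $2 is the assid column, $7 is the condition column (assumed layout).
  awk 'NR > 2 && $7 == "reject" { print $2 }'
}
```

Each printed ID can then be passed to ntpq -c "rv <association_id>" to read the variables of the rejected peer, which is the non-interactive equivalent of the rv step.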
Tuning |
For example, to change the threshold of the NTP offset to 500 milliseconds:
On the cluster level of the Reclass model, create a common file for
all alert customizations. Skip this step if such a file is already
defined.
Create a file for alert customizations:
touch cluster/<cluster_name>/stacklight/custom/alerts.yml
Define the new file in cluster/<cluster_name>/stacklight/server.yml:
classes:
- cluster.<cluster_name>.stacklight.custom.alerts
...
In the defined alert customizations file, modify the alert threshold
by overriding the if parameter:
parameters:
  prometheus:
    server:
      alert:
        NtpOffsetTooHigh:
          if: >-
            ntpq_offset >= 500
From the Salt Master node, apply the changes:
salt 'I@prometheus:server' state.sls prometheus.server
Verify the updated alert definition in the Prometheus web UI.
|
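Before applying the changes from the Salt Master node, the override can be sanity-checked locally. The sketch below is not part of the product tooling: the function name alert_if_expr is hypothetical, and it assumes the layout shown in the tuning example, where the alert name is followed by an if: >- line and the expression sits on the next line.

```shell
#!/bin/sh
# Hypothetical helper: read an alert customizations file on stdin and
# print the overridden "if" expression of the alert named in $1.
# Assumes "<AlertName>:" followed by "if: >-" and the expression on the
# following line, as in the tuning example.
alert_if_expr() {
  awk -v name="$1" '
    $1 == (name ":")        { in_alert = 1; next }   # found the alert key
    in_alert && $1 == "if:" { take_next = 1; next }  # expression is on the next line
    take_next               { sub(/^[ \t]+/, ""); print; exit }
  '
}
```

For the example above, running alert_if_expr NtpOffsetTooHigh with the customizations file on standard input should print ntpq_offset >= 500. An empty result suggests that the override is missing or formatted differently.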