Configure Graceful Node Shutdown with kubelet node profiles
Available since MKE 3.7.12
To configure Graceful Node Shutdown grace periods in an MKE cluster, set the
following flags in the [cluster_config.custom_kubelet_flags_profiles]
section of the MKE configuration file:
--shutdown-grace-period=0s
--shutdown-grace-period-critical-pods=0s
The GracefulNodeShutdown feature gate is enabled by default, with both
shutdown grace period parameters set to 0s.
When you add your custom kubelet profiles, insert and set the
GracefulNodeShutdown flags in the MKE configuration file. For example:
[cluster_config.custom_kubelet_flags_profiles]
manager = "--shutdown-grace-period=30s --shutdown-grace-period-critical-pods=20s"
worker = "--shutdown-grace-period=60s --shutdown-grace-period-critical-pods=50s"
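In upstream Kubernetes, the critical-pods period is carved out of the total grace period, so regular pods get the difference between the two values. The following is a minimal sketch of that arithmetic. The helper name regular_pod_window is hypothetical, and the sketch assumes both values are whole seconds with an optional trailing s.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the seconds left for regular pods, given
# --shutdown-grace-period and --shutdown-grace-period-critical-pods.
# Assumption: both values are whole seconds, e.g. "30s" or "30".
regular_pod_window() {
  local total="${1%s}" critical="${2%s}"
  if [ "$critical" -gt "$total" ]; then
    echo "critical-pods period exceeds the total grace period" >&2
    return 1
  fi
  echo $(( total - critical ))
}
```

With the example profiles above, regular_pod_window 30s 20s and regular_pod_window 60s 50s both print 10, which means regular pods on manager and worker nodes each get 10 seconds to terminate before critical pods are stopped.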
From a labeled node with GracefulNodeShutdown enabled, verify that the
inhibitor lock is taken by the kubelet:
systemd-inhibit --list
Who: kubelet (UID 0/root, PID 337097/kubelet)
What: shutdown
Why: Kubelet needs time to handle node shutdown
Mode: delay
1 inhibitors listed.
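The check above can be scripted. The following is a minimal sketch that reads systemd-inhibit --list output on standard input and succeeds only when a kubelet entry holds a shutdown lock. The function name kubelet_holds_shutdown_lock is hypothetical, and the parser assumes the Who:/What: block layout shown above; other systemd versions may print a different layout, in which case the parsing would need to be adapted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 if the inhibitor list on stdin contains a
# kubelet entry whose "What:" field includes "shutdown".
# Assumption: the input uses the "Who: / What: / Why: / Mode:" block layout.
kubelet_holds_shutdown_lock() {
  awk '
    $1 == "Who:"  { who = ($2 == "kubelet") }        # start of a new entry
    $1 == "What:" && who && /shutdown/ { found = 1 } # lock type for that entry
    END { exit !found }
  '
}

# Example usage on a node (not run here):
#   systemd-inhibit --list | kubelet_holds_shutdown_lock && echo "lock held"
```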
Troubleshooting
The following issues can occur when you use the Graceful Node Shutdown feature.
Missing kubelet inhibitors and ucp-kubelet errors
A Graceful Node Shutdown configuration of --shutdown-grace-period=60s
--shutdown-grace-period-critical-pods=50s can result in the following error
message:
Failed to start node shutdown manager" err="node shutdown manager was unable
to update logind InhibitDelayMaxSec to 60s (ShutdownGracePeriod), current
value of InhibitDelayMaxSec (30s) is less than requested ShutdownGracePeriod
The error message indicates missing kubelet inhibitors and ucp-kubelet errors,
which occur because the current InhibitDelayMaxSec setting in the operating
system (30s by default) is lower than the requested shutdown grace period.
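To see the limit that logind currently enforces, one option is to read the InhibitDelayMaxUSec property of the org.freedesktop.login1.Manager D-Bus interface, which reports the value in microseconds. The following is a minimal sketch of that check. The function names inhibit_delay_seconds and grace_period_fits are hypothetical, and the sketch assumes that busctl is available on the node and prints the property in the form t 30000000.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert busctl output such as "t 30000000"
# (microseconds) on stdin to whole seconds.
inhibit_delay_seconds() {
  awk '$1 == "t" { print int($2 / 1000000) }'
}

# Hypothetical helper: succeed when the requested grace period (seconds)
# does not exceed the logind limit (seconds).
grace_period_fits() {
  [ "$1" -le "$2" ]
}

# Example usage on a node (not run here):
#   limit="$(busctl get-property org.freedesktop.login1 /org/freedesktop/login1 \
#            org.freedesktop.login1.Manager InhibitDelayMaxUSec | inhibit_delay_seconds)"
#   grace_period_fits 60 "$limit" || echo "InhibitDelayMaxSec must be at least 60s"
```

With a limit of 30 seconds, grace_period_fits 60 30 fails, which mirrors the condition reported in the error message.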
You can resolve the issue either by changing the InhibitDelayMaxSec
parameter setting to a larger value or by removing it.
The configuration file that contains the InhibitDelayMaxSec parameter setting
can be in any of the following locations:
/etc/systemd/logind.conf
/etc/systemd/logind.conf.d/*.conf
/run/systemd/logind.conf.d/*.conf
/usr/lib/systemd/logind.conf.d/*.conf
/usr/lib/systemd/logind.conf.d/unattended-upgrades-logind-maxdelay.conf
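Because the setting can come from several files, it can help to search all of these locations at once. The following is a minimal sketch, assuming a grep that supports recursive search (-r), as GNU grep does. The function name find_inhibit_delay_settings is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every uncommented InhibitDelayMaxSec assignment,
# prefixed with the file that contains it.
# Arguments: files or directories to search. With no arguments, the
# locations listed above are searched.
find_inhibit_delay_settings() {
  if [ "$#" -eq 0 ]; then
    set -- /etc/systemd/logind.conf \
           /etc/systemd/logind.conf.d \
           /run/systemd/logind.conf.d \
           /usr/lib/systemd/logind.conf.d
  fi
  # -s hides errors for paths that do not exist; "|| true" keeps the
  # function from failing when a path is missing or nothing matches.
  grep -rsH '^[[:space:]]*InhibitDelayMaxSec[[:space:]]*=' "$@" || true
}
```

Empty output means that no file in the searched locations sets the parameter explicitly.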
Graceful node drain does not occur and the pods are not terminated
In some operating system distributions, the systemd PrepareForShutdown
signal is not sent to dbus. As a result, graceful node drain does not occur
and the pods are not terminated.
Currently, the PrepareForShutdown signal is triggered and the Graceful Node
Shutdown feature works as intended in the following cases:
systemctl reboot
systemctl poweroff
shutdown -h
shutdown -h +0
shutdown -h +5
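To check whether a given shutdown command emits the signal on a particular distribution, one option is to watch the system bus while running the command from a second terminal. The following is a minimal sketch, assuming that dbus-monitor is installed on the node. The function names prepare_for_shutdown_rule and watch_prepare_for_shutdown are hypothetical. PrepareForShutdown is a signal on the org.freedesktop.login1.Manager interface.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the D-Bus match rule for the logind
# PrepareForShutdown signal.
prepare_for_shutdown_rule() {
  printf '%s' "type='signal',interface='org.freedesktop.login1.Manager',member='PrepareForShutdown'"
}

# Hypothetical helper: block and print each matching signal seen on the
# system bus. Typically requires root privileges.
watch_prepare_for_shutdown() {
  dbus-monitor --system "$(prepare_for_shutdown_rule)"
}

# Example usage on a node (not run here):
#   watch_prepare_for_shutdown
```

If no signal appears while the node shuts down, the kubelet is not notified and graceful node drain does not start.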