Configure kernel parameters

MKE uses a number of kernel parameters in its deployment.

Note

The MKE parameter values are not set by MKE, but by either MCR or an upstream component.
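To check which values are actually in effect on a node, the listed values can be compared with the files under /proc/sys. The following is a minimal sketch, not an MKE tool: the EXPECTED sample, the helper names, and the simple dot-to-slash key mapping are illustrative assumptions.

```python
from pathlib import Path

# A small sample of the "MKE" values listed on this page (illustrative only).
EXPECTED = {
    "net.ipv4.ip_forward": "1",
    "net.ipv4.conf.all.forwarding": "1",
    "net.bridge.bridge-nf-call-iptables": "1",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys."""
    # Assumes that no path component (such as an interface name) contains a dot.
    return Path(root, *key.split("."))


def read_sysctl(key, root="/proc/sys"):
    """Return the current value as a string, or None if the key is absent."""
    try:
        return sysctl_path(key, root).read_text().strip()
    except OSError:
        return None


def find_mismatches(expected, reader=read_sysctl):
    """Return {key: (expected, actual)} for every key whose value differs."""
    mismatches = {}
    for key, want in expected.items():
        got = reader(key)
        # Multi-value keys are whitespace separated, so compare the split fields.
        if got is None or got.split() != want.split():
            mismatches[key] = (want, got)
    return mismatches
```

Calling find_mismatches(EXPECTED) on a node returns an empty dictionary when every sampled key matches. A key that is reported as None is absent, for example because the corresponding kernel module is not loaded.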

kernel.<subtree>

Parameter

Values

Description

panic

  • Default: Distribution dependent

  • MKE: 1

Sets the number of seconds the kernel waits to reboot following a panic.

Note

The kernel.panic parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

panic_on_oops

  • Default: Distribution dependent

  • MKE: 1

Sets whether the kernel should panic on an oops rather than continuing to attempt operations.

Note

The kernel.panic_on_oops parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

keys.root_maxkeys

  • Default: 1000000

  • MKE: 1000000

Sets the maximum number of keys that the root user (UID 0 in the root user namespace) can own.

Note

The kernel.keys.root_maxkeys parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

keys.root_maxbytes

  • Default: 25000000

  • MKE: 25000000

Sets the maximum number of bytes of data that the root user (UID 0 in the root user namespace) can hold in the payloads of the keys owned by root.

Allocate 25 bytes per key: set this value to 25 multiplied by the value of kernel.keys.root_maxkeys.

Note

The kernel.keys.root_maxbytes parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.
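The relationship between the two key quota values can be checked with simple arithmetic. This is a minimal sketch; the function name is a hypothetical helper, and the 25 bytes per key figure is the one stated above.

```python
BYTES_PER_KEY = 25  # per-key allocation stated in the root_maxbytes description


def root_maxbytes_for(root_maxkeys):
    """Return the root_maxbytes value that corresponds to a root_maxkeys value."""
    return BYTES_PER_KEY * root_maxkeys
```

With the listed root_maxkeys value of 1000000, the result is 25000000, which agrees with the listed root_maxbytes value.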

pty.nr

  • Default: Dependent on number of logins. Not user-configurable.

  • MKE: 1

Reports the number of PTYs that are currently open.

net.bridge.bridge-nf-<subtree>

Parameter

Values

Description

call-arptables

  • Default: No default

  • MKE: 1

Sets whether arptables rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

call-ip6tables

  • Default: No default

  • MKE: 1

Sets whether ip6tables rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

call-iptables

  • Default: No default

  • MKE: 1

Sets whether iptables rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

filter-pppoe-tagged

  • Default: No default

  • MKE: 0

Sets whether netfilter rules apply to bridged PPPoE network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

filter-vlan-tagged

  • Default: No default

  • MKE: 0

Sets whether netfilter rules apply to bridged VLAN network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

pass-vlan-input-dev

  • Default: No default

  • MKE: 0

Sets whether netfilter strips the incoming VLAN interface name from bridged traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

net.fan.<subtree>

Parameter

Values

Description

vxlan

  • Default: No default

  • MKE: 4

Sets the version of the VXLAN module on older kernels. This key is not present on kernel version 5.x, nor when the VXLAN module is not loaded.

net.ipv4.<subtree>

Note

  • The *.vs.* default values persist, changing only when the ipvs kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter

Values

Description

conf.all.accept_redirects

  • Default: 1

  • MKE: 0

Sets whether ICMP redirects are permitted. This key affects all interfaces.

conf.all.forwarding

  • Default: 0

  • MKE: 1

Sets whether network traffic is forwarded. This key affects all interfaces.

conf.all.route_localnet

  • Default: 0

  • MKE: 1

Sets whether 127/8 can be used for local routing. This key affects all interfaces.

conf.default.forwarding

  • Default: 0

  • MKE: 1

Sets whether network traffic is forwarded. This key affects new interfaces.

conf.lo.forwarding

  • Default: 0

  • MKE: 1

Sets forwarding for localhost traffic.

ip_forward

  • Default: 0

  • MKE: 1

Sets whether traffic forwards between interfaces. For Kubernetes to run, this parameter must be set to 1.

vs.am_droprate

  • Default: 10

  • MKE: 10

Sets the always mode drop rate used in mode 3 of the drop_rate defense.

vs.amemthresh

  • Default: 1024

  • MKE: 1024

Sets the available memory threshold, in pages, that the automatic modes of defense use. When available memory falls below the threshold, the respective defense strategy is enabled and its variable is automatically set to 2. Otherwise, the strategy is disabled and its variable is set to 1.

vs.backup_only

  • Default: 0

  • MKE: 0

Sets whether the director function is disabled while the server is in back-up mode, to avoid packet loops for DR/TUN methods.

vs.cache_bypass

  • Default: 0

  • MKE: 0

Sets whether packets forward directly to the original destination when no cache server is available and the destination address is not local (iph->daddr is RTN_UNICAST). This mostly applies to transparent web cache clusters.

vs.conn_reuse_mode

  • Default: 1

  • MKE: 1

Sets how IPVS handles connections detected on port reuse. It is a bitmap with the following values:

  • 0 disables any special handling on port reuse. The new connection is delivered to the same real server that was servicing the previous connection, effectively disabling expire_nodest_conn.

  • bit 1 enables rescheduling of new connections when it is safe. That is, whenever expire_nodest_conn is enabled and, for TCP sockets, when the connection is in TIME_WAIT state (which is only possible if you use NAT mode).

  • bit 2 is bit 1 plus, for TCP connections, rescheduling when the connection is in FIN_WAIT state, as this is the last state seen by the load balancer in Direct Routing mode. This bit helps when adding new real servers to a very busy cluster.
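The bitmap described above can be decoded as in the following sketch. It assumes that "bit 1" corresponds to the numeric value 1 and "bit 2" to the numeric value 2, and it follows the "bit 2 is bit 1 plus" wording by treating any non-zero value as enabling the bit 1 behaviour. The function name and the returned strings are illustrative only.

```python
def describe_conn_reuse_mode(value):
    """Return the port-reuse behaviours selected by a conn_reuse_mode value."""
    if value == 0:
        return ["no special handling on port reuse"]
    # Assumption: any non-zero value enables the "when it is safe" rescheduling.
    behaviours = ["reschedule new connections when it is safe"]
    if value & 2:
        # Bit 2 adds the FIN_WAIT case for TCP connections.
        behaviours.append("also reschedule TCP connections in FIN_WAIT state")
    return behaviours
```

For the listed value of 1, the sketch reports only the first rescheduling behaviour.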

vs.conntrack

  • Default: 0

  • MKE: 0

Sets whether connection-tracking entries are maintained for connections handled by IPVS. Enable this setting if connections handled by IPVS are to be subject to stateful firewall rules, that is, iptables rules that make use of connection tracking. Otherwise, disable this setting to optimize performance. Connections handled by the IPVS FTP application module have connection-tracking entries regardless of this setting. This setting is only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.

vs.drop_entry

  • Default: 0

  • MKE: 0

Sets whether entries are randomly dropped from the connection hash table, to reclaim memory for new connections. In the current code, the drop_entry procedure can be activated every second. It then randomly scans 1/32 of the whole table and drops entries that are in the SYN-RECV/SYNACK state, which should be effective against SYN flooding attacks.

The valid values of drop_entry are 0 to 3, where 0 indicates that the strategy is always disabled, 1 and 2 indicate automatic modes (when there is not enough available memory, the strategy is enabled and the variable is automatically set to 2, otherwise the strategy is disabled and the variable is set to 1), and 3 indicates that the strategy is always enabled.
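The four modes described above can be summarized as a small state function. This is a sketch of the documented behaviour only, not kernel code; the function name and the low_memory flag (true when available memory is below amemthresh) are assumptions made for illustration.

```python
def defense_state(configured, low_memory):
    """Return (strategy_enabled, resulting_value) for a 0 to 3 defense setting."""
    if configured not in (0, 1, 2, 3):
        raise ValueError("valid values are 0 to 3")
    if configured == 0:
        return (False, 0)  # always disabled
    if configured == 3:
        return (True, 3)  # always enabled
    # Automatic modes (1 and 2): the value follows the memory condition.
    return (True, 2) if low_memory else (False, 1)
```

For example, a node configured with 1 that runs short of memory ends up with the strategy enabled and the value 2, and it returns to 1 once memory is available again.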

vs.drop_packet

  • Default: 0

  • MKE: 0

Sets whether 1/rate of the packets are dropped prior to being forwarded to real servers. A rate of 1 drops all incoming packets.

The value definition is the same as that for drop_entry. In automatic mode, the following formula determines the rate when available memory is less than the available memory threshold: rate = amemthresh / (amemthresh - available_memory). When mode 3 is set, the always mode drop rate is controlled by /proc/sys/net/ipv4/vs/am_droprate.
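The automatic-mode formula above can be worked through as follows. This is a minimal sketch: both arguments are in pages, the function name is hypothetical, and the use of integer division is an assumption, since the text gives only the formula.

```python
def drop_rate(amemthresh, available_memory):
    """Return the automatic-mode rate, or None when the formula does not apply."""
    if available_memory >= amemthresh:
        return None  # enough memory is available, so no rate is calculated
    # Assumption: integer division, because the rate is used as a packet count.
    return amemthresh // (amemthresh - available_memory)
```

With the listed amemthresh of 1024 pages and 512 pages available, the rate is 2, so one packet in two is dropped. As available memory approaches 0, the rate approaches 1 and all incoming packets are dropped.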

vs.expire_nodest_conn

  • Default: 0

  • MKE: 0

Sets whether the load balancer expires a connection as soon as a packet arrives for a destination server that is not available. When this feature is disabled, the load balancer silently drops packets whose destination server is not available. This can be useful when the user-space monitoring program deletes the destination server (due to server overload or wrong detection) and later adds the server back, as the connections to the server can then continue.

If this feature is enabled, the load balancer terminates the connection immediately whenever a packet arrives and its destination server is not available, after which the client program will be notified that the connection is closed. This is equivalent to the feature that is sometimes required to flush connections when the destination is not available.

vs.ignore_tunneled

  • Default: 0

  • MKE: 0

Sets whether IPVS sets the ipvs_property on all packets of unrecognized protocols. This prevents IPVS from routing tunneled protocols such as IPIP, which is useful in preventing the rescheduling of packets that have been tunneled to the IPVS host (that is, in preventing IPVS routing loops when IPVS is also acting as a real server).

vs.nat_icmp_send

  • Default: 0

  • MKE: 0

Sets whether ICMP error messages (ICMP_DEST_UNREACH) are sent for VS/NAT when the load balancer receives packets from real servers but the connection entries do not exist.

vs.pmtu_disc

  • Default: 0

  • MKE: 0

Sets whether all DF packets that exceed the PMTU are rejected with FRAG_NEEDED, irrespective of the forwarding method. For the TUN method, the flag can be disabled to fragment such packets.

vs.schedule_icmp

  • Default: 0

  • MKE: 0

Sets whether scheduling ICMP packets in IPVS is enabled.

vs.secure_tcp

  • Default: 0

  • MKE: 0

Sets the use of a more complicated TCP state transition table. For VS/NAT, the secure_tcp defense delays entering the TCP ESTABLISHED state until the three-way handshake completes. The value definition is the same as that of drop_entry and drop_packet.

vs.sloppy_sctp

  • Default: 0

  • MKE: 0

Sets whether IPVS is permitted to create a connection state on any packet, rather than an SCTP INIT only.

vs.sloppy_tcp

  • Default: 0

  • MKE: 0

Sets whether IPVS is permitted to create a connection state on any packet, rather than a TCP SYN only.

vs.snat_reroute

  • Default: 0

  • MKE: 1

Sets whether the route of SNATed packets from real servers is recalculated so that the packets are routed as if they originate from the director. If disabled, SNATed packets are routed as if they have been forwarded by the director.

If policy routing is in effect, a packet that originates from the director may be routed differently from a packet that the director forwards.

If policy routing is not in effect, the recalculated route is always the same as the original route. In that case, disabling snat_reroute is an optimization that avoids the recalculation.

vs.sync_persist_mode

  • Default: 0

  • MKE: 0

Sets the synchronization of connections when using persistence. The possible values are defined as follows:

  • 0 means all types of connections are synchronized.

  • 1 attempts to reduce the synchronization traffic depending on the connection type. For persistent services, synchronization is skipped for normal connections and performed only for persistence templates. In this case, TCP and SCTP may require the sloppy_tcp and sloppy_sctp flags to be enabled on back-up servers. For non-persistent services this optimization is not applied, and mode 0 is assumed.

vs.sync_ports

  • Default: 1

  • MKE: 1

Sets the number of threads that the master and back-up servers can use for sync traffic. Every thread uses a single UDP port, thread 0 uses the default port 8848, and the last thread uses port 8848+sync_ports-1.
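The port assignment described above can be listed explicitly, which may help when opening firewall ports for sync traffic. This is a minimal sketch; the function and constant names are hypothetical.

```python
SYNC_BASE_PORT = 8848  # default port used by thread 0, as stated above


def sync_thread_ports(sync_ports):
    """Return the UDP port used by each sync thread, starting with thread 0."""
    if sync_ports < 1:
        raise ValueError("sync_ports must be at least 1")
    return [SYNC_BASE_PORT + thread for thread in range(sync_ports)]
```

With the listed value of 1, only port 8848 is used. With a value of 4, the threads use ports 8848 through 8851.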

vs.sync_qlen_max

  • Default: Calculated

  • MKE: Calculated

Sets a hard limit for queued sync messages that are not yet sent. It defaults to 1/32 of the memory pages, but the value represents a number of messages. The limit prevents large parts of memory from being allocated when the sending rate is lower than the queuing rate.

vs.sync_refresh_period

  • Default: 0

  • MKE: 0

Sets (in seconds) the difference in the reported connection timer that triggers new sync messages. It can be used to avoid sync messages for the specified period (or half of the connection timeout if it is lower) if the connection state has not changed since last sync.

This is useful for normal connections with high traffic, to reduce sync rate. Additionally, retry sync_retries times with period of sync_refresh_period/8.

vs.sync_retries

  • Default: 0

  • MKE: 0

Sets the number of sync retries, which are performed with a period of sync_refresh_period/8. This is useful to protect against the loss of sync messages. The range of sync_retries is 0 to 3.
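The retry timing can be illustrated with a short calculation. This sketch assumes that retries are evenly spaced at sync_refresh_period/8 seconds after the initial sync message, which is one reading of the description above; the function name is hypothetical.

```python
def retry_offsets(sync_refresh_period, sync_retries):
    """Return the assumed retry times, in seconds after the initial sync."""
    if not 0 <= sync_retries <= 3:
        raise ValueError("sync_retries must be in the range 0 to 3")
    period = sync_refresh_period / 8  # spacing between consecutive retries
    return [period * attempt for attempt in range(1, sync_retries + 1)]
```

For example, a sync_refresh_period of 80 seconds with 3 retries gives a spacing of 10 seconds. With the listed values of 0 for both parameters, no retries are scheduled.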

vs.sync_sock_size

  • Default: 0

  • MKE: 0

Sets the SNDBUF (master) or RCVBUF (slave) socket limit. The default value of 0 preserves the system defaults.

vs.sync_threshold

  • Default: 3 50

  • MKE: 3 50

Sets the synchronization threshold, which is the minimum number of incoming packets that a connection must receive before the connection is synchronized. The value is a pair of integers: sync_threshold followed by sync_period. A connection is synchronized every time the number of its incoming packets modulo sync_period equals the threshold. The range of the threshold is 0 to sync_period. When sync_period and sync_refresh_period are both 0, sync messages are sent only for state changes, or only once when the packet count matches sync_threshold.
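The modulus rule above can be expressed as a short predicate. This is a sketch of the described rule only; the function name is hypothetical, the defaults mirror the listed value of 3 50, and the handling of a zero period covers just the "only once" case from the description.

```python
def should_sync(packets, sync_threshold=3, sync_period=50):
    """Return True when a connection with this packet count is synchronized."""
    if sync_period > 0:
        # Synchronize whenever the packet count modulo the period hits the threshold.
        return packets % sync_period == sync_threshold
    # Zero period: synchronize only once, when the count first matches the threshold.
    return packets == sync_threshold
```

With the listed value of 3 50, a connection is synchronized at its 3rd, 53rd, and 103rd incoming packets, and so on every 50 packets.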

vs.sync_version

  • Default: 1

  • MKE: 1

Sets the version of the synchronization protocol to use when sending synchronization messages. The possible values are:

  • 0 selects the original synchronization protocol (version 0). This should be used when sending synchronization messages to a legacy system that only understands the original synchronization protocol.

  • 1 selects the current synchronization protocol (version 1). This should be used whenever possible.

Kernels with this sync_version entry are able to receive messages of both version 0 and version 1 of the synchronization protocol.

net.netfilter.nf_conntrack_<subtree>

Note

  • The net.netfilter.nf_conntrack_<subtree> default values persist, changing only when the nf_conntrack kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter

Values

Description

acct

  • Default: 0

  • MKE: 0

Sets whether connection-tracking flow accounting is enabled. Adds 64-bit byte and packet counter per flow.

buckets

  • Default: Calculated

  • MKE: Calculated

Sets the size of the hash table. If not specified during module loading, the default size is calculated by dividing total memory by 16384 to determine the number of buckets. The hash table will never have fewer than 1024 and never more than 262144 buckets. This sysctl is only writeable in the initial net namespace.
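The default sizing rule above can be worked through with a short calculation. This is a minimal sketch that assumes total memory is expressed in bytes; the function name is hypothetical.

```python
MIN_BUCKETS = 1024
MAX_BUCKETS = 262144


def default_buckets(total_memory_bytes):
    """Return the default hash table size for a given amount of total memory."""
    buckets = total_memory_bytes // 16384
    # Clamp to the documented lower and upper limits.
    return max(MIN_BUCKETS, min(buckets, MAX_BUCKETS))
```

Under this assumption, a node with 1 GiB of memory gets 65536 buckets, while a node with 8 GiB or more reaches the upper limit of 262144.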

checksum

  • Default: 0

  • MKE: 0

Sets whether the checksum of incoming packets is verified. Packets with bad checksums are in an invalid state. If this is enabled, such packets are not considered for connection tracking.

dccp_loose

  • Default: 0

  • MKE: 1

Sets whether picking up already established connections for Datagram Congestion Control Protocol (DCCP) is permitted.

dccp_timeout_closereq

  • Default: Distribution dependent

  • MKE: 64

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_closing

  • Default: Distribution dependent

  • MKE: 64

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_open

  • Default: Distribution dependent

  • MKE: 43200

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_partopen

  • Default: Distribution dependent

  • MKE: 480

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_request

  • Default: Distribution dependent

  • MKE: 240

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_respond

  • Default: Distribution dependent

  • MKE: 480

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_timewait

  • Default: Distribution dependent

  • MKE: 240

The parameter description is not yet available in the Linux kernel documentation.

events

  • Default: 0

  • MKE: 1

Sets whether the connection tracking code provides userspace with connection-tracking events through ctnetlink.

expect_max

  • Default: Calculated

  • MKE: 1024

Sets the maximum size of the expectation table. The default value is nf_conntrack_buckets / 256. The minimum is 1.
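The default calculation above amounts to one integer division with a lower bound. This is a minimal sketch; the function name is hypothetical and integer division is assumed.

```python
def default_expect_max(buckets):
    """Return the default expectation table size for a given bucket count."""
    return max(1, buckets // 256)  # the minimum is 1
```

For instance, 262144 buckets give a default of 1024, and 65536 buckets give 256.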

frag6_high_thresh

  • Default: Calculated

  • MKE: 4194304

Sets the maximum memory used to reassemble IPv6 fragments. When nf_conntrack_frag6_high_thresh bytes of memory are allocated for this purpose, the fragment handler drops packets until nf_conntrack_frag6_low_thresh is reached. The size of this parameter is calculated based on system memory.

frag6_low_thresh

  • Default: Calculated

  • MKE: 3145728

See nf_conntrack_frag6_high_thresh. The size of this parameter is calculated based on system memory.

frag6_timeout

  • Default: 60

  • MKE: 60

Sets the time to keep an IPv6 fragment in memory.

generic_timeout

  • Default: 600

  • MKE: 600

Sets the default for a generic timeout. This refers to layer 4 unknown and unsupported protocols.

gre_timeout

  • Default: 30

  • MKE: 30

Sets the GRE timeout in the conntrack table.

gre_timeout_stream

  • Default: 180

  • MKE: 180

Sets the GRE timeout for streamed connections. This extended timeout is used when a GRE stream is detected.

helper

  • Default: 0

  • MKE: 0

Sets whether the automatic conntrack helper assignment is enabled. If disabled, you must set up iptables rules to assign helpers to connections. See the CT target description in the iptables-extensions(8) man page for more information.

icmp_timeout

  • Default: 30

  • MKE: 30

Sets the default for ICMP timeout.

icmpv6_timeout

  • Default: 30

  • MKE: 30

Sets the default for ICMPv6 timeout.

log_invalid

  • Default: 0

  • MKE: 0

Sets whether invalid packets of a type specified by value are logged.

max

  • Default: Calculated

  • MKE: 131072

Sets the maximum number of allowed connection tracking entries. This value is set to nf_conntrack_buckets by default.

Connection-tracking entries are added to the table twice, once for the original direction and once for the reply direction (that is, with the reversed address). Thus, with default settings a maxed-out table will have an average hash chain length of 2, not 1.
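The average chain length mentioned above follows from each connection occupying two table entries. This is a minimal sketch of that arithmetic; the function name is hypothetical.

```python
def average_chain_length(tracked_connections, buckets):
    """Return the average hash chain length for a given table load."""
    # Each connection is inserted twice: original direction and reply direction.
    return 2 * tracked_connections / buckets
```

When the maximum equals the number of buckets, as in the default case described above, a full table has an average chain length of 2.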

sctp_timeout_closed

  • Default: Distribution dependent

  • MKE: 10

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_cookie_echoed

  • Default: Distribution dependent

  • MKE: 3

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_cookie_wait

  • Default: Distribution dependent

  • MKE: 3

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_established

  • Default: Distribution dependent

  • MKE: 432000

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_heartbeat_acked

  • Default: Distribution dependent

  • MKE: 210

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_heartbeat_sent

  • Default: Distribution dependent

  • MKE: 30

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_shutdown_ack_sent

  • Default: Distribution dependent

  • MKE: 3

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_shutdown_recd

  • Default: Distribution dependent

  • MKE: 0

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_shutdown_sent

  • Default: Distribution dependent

  • MKE: 0

The parameter description is not yet available in the Linux kernel documentation.

tcp_be_liberal

  • Default: 0

  • MKE: 0

Sets whether only out-of-window RST segments are marked as INVALID.

tcp_loose

  • Default: 0

  • MKE: 1

Sets whether already established connections are picked up.

tcp_max_retrans

  • Default: 3

  • MKE: 3

Sets the maximum number of packets that can be retransmitted without receiving an acceptable ACK from the destination. If this number is reached, a shorter timer is started.

tcp_timeout_close

  • Default: Distribution dependent

  • MKE: 10

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_close_wait

  • Default: Distribution dependent

  • MKE: 3600

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_fin_wait

  • Default: Distribution dependent

  • MKE: 120

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_last_ack

  • Default: Distribution dependent

  • MKE: 30

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_max_retrans

  • Default: Distribution dependent

  • MKE: 300

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_syn_recv

  • Default: Distribution dependent

  • MKE: 60

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_syn_sent

  • Default: Distribution dependent

  • MKE: 120

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_time_wait

  • Default: Distribution dependent

  • MKE: 120

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_unacknowledged

  • Default: Distribution dependent

  • MKE: 30

The parameter description is not yet available in the Linux kernel documentation.

timestamp

  • Default: 0

  • MKE: 0

Sets whether connection-tracking flow timestamping is enabled.

udp_timeout

  • Default: 30

  • MKE: 30

Sets the UDP timeout.

udp_timeout_stream

  • Default: 120

  • MKE: 120

Sets the extended timeout that is used whenever a UDP stream is detected.

net.nf_conntrack_<subtree>

Note

  • The net.nf_conntrack_<subtree> default values persist, changing only when the nf_conntrack kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter

Values

Description

max

  • Default: Calculated

  • MKE: 131072

Sets the maximum number of connections to track. The size of this parameter is calculated based on system memory.

vm.overcommit_<subtree>

Parameter

Values

Description

memory

  • Default: Distribution dependent

  • MKE: 1

Sets whether the kernel permits memory overcommitment from malloc() calls.

Note

The vm.overcommit_memory parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

vm.panic_<subtree>

Parameter

Values

Description

on_oom

  • Default: 0

  • MKE: 0

Sets whether the kernel should panic on an out-of-memory condition, rather than continuing to attempt operations.

When set to 0 the kernel invokes the oom_killer, which kills the rogue processes and thus preserves the system.

Note

The vm.panic_on_oom parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.