This documentation describes how to deploy and operate Mirantis Kubernetes
Engine (MKE). It is intended to help operators understand the core concepts of
the product and provides the information needed to deploy and operate the
solution.
The information in this documentation set is continually improved and amended
based on feedback and requests from MKE users.
Mirantis Kubernetes Engine (MKE, formerly Universal Control Plane or UCP)
is the industry-leading container orchestration platform for developing
and running modern applications at scale, on private clouds, public clouds,
and on bare metal.
MKE delivers immediate value to your business by allowing you to adopt modern
application development and delivery models that are cloud-first and
cloud-ready. With MKE you get a centralized place with a graphical UI to manage
and monitor your Kubernetes and/or Swarm cluster instance.
Your business benefits from using MKE as a container orchestration platform,
especially in the following use cases:
More than one container orchestrator
Whether your application requirements are complex and call for medium to large
clusters, or simple enough to deploy quickly in development environments, MKE
gives you a choice of container orchestrators.
Deploy Kubernetes, Swarm, or both types of clusters and manage them
on a single MKE instance or centrally manage your instance using
Mirantis Container Cloud.
Robust and scalable applications deployment
Monolithic applications are old school; microservices are the modern way
to deploy applications at scale. Delivering applications through
an automated CI/CD pipeline can dramatically improve time-to-market
and service agility. Adopting microservices becomes a lot easier
when using Kubernetes and/or Swarm clusters to deploy and test
microservice-based applications.
Multi-tenant software offerings
Containerizing existing monolithic SaaS applications enables quicker
development cycles and automated continuous integration and deployment.
But these applications need to allow multiple users to share a single
instance of a software application. MKE can operate multi-tenant
environments, isolate teams and organizations, separate cluster
resources, and so on.
The MKE Reference Architecture provides a technical overview of Mirantis
Kubernetes Engine (MKE). It is your source for the product hardware and
software specifications, standards, component information, and configuration
detail.
Mirantis Kubernetes Engine (MKE) allows you to adopt modern application
development and delivery models that are cloud-first and cloud-ready. With MKE
you get a centralized place with a graphical UI to manage and monitor your
Kubernetes and/or Swarm cluster instance.
The core MKE components are:
ucp-cluster-agent
Reconciles the cluster-wide state, including Kubernetes add-ons such as
Kubecompose and KubeDNS, manages the replication configurations of the
etcd and RethinkDB clusters, and syncs the node inventories of
SwarmKit and Swarm Classic. This component is a single-replica service that
runs on any manager node in the cluster.
ucp-manager-agent
Reconciles the node-local state on manager nodes,
including the configuration of the local Docker daemon, local data volumes,
certificates, and local container components. Each manager node in the
cluster runs a task from this service.
ucp-worker-agent
Performs the same reconciliation operations as
ucp-manager-agent but on worker nodes. This component runs a task on
each worker node.
The following MKE component names differ based on the node’s operating
system:
Take careful note of the minimum and recommended hardware requirements for MKE
manager and worker nodes prior to deployment.
Note
High availability (HA) installations require transferring files between
hosts.
On manager nodes, MKE only supports the workloads it requires to run.
Windows container images are typically larger than Linux
container images. As such, provision more local storage for Windows
nodes and for any MSR repositories that store Windows container
images.
Manager nodes manage a swarm and persist the swarm state. Using several
containers per node, the ucp-manager-agent automatically deploys all
MKE components on manager nodes, including the MKE web UI and the data stores
that MKE uses.
Note
Some Kubernetes components are run as Swarm services because the MKE control
plane is itself a Docker Swarm cluster.
The following tables detail the MKE services that run on manager
nodes:
The centralized service for identity and authentication used by MKE and
MSR.
ucp-auth-store
A container that stores authentication configurations and data for
users, organizations, and teams.
ucp-auth-worker
A container that performs scheduled LDAP synchronizations and cleans
authentication and authorization data.
ucp-client-root-ca
A certificate authority to sign client bundles.
ucp-cluster-agent
The agent that monitors the cluster-wide MKE components. Runs on only
one manager node.
ucp-cluster-root-ca
A certificate authority used for TLS communication between MKE
components.
ucp-controller
The MKE web server.
ucp-hardware-info
A container for collecting disk/hardware information about the host.
ucp-interlock
A container that monitors Swarm workloads configured to use layer 7
routing. Only runs when you enable layer 7 routing.
ucp-interlock-config
A service that manages Interlock configuration.
ucp-interlock-extension
A service that verifies the run status of the Interlock extension.
ucp-interlock-proxy
A service that provides load balancing and proxying for Swarm workloads.
Runs only when layer 7 routing is enabled.
ucp-kube-apiserver
A master component that serves the Kubernetes API. It persists its state
in etcd directly, and all other components communicate directly with
the API server. The Kubernetes API server is configured to encrypt
Secrets using AES-CBC with a 256-bit key. The encryption key is never
rotated, and the encryption key is stored on manager nodes, in a file
on disk.
ucp-kube-controller-manager
A master component that manages the desired state of controllers and
other Kubernetes objects. It monitors the API server and performs
background tasks when needed.
ucp-kubelet
The Kubernetes node agent running on every node, which is responsible
for running Kubernetes pods, reporting the health of the node, and
monitoring resource usage.
ucp-kube-proxy
The networking proxy running on every node, which enables pods to
contact Kubernetes services and other pods by way of cluster IP
addresses.
ucp-kube-scheduler
A master component that manages Pod scheduling, which communicates with
the API server only to obtain workloads that need to be scheduled.
ucp-kv
A container used to store the MKE configurations. Do not use it in your
applications, as it is for internal use only. Also used by Kubernetes
components.
ucp-manager-agent
The agent that monitors the manager node and ensures that the right MKE
services are running.
ucp-proxy
A TLS proxy that allows secure access from the local Mirantis Container
Runtime to MKE components.
ucp-sf-notifier
A Swarm service that sends notifications to Salesforce when alerts are
configured by OpsCare, and later when they are triggered.
ucp-swarm-manager
A container used to provide backward compatibility with Docker Swarm.
An MKE service that accounts for the removal of dockershim from
Kubernetes as of version 1.24, thus enabling MKE to continue using
Docker as the container runtime.
k8s_calico-kube-controllers
A cluster-scoped Kubernetes controller used to coordinate Calico
networking. Runs on one manager node only.
k8s_calico-node
The Calico node agent, which coordinates networking fabric according
to the cluster-wide Calico configuration. Part of the calico-node
DaemonSet. Runs on all nodes. Configure the container network interface
(CNI) plugin using the --cni-installer-url flag. If this flag is not
set, MKE uses Calico as the default CNI plugin.
k8s_enable-strictaffinity
An init container for Calico controller that sets the StrictAffinity in
Calico networking according to the configured boolean value.
k8s_firewalld-policy_calico-node
An init container for calico-node that verifies whether systems with
firewalld are compatible with Calico.
k8s_install-cni_calico-node
A container in which the Calico CNI plugin binaries are installed and
configured on each host. Part of the calico-node DaemonSet. Runs on
all nodes.
k8s_ucp-coredns_coredns
The CoreDNS plugin, which provides service discovery for Kubernetes
services and Pods.
k8s_ucp-gatekeeper_gatekeeper-controller-manager
The Gatekeeper manager controller for Kubernetes that provides policy
enforcement. Only runs when OPA Gatekeeper is enabled in MKE.
k8s_ucp-gatekeeper-audit_gatekeeper-audit
The audit controller for Kubernetes that provides audit functionality of
OPA Gatekeeper. Only runs when OPA Gatekeeper is enabled in MKE.
k8s_ucp-kube-compose
A custom Kubernetes resource component that translates Compose files
into Kubernetes constructs. Part of the Compose deployment. Runs on one
manager node only.
k8s_ucp-kube-compose-api
The API server for Kube Compose, which is part of the compose
deployment. Runs on one manager node only.
k8s_ucp-kube-ingress-controller
The Ingress controller for Kubernetes, which provides layer 7 routing
for Kubernetes services. Only runs with Ingress for Kubernetes
enabled.
k8s_ucp-metrics-inventory
A container that generates the inventory targets for Prometheus server.
Part of the Kubernetes Prometheus Metrics plugin.
k8s_ucp-metrics-prometheus
A container used to collect and process metrics for a node. Part of the
Kubernetes Prometheus Metrics plugin.
k8s_ucp-metrics-proxy
A container that runs a proxy for the metrics server. Part of the
Kubernetes Prometheus Metrics plugin.
Worker nodes are instances of MCR that participate in a swarm for the purpose
of executing containers. Such nodes receive and execute tasks dispatched from
manager nodes. A cluster must have at least one manager node, as worker nodes do not
participate in the Raft distributed state, perform scheduling, or serve
the swarm mode HTTP API.
Note
Some Kubernetes components are run as Swarm services because the MKE control
plane is itself a Docker Swarm cluster.
The following tables detail the MKE services that run on worker nodes.
ucp-hardware-info
A container for collecting host information regarding disks and
hardware.
ucp-interlock-config
A service that manages Interlock configuration.
ucp-interlock-extension
A helper service that reconfigures the ucp-interlock-proxy service,
based on the Swarm workloads that are running.
ucp-interlock-proxy
A service that provides load balancing and proxying for swarm
workloads. Only runs when you enable layer 7 routing.
ucp-kube-proxy
The networking proxy running on every node, which enables Pods to
contact Kubernetes services and other Pods through cluster IP
addresses. Named ucp-kube-proxy-win in Windows systems.
ucp-kubelet
The Kubernetes node agent running on every node, which is responsible
for running Kubernetes Pods, reporting the health of the node, and
monitoring resource usage. Named ucp-kubelet-win in Windows
systems.
ucp-pod-cleaner-win
A service that removes all the Kubernetes Pods that remain once
Kubernetes components are removed from Windows nodes. Runs only on
Windows nodes.
ucp-proxy
A TLS proxy that allows secure access from the local Mirantis Container
Runtime to MKE components.
ucp-tigera-node-win
The Calico node agent that coordinates networking fabric for Windows
nodes according to the cluster-wide Calico configuration. Runs on
Windows nodes when Kubernetes is set as the orchestrator.
ucp-tigera-felix-win
A Calico component that runs on every machine that provides endpoints.
Runs on Windows nodes when Kubernetes is set as the orchestrator.
ucp-worker-agent-x and ucp-worker-agent-y
A service that monitors the worker node and ensures that the correct MKE
services are running. The ucp-worker-agent service ensures that only
authorized users and other MKE services can run Docker commands on the
node. The ucp-worker-agent-<x/y> deploys a set of containers onto
worker nodes, which is a subset of the containers that
ucp-manager-agent deploys onto manager nodes. This component is
named ucp-worker-agent-win-<x/y> on Windows nodes.
An MKE service that accounts for the removal of dockershim from
Kubernetes as of version 1.24, thus enabling MKE to continue using
Docker as the container runtime.
k8s_calico-node
The Calico node agent that coordinates networking fabric according to
the cluster-wide Calico configuration. Part of the calico-node
DaemonSet. Runs on all nodes.
k8s_firewalld-policy_calico-node
An init container for calico-node that verifies whether systems with
firewalld are compatible with Calico.
k8s_install-cni_calico-node
A container that installs the Calico CNI plugin
binaries and configuration on each host. Part of the calico-node
DaemonSet. Runs on all nodes.
Admission controllers are plugins that govern and enforce
cluster usage. There are two types of admission controllers:
default and custom. The tables below list the available admission controllers.
For more information, see
Kubernetes documentation: Using Admission Controllers.
Note
You cannot enable or disable custom admission controllers.
DefaultStorageClass
Adds a default storage class to PersistentVolumeClaim objects that
do not request a specific storage class.
DefaultTolerationSeconds
Sets the pod default forgiveness toleration to tolerate the
notready:NoExecute and unreachable:NoExecute taints
based on the default-not-ready-toleration-seconds and
default-unreachable-toleration-seconds Kubernetes API server input
parameters if they do not already have toleration for the
node.kubernetes.io/not-ready:NoExecute or
node.kubernetes.io/unreachable:NoExecute taints. The default value
for both input parameters is five minutes.
LimitRanger
Ensures that incoming requests do not violate the constraints in a
namespace LimitRange object.
MutatingAdmissionWebhook
Calls any mutating webhooks that match the request.
NamespaceLifecycle
Ensures that users cannot create new objects in namespaces undergoing
termination and that MKE rejects requests in nonexistent namespaces.
It also prevents users from deleting the reserved default,
kube-system, and kube-public namespaces.
NodeRestriction
Limits the Node and Pod objects that a kubelet can modify.
PersistentVolumeLabel (deprecated)
Attaches region or zone labels automatically to PersistentVolumes as
defined by the cloud provider.
PodNodeSelector
Limits which node selectors can be used within a namespace by reading a
namespace annotation and a global configuration.
ResourceQuota
Observes incoming requests and ensures they do not violate any of the
constraints in a namespace ResourceQuota object.
ServiceAccount
Implements automation for ServiceAccount resources.
ValidatingAdmissionWebhook
Calls any validating webhooks that match the request.
Annotates Docker Compose-on-Kubernetes Stack resources with
the identity of the user performing the request so that the Docker
Compose-on-Kubernetes resource controller can manage Stacks
with correct user authorization.
Detects the deleted ServiceAccount resources to correctly remove
them from the scheduling authorization backend of an MKE node.
Simplifies creation of the RoleBindings and
ClusterRoleBindings resources by automatically converting
user, organization, and team Subject names into their
corresponding unique identifiers.
Prevents users from deleting the built-in cluster-admin,
ClusterRole, or ClusterRoleBinding resources.
Prevents under-privileged users from creating or updating
PersistentVolume resources with host paths.
Works in conjunction with the built-in PodSecurityPolicies
admission controller to prevent under-privileged users from
creating Pods with privileged options. To grant non-administrators
and non-cluster-admins access to privileged attributes, refer to
Use admission controllers for access in the MKE Operations Guide.
CheckImageSigning
Enforces MKE Docker Content Trust policy which, if enabled, requires
that all pods use container images that have been digitally signed by
trusted and authorized users, which are members of one or more teams in
MKE.
UCPNodeSelector
Adds a com.docker.ucp.orchestrator.kubernetes:* toleration to pods
in the kube-system namespace and removes the
com.docker.ucp.orchestrator.kubernetes tolerations from pods in
other namespaces. This ensures that user workloads do not run on
swarm-only nodes, which MKE taints with
com.docker.ucp.orchestrator.kubernetes:NoExecute. It also adds a
node affinity to prevent pods from running on manager nodes depending
on MKE settings.
Every Kubernetes Pod includes an empty pause container, which bootstraps the
Pod to establish all of the cgroups, reservations, and namespaces before its
individual containers are created. The pause container image is always present,
so the pod resource allocation happens instantaneously as containers are
created.
To display pause containers:
When using the client bundle, pause containers are hidden by default.
To display pause containers when using the client bundle:
docker ps -a | grep -i pause
To display pause containers when not using the client bundle:
Certificate and keys for the authentication and authorization
service.
ucp-auth-store-certs
Certificate and keys for the authentication and authorization
store.
ucp-auth-store-data
Data of the authentication and authorization store, replicated
across managers.
ucp-auth-worker-certs
Certificate and keys for authentication worker.
ucp-auth-worker-data
Data of the authentication worker.
ucp-client-root-ca
Root key material for the MKE root CA that issues client
certificates.
ucp-cluster-root-ca
Root key material for the MKE root CA that issues certificates
for swarm members.
ucp-controller-client-certs
Certificate and keys that the MKE web server uses to communicate
with other MKE components.
ucp-controller-server-certs
Certificate and keys for the MKE web server running in the node.
ucp-kv
MKE configuration data, replicated across managers.
ucp-kv-certs
Certificates and keys for the key-value store.
ucp-metrics-data
Monitoring data that MKE gathers.
ucp-metrics-inventory
Configuration file that the ucp-metrics service uses.
ucp-node-certs
Certificate and keys for node communication.
ucp-backup
Backup artifacts that are created while processing a backup. The
artifacts persist on the volume for the duration of the backup and are
cleaned up when the backup completes, though the volume itself remains.
mke-containers
Symlinks to MKE component log files, created by ucp-agent.
You can customize the volume driver for the volumes by creating
the volumes prior to installing MKE. During installation, MKE determines
which volumes do not yet exist on the node and creates those volumes using the
default volume driver.
By default, MKE stores the data for these volumes at
/var/lib/docker/volumes/<volume-name>/_data.
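To see which of these volumes exist on a given node, you can list them with a
name filter and inspect the mount point. The commands below are illustrative
uses of the standard Docker CLI:
docker volume ls --filter name=ucp-
docker volume inspect ucp-kv --format '{{ .Mountpoint }}'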
You can interact with MKE either through the web UI or the CLI.
With the MKE web UI you can manage your swarm, grant and revoke user
permissions, deploy, configure, manage, and monitor your applications.
In addition, MKE exposes the standard Docker API, so you can continue using
such existing tools as the Docker CLI client. As MKE secures your
cluster with RBAC, you must configure your Docker CLI client and
other client tools to authenticate your requests using client
certificates that you can download from your MKE profile page.
MKE allows administrators to authorize users to view, edit, and use cluster
resources by granting role-based permissions for specific resource sets.
To authorize access to cluster resources across your organization, high-level
actions that MKE administrators can take include the following:
Add and configure subjects (users, teams, organizations, and service
accounts).
Define custom roles (or use defaults) by adding permitted operations per
resource type.
Group cluster resources into resource sets of Swarm collections or Kubernetes
namespaces.
Create grants by combining subject, role, and resource set.
Note
Only administrators can manage Role-based access control (RBAC).
The following table describes the core elements used in RBAC:
Element
Description
Subjects
Subjects are granted roles that define the permitted operations for
one or more resource sets and include:
User
A person authenticated by the authentication backend. Users can belong
to more than one team and more than one organization.
Team
A group of users that share permissions defined at the team level. A
team can be in only one organization.
Organization
A group of teams that share a specific set of permissions, defined by
the roles of the organization.
Service account
A Kubernetes object that enables a workload to access cluster resources
assigned to a namespace.
Roles
Roles define what operations can be done by whom. A role is a set of
permitted operations for a type of resource, such as a container or
volume. It is assigned to a user or a team with a grant.
For example, the built-in RestrictedControl role includes
permissions to view and schedule but not to update nodes. Whereas a
custom role may include permissions to read, write, and execute
(r-w-x) volumes and secrets.
Most organizations use multiple roles to fine-tune the appropriate
access for different subjects. Users and teams may have different roles
for the different resources they access.
Resource sets
Users can group resources into two types of resource sets to control
user access: Docker Swarm collections and Kubernetes namespaces.
Docker Swarm collections
Collections have a directory-like structure that holds Swarm resources.
You can create collections in MKE by defining a directory path and
moving resources into it. Alternatively, you can use labels in your
YAML file to assign application resources to the path. Resource types
that users can access in Swarm collections include containers,
networks, nodes, services, secrets, and volumes.
Each Swarm resource can be in only one collection at a time, but
collections can be nested inside one another to a maximum depth of two
layers. Collection permission includes permission for child
collections.
For child collections and users belonging to more than one team,
the system concatenates permissions from multiple roles into an
effective role for the user, which specifies the operations that
are allowed for the target.
Kubernetes namespaces
Namespaces are virtual clusters that allow multiple teams to access a
given cluster with different permissions. Kubernetes automatically sets
up four namespaces, and users can add more as necessary, though unlike
Swarm collections they cannot be nested. Resource types that users can
access in Kubernetes namespaces include pods, deployments, network
policies, nodes, services, secrets, and more.
Grants
Grants consist of a subject, role, and resource set, and define how
specific users can access specific resources. All the grants of an
organization taken together constitute an access control list (ACL),
which is a comprehensive access policy for the organization.
For complete information on how to configure and use role-based access control
in MKE, refer to Authorize role-based access.
The MKE Installation Guide provides everything you need to install
and configure Mirantis Kubernetes Engine (MKE). The guide offers
detailed information, procedures, and examples that are specifically
designed to help DevOps engineers and administrators install and
configure the MKE container orchestration platform.
Before installing MKE, plan a single host name strategy to use consistently
throughout the cluster, keeping in mind that MKE and MCR both use host names.
There are two general strategies for creating host names: short host names and
fully qualified domain names (FQDN). Consider the following examples:
MCR uses three separate IP ranges for the docker0, docker_gwbridge, and
ucp-bridge interfaces. By default, MCR assigns the first available subnet
in default-address-pools (172.17.0.0/16) to docker0, the second
(172.18.0.0/16) to docker_gwbridge, and the third (172.19.0.0/16)
to ucp-bridge.
Note
The ucp-bridge bridge network specifically supports MKE component
containers.
You can reassign the docker0, docker_gwbridge, and ucp-bridge
subnets in default-address-pools. To do so, replace the relevant values in
default-address-pools in the /etc/docker/daemon.json file, making sure
that the setting includes at least three IP pools. Be aware that you must
restart the docker.service to activate your daemon.json file edits.
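As an illustrative sketch (not the full default list), the following
daemon.json fragment keeps three IP pools; the subnet values are placeholders
to adapt so that they do not conflict with your infrastructure. Restart the
daemon afterward to apply the change:
{
  "default-address-pools": [
    {"base": "172.17.0.0/16", "size": 16},
    {"base": "172.18.0.0/16", "size": 16},
    {"base": "172.19.0.0/16", "size": 16}
  ]
}
sudo systemctl restart docker.service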
By default, default-address-pools contains the following values:
The list of CIDR ranges used to allocate subnets for local bridge
networks.
base
The CIDR range allocated for bridge networks in each IP address pool.
size
The CIDR netmask that determines the subnet size to allocate from the
base pool. If the size matches the netmask of the base,
then the pool contains one subnet. For example,
{"base":"172.17.0.0/16","size":16} creates the subnet:
172.17.0.0/16 (172.17.0.1 - 172.17.255.255).
For example, {"base":"192.168.0.0/16","size":20} allocates
/20 subnets from 192.168.0.0/16, including the following subnets for
bridge networks: 192.168.0.0/20, 192.168.16.0/20, 192.168.32.0/20, and
so on, up to 192.168.240.0/20.
MCR creates and configures the host system with the docker0 virtual network
interface, an Ethernet bridge through which all traffic between MCR
and containers moves. MCR uses docker0 to handle all container
routing. You can specify an alternative network interface when you start the
container.
MCR allocates IP addresses from the docker0 configurable IP range to the
containers that connect to docker0. The default IP range, or subnet, for
docker0 is 172.17.0.0/16.
You can change the docker0 subnet in /etc/docker/daemon.json using the
settings in the following table. Be aware that you must restart the
docker.service to activate your daemon.json file edits.
Parameter
Description
default-address-pools
Modify the first pool in default-address-pools.
Caution
By default, MCR assigns the second pool to docker_gwbridge. If you
modify the first pool such that the size does not match the base
netmask, it can affect docker_gwbridge.
{
  "default-address-pools": [
    {"base":"172.17.0.0/16","size":16},   <-- Modify this value
    {"base":"172.18.0.0/16","size":16},
    {"base":"172.19.0.0/16","size":16},
    {"base":"172.20.0.0/16","size":16},
    {"base":"172.21.0.0/16","size":16},
    {"base":"172.22.0.0/16","size":16},
    {"base":"172.23.0.0/16","size":16},
    {"base":"172.24.0.0/16","size":16},
    {"base":"172.25.0.0/16","size":16},
    {"base":"172.26.0.0/16","size":16},
    {"base":"172.27.0.0/16","size":16},
    {"base":"172.28.0.0/16","size":16},
    {"base":"172.29.0.0/16","size":16},
    {"base":"172.30.0.0/16","size":16},
    {"base":"192.168.0.0/16","size":20}
  ]
}
fixed-cidr
Configures a CIDR range.
Customize the subnet for docker0 using standard CIDR notation.
The default subnet is 172.17.0.0/16, the network gateway is
172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254
for your containers.
{"fixed-cidr":"172.17.0.0/16"}
bip
Configures a gateway IP address and CIDR netmask of the docker0
network.
Customize the subnet for docker0 using the
<gatewayIP>/<CIDRnetmask> notation.
The default subnet is 172.17.0.0/16, the network gateway is
172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254
for your containers.
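For illustration, a minimal daemon.json fragment that sets bip to the default
gateway address and netmask described above; adjust the value to the subnet
you want docker0 to use, and restart docker.service to apply it:
{
  "bip": "172.17.0.1/16"
}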
The docker_gwbridge is a virtual network interface that connects
overlay networks (including ingress) to individual MCR container networks.
Initializing a Docker swarm or joining a Docker host to a swarm automatically
creates docker_gwbridge in the kernel of the Docker host. The default
docker_gwbridge subnet (172.18.0.0/16) is the second available subnet
in default-address-pools.
To change the docker_gwbridge subnet, open daemon.json and modify the
second pool in default-address-pools:
{
  "default-address-pools": [
    {"base":"172.17.0.0/16","size":16},
    {"base":"172.18.0.0/16","size":16},   <-- Modify this value
    {"base":"172.19.0.0/16","size":16},
    {"base":"172.20.0.0/16","size":16},
    {"base":"172.21.0.0/16","size":16},
    {"base":"172.22.0.0/16","size":16},
    {"base":"172.23.0.0/16","size":16},
    {"base":"172.24.0.0/16","size":16},
    {"base":"172.25.0.0/16","size":16},
    {"base":"172.26.0.0/16","size":16},
    {"base":"172.27.0.0/16","size":16},
    {"base":"172.28.0.0/16","size":16},
    {"base":"172.29.0.0/16","size":16},
    {"base":"172.30.0.0/16","size":16},
    {"base":"192.168.0.0/16","size":20}
  ]
}
Caution
Modifying the first pool to customize the docker0 subnet can affect
the default docker_gwbridge subnet. Refer to
docker0 for more information.
You can only customize the docker_gwbridge settings before you join
the host to the swarm or after temporarily removing it.
The default address pool that Docker Swarm uses for its overlay network is
10.0.0.0/8. If this pool conflicts with your current network
implementation, you must use a custom IP address pool. Prior to installing MKE,
specify your custom address pool using the --default-addr-pool
option when initializing swarm.
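For example, a swarm initialization with a custom address pool might look like
the following sketch; the subnet is a placeholder, and the optional
--default-addr-pool-mask-length flag of docker swarm init controls the size of
the subnets carved from the pool:
docker swarm init --default-addr-pool 10.10.0.0/16 --default-addr-pool-mask-length 24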
Note
The Swarm default-addr-pool and MCR default-address-pools settings
define two separate IP address ranges used for different purposes.
A node.Status.Addr of 0.0.0.0 can cause unexpected problems. To
prevent any such issues, add the --advertise-addr flag to the
docker swarm join command.
To resolve the 0.0.0.0 situation, use the following workaround:
Stop the Docker daemon on the node whose .Status.Addr is 0.0.0.0.
In the /var/lib/docker/swarm/docker-state.json file, apply the correct
node IP to AdvertiseAddr and LocalAddr.
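For reference, a join command with an explicit advertise address looks like the
following sketch; the addresses and token are placeholders:
docker swarm join --advertise-addr <node-ip> --token <join-token> <manager-ip>:2377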
Kubernetes uses two internal IP ranges, either of
which can overlap and conflict with the underlying infrastructure, thus
requiring custom IP ranges.
The pod network
Either Calico or Azure IPAM services gives each Kubernetes pod
an IP address in the default 192.168.0.0/16 range. To customize this
range, during MKE installation, use the --pod-cidr flag with the
ucp install command.
The services network
You can access Kubernetes services with a VIP in the default 10.96.0.0/16
Cluster IP range. To customize this range, during MKE installation, use
the --service-cluster-ip-range flag with the ucp install
command.
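As a hedged example, a ucp install command that customizes both ranges might
look like the following; the image tag, host address, and subnets are
placeholders to adapt to your environment:
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:<version> install \
  --host-address <node-ip> \
  --pod-cidr 10.10.0.0/16 \
  --service-cluster-ip-range 10.96.0.0/16 \
  --interactive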
The storage path for such persisted data as images, volumes, and cluster state
is the Docker data root (data-root in /etc/docker/daemon.json).
MKE clusters require that all nodes have the same Docker data-root for the
Kubernetes network to function correctly. In addition, if the data-root is
changed on all nodes, you must recreate the Kubernetes network configuration in
MKE by running the following commands:
The no-new-privileges setting prevents the container application processes
from gaining new privileges during the execution process.
For most Linux distributions, MKE supports setting no-new-privileges to
true in the /etc/docker/daemon.json file. The parameter is not,
however, supported on RHEL 7.9, CentOS 7.9, Oracle Linux 7.8, and Oracle Linux
7.9.
This option is not supported on Windows. It is a Linux kernel feature.
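A minimal daemon.json fragment that enables the setting on a supported Linux
distribution, followed by the daemon restart required for daemon.json edits to
take effect:
{
  "no-new-privileges": true
}
sudo systemctl restart docker.service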
MCR hosts that run the devicemapper storage driver use the loop-lvm
configuration mode by default. This mode uses sparse files to build
the thin pool used by image and container snapshots and is designed to work
without any additional configuration.
Note
Mirantis recommends that you use direct-lvm mode in production
environments in lieu of loop-lvm mode. direct-lvm mode is more
efficient in its use of system resources than loop-lvm mode, and you can
scale it as necessary.
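A hedged sketch of a daemon.json fragment that switches a host to direct-lvm
mode; the dm.directlvm_device storage option comes from the upstream Docker
devicemapper documentation, and /dev/xvdf is only an example that must point to
an empty block device on your host:
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.directlvm_device=/dev/xvdf"
  ]
}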
To report accurate memory metrics, MCR requires that you enable specific kernel
settings that are often disabled on Ubuntu and Debian systems. For detailed
instructions on how to do this, refer to the Docker documentation,
Your kernel does not support cgroup swap limit capabilities.
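Based on the Docker guidance referenced above, the kernel settings are
typically enabled by appending boot parameters to the GRUB_CMDLINE_LINUX value
in /etc/default/grub and rebooting; treat the exact file locations as
distribution-dependent:
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
sudo update-grub
sudo reboot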
A well-configured network is essential for the proper functioning of your MKE
deployment. Pay particular attention to such key factors as IP address
provisioning, port management, and traffic enablement.
When installing MKE on a host, you need to open specific ports to
incoming traffic. Each port listens for incoming traffic from a particular set
of hosts, known as the port scope.
MKE uses the following scopes:
Scope
Description
External
Traffic arrives from outside the cluster through end-user interaction.
Internal
Traffic arrives from other hosts in the same cluster.
Self
Traffic arrives at Self ports only from processes on the same host;
these ports do not need to be open to outside traffic.
Open the following ports for incoming traffic on each host type:
Hosts
Port
Scope
Purpose
Managers, workers
TCP 179
Internal
BGP peers, used for Kubernetes networking
Managers
TCP 443 (configurable)
External, internal
MKE web UI and API
Managers
TCP 2376 (configurable)
Internal
Docker swarm manager, used for backwards compatibility
Managers
TCP 2377 (configurable)
Internal
Control communication between swarm nodes
Managers, workers
UDP 4789
Internal
Overlay networking
Managers
TCP 6443 (configurable)
External, internal
Kubernetes API server endpoint
Managers, workers
TCP 6444
Self
Kubernetes API reverse proxy
Managers, workers
TCP, UDP 7946
Internal
Gossip-based clustering
Managers
TCP 9055
Internal
ucp-rethinkdb-exporter metrics
Managers, workers
TCP 9091
Internal
Felix Prometheus calico-node metrics
Managers
TCP 9094
Self
Felix Prometheus kube-controller metrics
Managers, workers
TCP 9099
Self
Calico health check
Managers, workers
TCP 9100
Internal
ucp-node-exporter metrics
Managers, workers
TCP 10248
Self
Kubelet health check
Managers, workers
TCP 10250
Internal
Kubelet
Managers, workers
TCP 12376
Internal
TLS authentication proxy that provides access to MCR
Managers, workers
TCP 12378
Self
etcd reverse proxy
Managers
TCP 12379
Internal
etcd Control API
Managers
TCP 12380
Internal
etcd Peer API
Managers
TCP 12381
Internal
MKE cluster certificate authority
Managers
TCP 12382
Internal
MKE client certificate authority
Managers
TCP 12383
Internal
Authentication storage backend
Managers
TCP 12384
Internal
Authentication storage backend for replication across
managers
Swarm workloads that require the use of encrypted overlay networks must
use iptables proxier with either the managed CNI or an unmanaged
alternative. Be aware that the other networking options detailed here
automatically disable Docker Swarm encrypted overlay networks.
Tigera, with customers paying for additional features
Cilium Open Source
Community
Planned
Mirantis
Community or Isovalent
Cilium Enterprise
Isovalent
Isovalent
Mirantis
Isovalent
To enable kube-proxy with iptables proxier while using the managed CNI:
Using the default option, kube-proxy with iptables proxier, is the
equivalent of specifying --kube-proxy-mode=iptables at install time. To
verify that the option is operational, confirm the presence of the following
line in the ucp-kube-proxy container logs:
To enable kube-proxy with ipvs proxier while using the managed CNI:
Prior to MKE installation, verify that the following kernel modules are
available on all Linux manager and worker nodes:
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
Specify --kube-proxy-mode=ipvs at install time.
Optional. Once installation is complete, configure the following
ipvs-related parameters in the MKE configuration file (otherwise, MKE will
use the Kubernetes default parameter settings):
ipvs_exclude_cidrs=""
ipvs_min_sync_period=""
ipvs_scheduler=""
ipvs_strict_arp=false
ipvs_sync_period=""
ipvs_tcp_timeout=""
ipvs_tcpfin_timeout=""
ipvs_udp_timeout=""
For more information on using these parameters, refer to kube-proxy
in the Kubernetes documentation.
Note
The ipvs-related parameters have no install-time counterparts and
therefore must only be configured once MKE installation is complete.
Verify that kube-proxy with ipvs proxier is operational by confirming the
presence of the following lines in the ucp-kube-proxy container logs:
Verify that the prerequisites for eBPF use have been met, including kernel
compatibility, for all Linux manager and worker nodes. Refer to the Calico
documentation Enable the eBPF dataplane for more
information.
Specify --calico-ebpf-enabled at install time.
Verify that eBPF mode is operational by confirming the presence of the
following lines in the ucp-kube-proxy container logs:
Verify that the prerequisites for eBPF use have been met, including kernel
compatibility, for all Linux manager and worker nodes. Refer to the Calico
documentation Enable the eBPF dataplane for
more information.
Specify the following parameters at install time:
--unmanaged-cni
--kube-proxy-mode=disabled
--kube-default-drop-masq-bits
Verify that eBPF mode is operational by confirming the presence of the
following lines in ucp-kube-proxy container logs:
Calico is the default networking plugin for MKE. The default Calico
encapsulation setting for MKE is VXLAN, however the plugin also supports
IP-in-IP encapsulation. Refer to the Calico documentation on Overlay
networking for
more information.
Important
NetworkManager can impair the Calico agent routing function. To resolve
this issue, you must create a file called
/etc/NetworkManager/conf.d/calico.conf with the following content:
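The content below reflects the Calico documentation recommendation to mark the
Calico interfaces as unmanaged; verify it against the Calico documentation for
your version:
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico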
You can enable Multus CNI in the MKE cluster when you install MKE, using the
--multus-cni flag with the MKE install CLI command.
Multus CNI acts as a meta plugin, enabling the attachment of multiple network
interfaces to multi-homed Pods. Refer to Multus CNI on GitHub for more information.
Avoid firewall conflicts in the following Linux distributions:
Linux distribution
Procedure
SUSE Linux Enterprise Server 12 SP2
Installations have the FW_LO_NOTRACK flag turned on by default in
the openSUSE firewall. It speeds up packet processing on the
loopback interface but breaks certain firewall setups that redirect
outgoing packets via custom rules on the local machine.
To turn off the FW_LO_NOTRACK option:
In /etc/sysconfig/SuSEfirewall2, set FW_LO_NOTRACK="no".
Either restart the firewall or reboot the system.
SUSE Linux Enterprise Server 12 SP3
No change is required, as installations have the FW_LO_NOTRACK flag
turned off by default.
Before performing SUSE Linux Enterprise Server (SLES) installations, consider
the following prerequisite steps:
For SLES 15 installations, disable CLOUD_NETCONFIG_MANAGE prior to
installing MKE:
Set CLOUD_NETCONFIG_MANAGE="no" in the
/etc/sysconfig/network/ifcfg-eth0 network interface configuration
file.
Run the service network restart command.
By default, SLES disables connection tracking. To allow
Kubernetes controllers in Calico to reach the Kubernetes API server, enable
connection tracking on the loopback interface for SLES by running the
following commands for each node in the cluster:
Network lag of more than two seconds between MKE manager nodes can cause
problems in your MKE cluster. For example, such a lag can indicate to MKE
components that the other nodes are down, resulting in unnecessary leadership
elections that cause temporary outages and reduced performance. To
resolve the issue, decrease the latency of the MKE node communication network.
Configure all containers in an MKE cluster to regularly synchronize with a
Network Time Protocol (NTP) server, to ensure consistency between all
containers in the cluster and to circumvent unexpected behavior that can lead
to poor performance.
Though MKE does not include a load balancer, you can configure your own to
balance user requests across all manager nodes. Before that, decide whether you
will add nodes to the load balancer using their IP address or their fully
qualified domain name (FQDN), and then use that strategy consistently
throughout the cluster. Take note of all IP addresses or FQDNs before you start
the installation.
If you plan to deploy both MKE and MSR, your load balancer must be able to
differentiate between the two: either by IP address or port number. Because
both MKE and MSR use port 443 by default, your options are as follows:
Configure your load balancer to expose either MKE or MSR on a port other than
443.
Configure your load balancer to listen on port 443 with separate virtual IP
addresses for MKE and MSR.
Configure separate load balancers for MKE and MSR, both listening on port
443.
If you want to install MKE in a high-availability configuration with a load
balancer in front of your MKE controllers, include the
appropriate IP address and FQDN for the load balancer VIP. To do so, use one
or more --san flags either with the ucp install command or in
interactive mode when MKE requests additional SANs.
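For example, a sketch of install-time SAN flags for a load balancer fronting
the managers; the image tag, FQDN, and VIP are placeholders:
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:<version> install \
  --san mke.company.example.org \
  --san <load-balancer-VIP> \
  --interactive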
MKE supports setting values for all IPVS-related parameters that are
exposed by kube-proxy.
Kube-proxy runs on each cluster node, its role being to load-balance traffic
whose destination is services (via cluster IPs and node ports) to the correct
backend pods. Of the modes in which kube-proxy can run, IPVS (IP Virtual
Server) offers the widest choice of load balancing algorithms and superior
scalability.
You can only enable IPVS for MKE at installation, and it persists throughout
the life of the cluster. Thus, you cannot switch to iptables at a
later stage or switch over existing MKE clusters to use IPVS proxier.
Use the --existing-config parameter when installing MKE. You can also
change these values post-install using MKE's ucp/config-toml
endpoint.
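As a hedged illustration of the post-install path, the configuration can be
retrieved and re-applied through the MKE API; the endpoint names follow the MKE
configuration documentation, while the host, credentials, and jq usage are
assumptions:
AUTHTOKEN=$(curl -sk -d '{"username":"<admin-user>","password":"<password>"}' https://<mke-host>/auth/login | jq -r .auth_token)
curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://<mke-host>/api/ucp/config-toml > mke-config.toml
After editing mke-config.toml, upload it back to the same endpoint:
curl -sk -H "Authorization: Bearer $AUTHTOKEN" --upload-file mke-config.toml https://<mke-host>/api/ucp/config-toml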
Caution
If you are using MKE 3.3.x with IPVS proxier and plan to upgrade to MKE
3.4.x, you must upgrade to MKE 3.4.3 or later as earlier versions of MKE
3.4.x do not support IPVS proxier.
You can customize MKE to use certificates signed by an External
Certificate Authority (ECA). When using your own certificates,
include a certificate bundle with the following:
ca.pem file with the root CA public certificate.
cert.pem file with the server certificate and any intermediate CA
public certificates. This certificate should also have Subject Alternative
Names (SANs) for all addresses used to reach the MKE manager.
key.pem file with a server private key.
You can either use separate certificates for every manager node or one
certificate for all managers. If you use separate certificates, you must use a
common SAN throughout. For example, MKE permits the following on a three-node
cluster:
node1.company.example.org with the SAN mke.company.org
node2.company.example.org with the SAN mke.company.org
node3.company.example.org with the SAN mke.company.org
If you use a single certificate for all manager nodes, MKE automatically copies
the certificate files both to new manager nodes and to those promoted to a
manager role.
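One possible way to supply the bundle, assuming the ucp-controller-server-certs
volume described earlier and the --external-server-cert install option, is to
pre-create the volume, copy the files into it, and point the installer at it;
treat the flag and paths as assumptions to verify against the MKE install
reference:
docker volume create ucp-controller-server-certs
cp ca.pem cert.pem key.pem /var/lib/docker/volumes/ucp-controller-server-certs/_data/
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:<version> install --external-server-cert --interactive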
Skip this step if you want to use the default named volumes.
MKE uses named volumes to persist data. If you want to customize
the drivers that manage such volumes, create the volumes
before installing MKE. During the installation process, the installer
will automatically detect the existing volumes and start using them.
Otherwise, MKE will create the default named volumes.
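For example, a sketch that pre-creates one of the named volumes listed earlier
with a custom driver; the driver name and options are placeholders:
docker volume create --driver <your-volume-driver> \
  --opt <driver-option>=<value> \
  ucp-kv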
Sets whether arptables rules apply to bridged network traffic.
If the bridge module is not loaded, and thus no bridges are present,
this key is not present.
call-ip6tables
Default: No default
MKE:1
Sets whether ip6tables rules apply to bridged network traffic.
If the bridge module is not loaded, and thus no bridges are present,
this key is not present.
call-iptables
Default: No default
MKE:1
Sets whether iptables rules apply to bridged network traffic.
If the bridge module is not loaded, and thus no bridges are present,
this key is not present.
filter-pppoe-tagged
Default: No default
MKE:0
Sets whether netfilter rules apply to bridged PPPOE network
traffic. If the bridge module is not loaded, and thus no bridges are
present, this key is not present.
filter-vlan-tagged
Default: No default
MKE:0
Sets whether netfilter rules apply to bridged VLAN network traffic. If
the bridge module is not loaded, and thus no bridges are present, this
key is not present.
pass-vlan-input-dev
Default: No default
MKE:0
Sets whether netfilter strips the incoming VLAN interface name from
bridged traffic. If the bridge module is not loaded, and thus no bridges
are present, this key is not present.
The *.vs.* default values persist, changing only when the ipvs
kernel module has not been previously loaded. For more information, refer
to the Linux kernel documentation.
Parameter
Values
Description
conf.all.accept_redirects
Default:1
MKE:0
Sets whether ICMP redirects are permitted. This key affects all
interfaces.
conf.all.forwarding
Default:0
MKE:1
Sets whether network traffic is forwarded. This key affects all
interfaces.
conf.all.route_localnet
Default:0
MKE:1
Sets 127/8 for local routing. This key
affects all interfaces.
conf.default.forwarding
Default:0
MKE:1
Sets whether network traffic is forwarded. This key
affects new interfaces.
conf.lo.forwarding
Default:0
MKE:1
Sets forwarding for localhost traffic.
ip_forward
Default:0
MKE:1
Sets whether traffic forwards between interfaces. For Kubernetes to run,
this parameter must be set to 1.
vs.am_droprate
Default:10
MKE:10
Sets the always mode drop rate used in mode 3 of the drop_rate
defense.
vs.amemthresh
Default:1024
MKE:1024
Sets the available memory threshold in pages, which is used in the
automatic modes of defense. When there is not enough available memory,
this enables the strategy and the variable is set to 2. Otherwise,
the strategy is disabled and the variable is set to 1.
vs.backup_only
Default:0
MKE:0
Sets whether the director function is disabled while the server is in
back-up mode, to avoid packet loops for DR/TUN methods.
vs.cache_bypass
Default:0
MKE:0
Sets whether packets forward directly to the original destination when
no cache server is available and the destination address is not local
(iph->daddr is RTN_UNICAST). This mostly applies to transparent web
cache clusters.
vs.conn_reuse_mode
Default:1
MKE:1
Sets how IPVS handles connections detected on port reuse. It is a
bitmap with the following values:
0 disables any special handling on port reuse. The new
connection is delivered to the same real server that was servicing the
previous connection, effectively disabling expire_nodest_conn.
bit 1 enables rescheduling of new connections when it is safe.
That is, whenever expire_nodest_conn and for TCP sockets, when
the connection is in TIME_WAIT state (which is only possible if
you use NAT mode).
bit 2 is bit 1 plus, for TCP connections, when connections
are in FIN_WAIT state, as this is the last state seen by load
balancer in Direct Routing mode. This bit helps when adding new
real servers to a very busy cluster.
vs.conntrack
Default:0
MKE:0
Sets whether connection-tracking entries are maintained for connections
handled by IPVS. Enable if connections handled by IPVS
are to be subject to stateful firewall rules. That is, iptables
rules that make use of connection tracking. Otherwise, disable this
setting to optimize performance. Connections handled by
the IPVS FTP application module have connection tracking entries
regardless of this setting, which is only available when IPVS is
compiled with CONFIG_IP_VS_NFCT enabled.
vs.drop_entry
Default:0
MKE:0
Sets whether entries are randomly dropped in the connection hash table,
to collect memory back for new connections. In the current
code, the drop_entry procedure can be activated every second, then
it randomly scans 1/32 of the whole and drops entries that are in the
SYN-RECV/SYNACK state, which should be effective against syn-flooding
attack.
The valid values of drop_entry are 0 to 3, where 0 indicates
that the strategy is always disabled, 1 and 2 indicate automatic
modes (when there is not enough available memory, the strategy
is enabled and the variable is automatically set to 2,
otherwise the strategy is disabled and the variable is set to
1), and 3 indicates that the strategy is always enabled.
vs.drop_packet
Default:0
MKE:0
Sets whether rate packets are dropped prior to being forwarded to real
servers. Rate 1 drops all incoming packets.
The value definition is the same as that for drop_entry. In
automatic mode, the following formula determines the rate:
rate = amemthresh / (amemthresh - available_memory) when available
memory is less than the available memory threshold. When mode 3 is
set, the always mode drop rate is controlled by the
/proc/sys/net/ipv4/vs/am_droprate.
vs.expire_nodest_conn
Default:0
MKE:0
Sets whether the load balancer silently drops packets when its
destination server is not available. This can be useful when the
user-space monitoring program deletes the destination server (due to
server overload or wrong detection) and later adds the server back, and
the connections to the server can continue.
If this feature is enabled, the load balancer terminates the connection
immediately whenever a packet arrives and its destination server is not
available, after which the client program will be notified that the
connection is closed. This is equivalent to the feature that is
sometimes required to flush connections when the destination is not
available.
vs.ignore_tunneled
Default:0
MKE:0
Sets whether IPVS configures the ipvs_property on all packets of
unrecognized protocols. This prevents users from routing such tunneled
protocols as IPIP, which is useful in preventing the rescheduling
packets that have been tunneled to the IPVS host (that is, to prevent
IPVS routing loops when IPVS is also acting as a real server).
vs.nat_icmp_send
Default:0
MKE:0
Sets whether ICMP error messages (ICMP_DEST_UNREACH) are sent for
VS/NAT when the load balancer receives packets from real servers but the
connection entries do not exist.
vs.pmtu_disc
Default:0
MKE:0
Sets whether all DF packets that exceed the PMTU are rejected with
FRAG_NEEDED, irrespective of the forwarding method. For the TUN
method, the flag can be disabled to fragment such packets.
vs.schedule_icmp
Default:0
MKE:0
Sets whether scheduling ICMP packets in IPVS is enabled.
vs.secure_tcp
Default:0
MKE:0
Sets the use of a more complicated TCP state transition table.
For VS/NAT, the secure_tcp defense delays entering the
TCP ESTABLISHED state until the three-way handshake completes. The
value definition is the same as that of drop_entry and
drop_packet.
vs.sloppy_sctp
Default:0
MKE:0
Sets whether IPVS is permitted to create a connection state on any
packet, rather than an SCTP INIT only.
vs.sloppy_tcp
Default:0
MKE:0
Sets whether IPVS is permitted to create a connection state on any
packet, rather than a TCP SYN only.
vs.snat_reroute
Default:0
MKE:1
Sets whether the route of SNATed packets is recalculated from real
servers as if they originate from the director. If disabled, SNATed
packets are routed as if they have been forwarded by the director.
If policy routing is in effect, then it is possible that the route
of a packet originating from a director is routed differently to a
packet being forwarded by the director.
If policy routing is not in effect, then the recalculated route will
always be the same as the original route. It is an optimization
to disable snat_reroute and avoid the recalculation.
vs.sync_persist_mode
Default:0
MKE:0
Sets the synchronization of connections when using persistence. The
possible values are defined as follows:
0 means all types of connections are synchronized.
1 attempts to reduce the synchronization traffic depending on
the connection type. For persistent services, synchronization is avoided
for normal connections and done only for persistence templates.
In such cases, for TCP and SCTP, it may be necessary to enable the
sloppy_tcp and sloppy_sctp flags on back-up servers. For non-persistent
services this optimization is not applied, and mode 0 is assumed.
vs.sync_ports
Default:1
MKE:1
Sets the number of threads that the master and back-up servers can use
for sync traffic. Every thread uses a single UDP port, thread 0 uses the
default port 8848, and the last thread uses port
8848+sync_ports-1.
vs.sync_qlen_max
Default: Calculated
MKE: Calculated
Sets a hard limit for queued sync messages that are not yet sent. It
defaults to 1/32 of the memory pages but actually represents
the number of messages. This protects against allocating large
amounts of memory when the sending rate is lower than the queuing
rate.
vs.sync_refresh_period
Default:0
MKE:0
Sets (in seconds) the difference in the reported connection timer that
triggers new sync messages. It can be used to avoid sync messages for
the specified period (or half of the connection timeout if it is lower)
if the connection state has not changed since last sync.
This is useful for normal connections with high traffic, to reduce
sync rate. Additionally, retry sync_retries times with period of
sync_refresh_period/8.
vs.sync_retries
Default:0
MKE:0
Sets sync retries with period of sync_refresh_period/8. Useful
to protect against loss of sync messages. The range of the
sync_retries is 0 to 3.
vs.sync_sock_size
Default:0
MKE:0
Sets the configuration of SNDBUF (master) or RCVBUF (slave) socket
limit. Default value is 0 (preserve system defaults).
vs.sync_threshold
Default:350
MKE:350
Sets the synchronization threshold, which is the minimum number of
incoming packets that a connection must receive before the
connection is synchronized. A connection will be synchronized every time
the number of its incoming packets modulus sync_period equals the
threshold. The range of the threshold is 0 to sync_period.
When sync_period and sync_refresh_period are 0, sync messages are sent only
for state changes, or only once when the packet count matches sync_threshold.
vs.sync_version
Default:1
MKE:1
Sets the version of the synchronization protocol to use when sending
synchronization messages. The possible values are:
0 selects the original synchronization protocol (version 0). This
should be used when sending synchronization messages to a legacy
system that only understands the original synchronization protocol.
1 selects the current synchronization protocol (version 1). This
should be used whenever possible.
Kernels with this sync_version entry are able to receive messages
of both version 1 and version 2 of the synchronization protocol.
The net.netfilter.nf_conntrack_<subtree> default values persist,
changing only when the nf_conntrack kernel module has not been
previously loaded. For more information, refer to the
Linux kernel documentation.
Parameter
Values
Description
acct
Default:0
MKE:0
Sets whether connection-tracking flow accounting is enabled. Adds 64-bit
byte and packet counter per flow.
buckets
Default: Calculated
MKE: Calculated
Sets the size of the hash table. If not specified during module loading,
the default size is calculated by dividing total memory by 16384 to
determine the number of buckets. The hash table will never have fewer
than 1024 and never more than 262144 buckets. This sysctl is only
writeable in the initial net namespace.
checksum
Default:0
MKE:0
Sets whether the checksum of incoming packets is verified. Packets with
bad checksums are in an invalid state. If this is enabled, such packets
are not considered for connection tracking.
dccp_loose
Default:0
MKE:1
Sets whether picking up already established connections for Datagram
Congestion Control Protocol (DCCP) is permitted.
dccp_timeout_closereq
Default: Distribution dependent
MKE:64
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_closing
Default: Distribution dependent
MKE:64
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_open
Default: Distribution dependent
MKE:43200
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_partopen
Default: Distribution dependent
MKE:480
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_request
Default: Distribution dependent
MKE:240
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_respond
Default: Distribution dependent
MKE:480
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_timewait
Default: Distribution dependent
MKE:240
The parameter description is not yet available in the Linux kernel
documentation.
events
Default:0
MKE:1
Sets whether the connection tracking code provides userspace with
connection-tracking events through ctnetlink.
expect_max
Default: Calculated
MKE:1024
Sets the maximum size of the expectation table. The default value is
nf_conntrack_buckets / 256. The minimum is 1.
frag6_high_thresh
Default: Calculated
MKE:4194304
Sets the maximum memory used to reassemble IPv6 fragments. When
nf_conntrack_frag6_high_thresh bytes of memory is allocated for this
purpose, the fragment handler tosses packets until
nf_conntrack_frag6_low_thresh is reached. The size of this parameter
is calculated based on system memory.
frag6_low_thresh
Default: Calculated
MKE:3145728
See nf_conntrack_frag6_high_thresh. The size of this parameter is
calculated based on system memory.
frag6_timeout
Default:60
MKE:60
Sets the time to keep an IPv6 fragment in memory.
generic_timeout
Default:600
MKE:600
Sets the default for a generic timeout. This refers to layer 4 unknown
and unsupported protocols.
gre_timeout
Default:30
MKE:30
Sets the GRE timeout for entries in the conntrack table.
gre_timeout_stream
Default:180
MKE:180
Sets the GRE timeout for streamed connections. This extended timeout
is used when a GRE stream is detected.
helper
Default:0
MKE:0
Sets whether the automatic conntrack helper assignment is enabled.
If disabled, you must set up iptables rules to assign
helpers to connections. See the CT target description in the
iptables-extensions(8) man page for more information.
icmp_timeout
Default:30
MKE:30
Sets the default for ICMP timeout.
icmpv6_timeout
Default:30
MKE:30
Sets the default for ICMP6 timeout.
log_invalid
Default:0
MKE:0
Sets whether invalid packets of a type specified by value are logged.
max
Default: Calculated
MKE:131072
Sets the maximum number of allowed connection tracking entries. This
value is set to nf_conntrack_buckets by default.
Connection-tracking entries are added to the table twice, once for the
original direction and once for the reply direction (that is, with
the reversed address). Thus, with default settings a maxed-out
table will have an average hash chain length of 2, not 1.
sctp_timeout_closed
Default: Distribution dependent
MKE:10
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_cookie_echoed
Default: Distribution dependent
MKE:3
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_cookie_wait
Default: Distribution dependent
MKE:3
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_established
Default: Distribution dependent
MKE:432000
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_heartbeat_acked
Default: Distribution dependent
MKE:210
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_heartbeat_sent
Default: Distribution dependent
MKE:30
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_shutdown_ack_sent
Default: Distribution dependent
MKE:3
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_shutdown_recd
Default: Distribution dependent
MKE:0
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_shutdown_sent
Default: Distribution dependent
MKE:0
The parameter description is not yet available in the Linux kernel
documentation.
tcp_be_liberal
Default:0
MKE:0
Sets whether only out of window RST segments are marked as INVALID.
tcp_loose
Default:0
MKE:1
Sets whether already established connections are picked up.
tcp_max_retrans
Default:3
MKE:3
Sets the maximum number of packets that can be retransmitted without
receiving an acceptable ACK from the destination. If this number
is reached, a shorter timer is started.
tcp_timeout_close
Default: Distribution dependent
MKE:10
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_close_wait
Default: Distribution dependent
MKE:3600
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_fin_wait
Default: Distribution dependent
MKE:120
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_last_ack
Default: Distribution dependent
MKE:30
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_max_retrans
Default: Distribution dependent
MKE:300
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_syn_recv
Default: Distribution dependent
MKE:60
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_syn_sent
Default: Distribution dependent
MKE:120
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_time_wait
Default: Distribution dependent
MKE:120
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_unacknowledged
Default: Distribution dependent
MKE:30
The parameter description is not yet available in the Linux kernel
documentation.
timestamp
Default:0
MKE:0
Sets whether connection-tracking flow timestamping is enabled.
udp_timeout
Default:30
MKE:30
Sets the UDP timeout.
udp_timeout_stream
Default:120
MKE:120
Sets the extended timeout that is used whenever a UDP stream is
detected.
The net.nf_conntrack_<subtree> default values persist, changing only
when the nf_conntrack kernel module has not been previously loaded.
For more information, refer to the Linux kernel documentation.
Parameter
Values
Description
max
Default: Calculated
MKE:131072
Sets the maximum number of connections to track. The size of this
parameter is calculated based on system memory.
To protect kernel parameters from being overridden by kubelet, you can either
invoke the --kube-protect-kernel-defaults option at the time of
MKE installation or, following MKE installation, adjust the
cluster_config|kube_protect_kernel_defaults parameter in the MKE configuration file.
Important
When enabled, kubelet can fail to start if the kernel parameters on
the nodes are not properly set. You must set those kernel parameters
on the nodes before you install MKE or before adding a new node to an
existing cluster.
Create a configuration file called /etc/sysctl.d/90-kubelet.conf
and add the following snippet to it:
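The values that kubelet expects depend on its configuration; the following is a
minimal sketch of settings commonly required when kernel-default protection is
enabled, not an MKE-mandated list:

  vm.panic_on_oom=0
  vm.overcommit_memory=1
  kernel.panic=10
  kernel.panic_on_oops=1

Apply the settings with sudo sysctl -p /etc/sysctl.d/90-kubelet.conf or by
rebooting the node.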
The ucp install command runs in interactive mode,
prompting you for the necessary configuration values. For more information
about the ucp install command, including how to install MKE on a
system with SELinux enabled, refer to the MKE Operations Guide:
mirantis/ucp install.
Note
MKE installs Project Calico for Kubernetes container-to-container
communication. However, you may install an alternative CNI plugin, such as
Cilium, Weave, or Flannel. For more information, refer to the
MKE Operations Guide: Installing an unmanaged CNI plugin.
After you Install the MKE image, proceed with
downloading your MKE license as described below. This section also contains
steps to apply your new license using the MKE web UI.
Warning
Users are not authorized to run MKE without a valid license. For more
information, refer to Mirantis Agreements and Terms.
To download your MKE license:
Open an email from Mirantis Support with the subject
Welcome to Mirantis’ CloudCare Portal and follow the instructions
for logging in.
If you did not receive the CloudCare Portal email, it is likely that
you have not yet been added as a Designated Contact. To remedy this,
contact your Designated Administrator.
In the top navigation bar, click Environments.
Click the Cloud Name associated with the license you want to
download.
Scroll down to License Information and click the
License File URL. A new tab opens in your browser.
Click View file to download your license file.
To update your license settings in the MKE web UI:
Log in to your MKE instance using an administrator account.
In the left navigation, click Settings.
On the General tab, click Apply new license. A file
browser dialog displays.
Navigate to where you saved the license key (.lic) file, select it,
and click Open. MKE automatically updates with the new settings.
Note
Though MKE is generally a subscription-only service, Mirantis offers a free
trial license by request. Use our contact form to request a free trial license.
This section describes how to customize your MKE installation on AWS.
Note
You may skip this topic if you plan to install MKE on AWS with no
customizations or if you will only deploy Docker Swarm workloads. Refer to
Install the MKE image for the appropriate installation instruction.
Tag your instance, VPC, security-groups, and subnets by specifying
kubernetes.io/cluster/<unique-cluster-id> in the Key field
and <cluster-type> in the Value field.
Possible <cluster-type> values are as follows:
owned, if the cluster owns and manages the resources that it creates
shared, if the cluster shares its resources between multiple clusters
For example, Key: kubernetes.io/cluster/1729543642a6 and
Value: owned.
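As a sketch, assuming the AWS CLI and hypothetical resource IDs, the tag can be
applied as follows:

  aws ec2 create-tags \
    --resources i-0abc123example vpc-0abc123example sg-0abc123example subnet-0abc123example \
    --tags Key=kubernetes.io/cluster/1729543642a6,Value=owned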
To enable introspection and resource provisioning, specify an instance
profile with appropriate policies for manager nodes. The following is
an example of a very permissive instance profile:
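A sketch of one such permissive manager-node policy in AWS IAM JSON; the
actions listed are illustrative assumptions rather than an MKE requirement, and
you should narrow them to suit your security policies:

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "ec2:*",
          "elasticloadbalancing:*",
          "autoscaling:*"
        ],
        "Resource": "*"
      }
    ]
  }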
To enable access to dynamically provisioned resources, specify an instance
profile with appropriate policies for worker nodes. The following is an
example of a very permissive instance profile:
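A sketch of a similarly permissive worker-node policy; again, the actions are
illustrative assumptions to be narrowed for production:

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "ec2:Describe*",
          "ec2:AttachVolume",
          "ec2:DetachVolume",
          "elasticloadbalancing:*"
        ],
        "Resource": "*"
      }
    ]
  }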
After you perform the steps described in Prerequisites, run the
following command to install MKE on a master node. Substitute <ucp-ip> with
the private IP address of the master node.
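A sketch of the install command; the image tag 3.x.y is a placeholder for your
MKE version, and the --cloud-provider flag shown for AWS integration is an
assumption to verify against the mirantis/ucp install reference:

  docker container run --rm -it --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.x.y install \
    --host-address <ucp-ip> \
    --cloud-provider aws \
    --interactive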
Mirantis Kubernetes Engine (MKE) closely integrates with Microsoft
Azure for its Kubernetes Networking and Persistent Storage feature set.
MKE deploys the Calico CNI provider. In Azure, the Calico CNI leverages
the Azure networking infrastructure for data path networking and the
Azure IPAM for IP address management.
To avoid significant issues during the installation process, you must meet the
following infrastructure prerequisites to successfully deploy MKE on Azure.
Deploy all MKE nodes (managers and workers) into the
same Azure resource group. You can deploy the Azure networking components
(virtual network, subnets, security groups) in a second Azure resource
group.
Size the Azure virtual network and subnet appropriately for
your environment, because addresses from this pool will be consumed by
Kubernetes Pods.
Attach all MKE worker and manager nodes to the same Azure subnet.
Set internal IP addresses for all nodes to Static rather than
the Dynamic default.
Match the Azure virtual machine object name to the Azure
virtual machine computer name and the node operating system hostname that is
the FQDN of the host (including domain names). All characters in the names
must be in lowercase.
Ensure the presence of an Azure Service Principal with Contributor
access to the Azure resource group hosting the MKE nodes. Kubernetes uses
this Service Principal to communicate with the Azure API. The Service
Principal ID and Secret Key are MKE prerequisites.
If you are using a separate resource group for the networking components,
the same Service Principal must have NetworkContributor access to this
resource group.
Ensure that the NSG is open between all IPs on the Azure subnet that you pass
into MKE during installation. Kubernetes Pods integrate into the underlying
Azure networking stack, from an IPAM and routing perspective, with the Azure
CNI IPAM module. As such, Azure network security groups (NSG) impact pod-to-pod
communication. End users may expose containerized services on a range of
underlying ports, which results in a manual process of opening an NSG port
every time a new containerized service is deployed on the platform. This
affects only workloads that deploy on the Kubernetes orchestrator.
To limit exposure, restrict the use of the Azure subnet to container host
VMs and Kubernetes Pods. Additionally, you can leverage Kubernetes Network
Policies to provide micro segmentation for containerized applications and
services.
The MKE installation requires the following information:
subscriptionId
Azure Subscription ID in which to deploy the MKE objects
tenantId
Azure Active Directory Tenant ID in which to deploy the MKE objects
MKE configures the Azure IPAM module for Kubernetes so that it can allocate IP
addresses for Kubernetes Pods. Per Azure IPAM module requirements, the
configuration of each Azure VM that is part of the Kubernetes cluster must
include a pool of IP addresses.
You can use automatic or manual IPs provisioning for the Kubernetes cluster on
Azure.
Automatic provisioning
Allows for IP pool configuration and maintenance for standalone Azure
virtual machines (VMs). This service runs within the calico-node
daemonset and provisions 128 IP addresses for each node by default.
Note
If you are using a VXLAN data plane, MKE automatically uses Calico IPAM.
It is not necessary to do anything specific for Azure IPAM.
New MKE installations use Calico VXLAN as the default data plane (the
MKE configuration calico_vxlan is set to true). MKE does not use
Calico VXLAN if the MKE version is lower than 3.3.0 or if you upgrade
MKE from lower than 3.3.0 to 3.3.0 or higher.
Manual provisioning
Manual provisioning of additional IP address for each Azure VM can be done
through the Azure Portal, the Azure CLI az network nic
ip-config create, or an ARM template.
For MKE to integrate with Microsoft Azure, the azure.json configuration
file must be identical across all manager and worker nodes in your cluster. For
Linux nodes, place the file in /etc/kubernetes on each host. For Windows
nodes, place the file in C:\k on each host. Because root owns the
configuration file, set its permissions to 0644 to ensure that the
container user has read access.
The following is an example template for azure.json.
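A sketch of such a template, using fields commonly found in the Kubernetes
Azure cloud provider configuration; the exact field set required is an
assumption to verify for your MKE version:

  {
    "cloud": "AzurePublicCloud",
    "tenantId": "<tenant-id>",
    "subscriptionId": "<subscription-id>",
    "aadClientId": "<service-principal-app-id>",
    "aadClientSecret": "<service-principal-secret>",
    "resourceGroup": "<resource-group-name>",
    "location": "<azure-region>",
    "subnetName": "<subnet-name>",
    "securityGroupName": "<nsg-name>",
    "vnetName": "<vnet-name>",
    "useInstanceMetadata": true
  }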
To avoid significant issues during the installation process, follow
these guidelines to either use an appropriately sized network in Azure or
take the necessary actions to fit within the subnet.
Configure the subnet and the virtual network associated with the primary
interface of the Azure VMs with an adequate address prefix/range. The number of
required IP addresses depends on the workload and the number of nodes in the
cluster.
For example, for a cluster of 256 nodes, make sure that the address space
of the subnet and the virtual network can allocate at least 128 * 256
IP addresses, in order to run a maximum of 128 pods concurrently on a
node. This is in addition to initial IP allocations to VM
network interface cards (NICs) during Azure resource creation.
Accounting for the allocation of IP addresses to NICs that occurs during VM
bring-up, set the address space of the subnet and virtual network to
10.0.0.0/16. This ensures that the network can dynamically allocate
at least 32768 addresses, plus a buffer for initial allocations for
primary IP addresses.
Note
The Azure IPAM module queries the metadata of an Azure VM to obtain a list
of the IP addresses that are assigned to the VM NICs. The IPAM module
allocates these IP addresses to Kubernetes pods. You configure the IP
addresses as ipConfigurations in the NICs associated with a VM or
scale set member, so that Azure IPAM can provide the addresses to Kubernetes
on request.
Manually provision IP address pools as part of an Azure VM scale set
Configure IP Pools for each member of the VM scale set during
provisioning by associating multiple ipConfigurations with the scale
set’s networkInterfaceConfigurations.
The following example networkProfile configuration for an ARM template
configures pools of 32 IP addresses for each VM in the VM scale set.
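As an abbreviated sketch only, such a networkProfile might take the following
shape; the NIC and ipConfiguration names and the subnetRef variable are
assumptions, and only the first two of the 32 ipConfigurations are shown, with
the remainder following the same pattern as ipconfig1:

  "networkProfile": {
    "networkInterfaceConfigurations": [
      {
        "name": "nic-0",
        "properties": {
          "primary": true,
          "ipConfigurations": [
            {
              "name": "ipconfig0",
              "properties": {
                "primary": true,
                "subnet": { "id": "[variables('subnetRef')]" }
              }
            },
            {
              "name": "ipconfig1",
              "properties": {
                "subnet": { "id": "[variables('subnetRef')]" }
              }
            }
          ]
        }
      }
    ]
  }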
During an MKE installation, you can alter the number of Azure IP
addresses that MKE automatically provisions for pods.
By default, MKE provisions 128 addresses, from the same Azure subnet as the
hosts, for each VM in the cluster. If, however, you have manually attached
additional IP addresses to the VMs (by way of an ARM template, the Azure CLI,
or the Azure Portal), or you are deploying into a small Azure subnet (less than
/16), you can use the --azure-ip-count flag at install time.
Note
Do not set the --azure-ip-count variable to a value of less than 6 if
you have not manually provisioned additional IP addresses for each VM. The
MKE installation needs at least 6 IP addresses to allocate to the core MKE
components that run as Kubernetes pods (in addition to the VM’s private IP
address).
Below are several example scenarios that require you to define the
--azure-ip-count variable.
Scenario 1: Manually provisioned addresses
If you have manually provisioned additional IP addresses for each VM and want
to disable MKE from dynamically provisioning more IP addresses, you must
pass --azure-ip-count 0 into the MKE installation command.
Scenario 2: Reducing the number of provisioned addresses
Pass --azure-ip-count <custom_value> into the MKE installation command
to reduce the number of IP addresses dynamically allocated from 128 to a
custom value due to:
Primary use of the Swarm Orchestrator
Deployment of MKE on a small Azure subnet (for example, /24)
Plans to run a small number of Kubernetes pods on each node
To adjust this value post-installation, refer to the instructions on how to
download the MKE configuration file, change the value, and update
the configuration via the API.
Note
If you reduce the value post-installation, existing VMs will not
reconcile and you will need to manually edit the IP count in Azure.
Run the following command to install MKE on a manager node.
The --pod-cidr option maps to the IP address range that you configured
for the Azure subnet.
Note
The pod-cidr range must match the Azure virtual network’s subnet
attached to the hosts. For example, if the Azure virtual network had the
range 172.0.0.0/16 with VMs provisioned on an Azure subnet of
172.0.1.0/24, then the Pod CIDR should also be 172.0.1.0/24.
This requirement applies only when MKE does not use the VXLAN data plane.
If MKE uses the VXLAN data plane, the pod-cidr range must be
different than the node IP subnet.
The --host-address maps to the private IP address of the master node.
The --azure-ip-count serves to adjust the amount of IP addresses
provisioned to each VM.
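Putting these options together, a sketch of the install command follows; the
image tag 3.x.y is a placeholder for your MKE version, and you should verify
the exact flag set against the mirantis/ucp install reference:

  docker container run --rm -it --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.x.y install \
    --host-address <node-private-ip> \
    --pod-cidr <azure-subnet-cidr> \
    --azure-ip-count <ip-count> \
    --cloud-provider Azure \
    --interactive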
You can create your own Azure custom roles for use with MKE. You can assign
these roles to users, groups, and service principals at management group (in
preview only), subscription, and resource group scopes.
Deploy an MKE cluster into a single resource group
A resource group
is a container that holds resources for an Azure solution. These resources are
the virtual machines (VMs), networks, and storage accounts that are associated
with the swarm.
To create a custom all-in-one role with permissions to deploy an MKE cluster
into a single resource group:
Create the role permissions JSON file.
For example:
{"Name":"Docker Platform All-in-One","IsCustom":true,"Description":"Can install and manage Docker platform.","Actions":["Microsoft.Authorization/*/read","Microsoft.Authorization/roleAssignments/write","Microsoft.Compute/availabilitySets/read","Microsoft.Compute/availabilitySets/write","Microsoft.Compute/disks/read","Microsoft.Compute/disks/write","Microsoft.Compute/virtualMachines/extensions/read","Microsoft.Compute/virtualMachines/extensions/write","Microsoft.Compute/virtualMachines/read","Microsoft.Compute/virtualMachines/write","Microsoft.Network/loadBalancers/read","Microsoft.Network/loadBalancers/write","Microsoft.Network/loadBalancers/backendAddressPools/join/action","Microsoft.Network/networkInterfaces/read","Microsoft.Network/networkInterfaces/write","Microsoft.Network/networkInterfaces/join/action","Microsoft.Network/networkSecurityGroups/read","Microsoft.Network/networkSecurityGroups/write","Microsoft.Network/networkSecurityGroups/join/action","Microsoft.Network/networkSecurityGroups/securityRules/read","Microsoft.Network/networkSecurityGroups/securityRules/write","Microsoft.Network/publicIPAddresses/read","Microsoft.Network/publicIPAddresses/write","Microsoft.Network/publicIPAddresses/join/action","Microsoft.Network/virtualNetworks/read","Microsoft.Network/virtualNetworks/write","Microsoft.Network/virtualNetworks/subnets/read","Microsoft.Network/virtualNetworks/subnets/write","Microsoft.Network/virtualNetworks/subnets/join/action","Microsoft.Resources/subscriptions/resourcegroups/read","Microsoft.Resources/subscriptions/resourcegroups/write","Microsoft.Security/advancedThreatProtectionSettings/read","Microsoft.Security/advancedThreatProtectionSettings/write","Microsoft.Storage/*/read","Microsoft.Storage/storageAccounts/listKeys/action","Microsoft.Storage/storageAccounts/write"],"NotActions":[],"AssignableScopes":["/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"]}
Compute resources act as servers for running containers.
To create a custom role to deploy MKE compute resources only:
Create the role permissions JSON file.
For example:
{"Name":"Docker Platform","IsCustom":true,"Description":"Can install and run Docker platform.","Actions":["Microsoft.Authorization/*/read","Microsoft.Authorization/roleAssignments/write","Microsoft.Compute/availabilitySets/read","Microsoft.Compute/availabilitySets/write","Microsoft.Compute/disks/read","Microsoft.Compute/disks/write","Microsoft.Compute/virtualMachines/extensions/read","Microsoft.Compute/virtualMachines/extensions/write","Microsoft.Compute/virtualMachines/read","Microsoft.Compute/virtualMachines/write","Microsoft.Network/loadBalancers/read","Microsoft.Network/loadBalancers/write","Microsoft.Network/networkInterfaces/read","Microsoft.Network/networkInterfaces/write","Microsoft.Network/networkInterfaces/join/action","Microsoft.Network/publicIPAddresses/read","Microsoft.Network/virtualNetworks/read","Microsoft.Network/virtualNetworks/subnets/read","Microsoft.Network/virtualNetworks/subnets/join/action","Microsoft.Resources/subscriptions/resourcegroups/read","Microsoft.Resources/subscriptions/resourcegroups/write","Microsoft.Security/advancedThreatProtectionSettings/read","Microsoft.Security/advancedThreatProtectionSettings/write","Microsoft.Storage/storageAccounts/read","Microsoft.Storage/storageAccounts/listKeys/action","Microsoft.Storage/storageAccounts/write"],"NotActions":[],"AssignableScopes":["/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"]}
MKE includes support for installing and running MKE on Google Cloud Platform
(GCP). You will learn in this section how to prepare your system for MKE
installation on GCP, how to perform the installation, and some limitations
with the support for GCP on MKE.
All MKE instances have the necessary authorization for managing cloud
resources.
GCP defines authorization through the use of service accounts, roles, and
access scopes. For information on how to best configure the authorization
required for your MKE instances, refer to
Google Cloud official documentation: Service accounts.
An example of a permissible role for a service account is roles/owner,
and an example of an access scope that provides access to most Google
services is https://www.googleapis.com/auth/cloud-platform. As a best
practice, define a broad access scope such as this to an instance and then
restrict access using roles.
All of your MKE instances include the same prefix.
Each instance is tagged with the prefix of its associated instance names. For
example, if the instance names are testcluster-m1 and
testcluster-m2, tag the associated instance with
testcluster.
An MKE cluster that is running Kubernetes 1.13.0 or later, which does not
already have network load-balancing functionality.
A cluster network configuration that is compatible with MetalLB.
Available IPv4 addresses that MetalLB can allocate.
BGP operating mode requires one or more routers capable of communicating with
BGP.
When using the L2 operating mode, traffic on port 7946 (TCP and UDP; you can
configure a different port) must be allowed between nodes, as required by
memberlist.
Verification that kube-proxy is running in iptables mode.
Verification of the absence of any cloud provider configuration.
To install MKE on an offline host, you must first use a separate computer with
an Internet connection to download a single package with all the images and
then copy that package to the host where you will install MKE. Once the package
is on the host and loaded, you can install MKE offline as described in
Install the MKE image.
Note
During the offline installation, both manager and worker nodes must be
offline.
This topic describes how to uninstall MKE from your cluster. After uninstalling
MKE, your instances of MCR will continue running in swarm mode and your
applications will run normally. You will not, however, be able to do the
following unless you reinstall MKE:
Enforce role-based access control (RBAC) to the cluster.
Monitor and manage the cluster from a central place.
Join new nodes using docker swarm join.
Note
You cannot join new nodes to your cluster after uninstalling MKE because
your cluster will be in swarm mode, and swarm mode relies on MKE to
provide the CA certificates that allow nodes to communicate with each
other. After the certificates expire, the nodes will not be able to
communicate at all. Either reinstall MKE before the certificates expire,
or disable swarm mode by running docker swarm leave --force on every
node.
To uninstall MKE:
Note
If SELinux is enabled, you must temporarily disable it prior to running the
uninstall-ucp command.
Log in to a manager node using SSH.
Run the uninstall-ucp command in interactive mode, which prompts
you for the necessary configuration values:
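A sketch of the command; the image tag 3.x.y is a placeholder for your
installed MKE version, and /var/log is mounted so that the uninstall logs
remain accessible, as described in the note below:

  docker container run --rm -it --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log:/var/log \
    mirantis/ucp:3.x.y uninstall-ucp --interactive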
On each manager node, remove the remaining MKE volumes:
docker volume rm $(docker volume ls -f name=ucp -q)
Note
If the uninstall-ucp command fails, refer to the logs in /var/log on any
manager node. Be aware that you will not be able to access the logs if the
volume /var/log:/var/log is not mounted while running the ucp container.
MKE keeps the configuration by default in
case you want to reinstall MKE later with the same configuration. For all
available uninstall-ucp options, refer to
mirantis/ucp uninstall-ucp.
Optional. Restore the host IP tables to their pre-MKE installation values by
restarting the node.
Note
The Calico network plugin changed the host IP tables from their original
values during MKE installation.
Swarm-only mode is an MKE configuration that supports only Swarm orchestration.
Lacking Kubernetes and its operational and health-check dependencies, the
resulting installation is highly stable and smaller than a typical
mixed-orchestration MKE installation.
You can only enable or disable Swarm-only mode at the time of MKE installation.
MKE preserves the Swarm-only setting through upgrades, backups, and system
restoration. Installing MKE in Swarm-only mode pulls only the images required
to run MKE in this configuration. Refer to Swarm-only images for more
information.
Note
Installing MKE in Swarm-only mode removes all Kubernetes options from the
MKE web UI.
In addition, MKE includes the --swarm-only flag with the bootstrapper
images command, which you can use to pull or to check the
required images on manager nodes.
Caution
To restore Swarm-only clusters, invoke the ucp restore command
with the --swarm-only option.
In Swarm-only mode, MKE runs the Prometheus server and the authenticating proxy
in a single container on each manager node. Thus, unlike in conventional MKE
installations, you cannot configure Prometheus server placement. Prometheus
does not collect Kubernetes metrics in Swarm-only mode, and it requires an
additional reserved port on manager nodes: 12387.
The MKE Operations Guide provides the comprehensive information
you need to run the MKE container orchestration platform. The guide is
intended for anyone who needs to effectively develop and securely administer
applications at scale, on private clouds, public clouds, and on bare metal.
You can access an MKE cluster in a variety of ways including through the MKE
web UI, Docker CLI, and kubectl (the Kubernetes CLI). To use the
Docker CLI and kubectl with MKE, first download a client certificate
bundle. This topic describes the MKE web UI, how to download and configure the
client bundle, and how to configure kubectl with MKE.
MKE allows you to control your cluster visually using the web UI. Role-based
access control (RBAC) gives administrators and non-administrators access to
the following web UI features:
Administrators:
Manage cluster configurations.
View and edit all cluster images, networks, volumes, and containers.
Manage the permissions of users, teams, and organizations.
Grant node-specific task scheduling permissions to users.
Non-administrators:
View and edit all cluster images, networks, volumes, and containers.
Requires administrator to grant access.
To access the MKE web UI:
Open a browser and navigate to https://<ip-address> (substituting
<ip-address> with the IP address of the machine that ran
docker run).
The expected Docker CLI server version starts with ucp/, and the
expected kubectl context name starts with ucp_.
Optional. Change your context directly using the client certificate bundle
.zip files. In the directory where you downloaded the user bundle, add
the new context:
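One possible sketch, assuming your Docker CLI supports importing the bundle
archive as a context and using myucp as an arbitrary context name; verify this
against your client bundle contents and Docker CLI version:

  docker context import myucp ucp-bundle-<username>.zip
  docker context use myucp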
If you use the client certificate bundle with buildkit, make
sure that builds are not accidentally scheduled on manager nodes. For more
information, refer to Manage services node deployment.
MKE installations include Kubernetes. Users can
deploy, manage, and monitor Kubernetes using either the MKE web UI or kubectl.
To install and use kubectl:
Identify which version of Kubernetes you are running by using the MKE web
UI, the MKE API version endpoint, or the Docker CLI
docker version command with the client bundle.
Caution
Kubernetes requires that kubectl and Kubernetes be within one
minor version of each other.
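As an illustration, assuming a Linux amd64 workstation, kubectl can be
installed as follows; the version shown is a placeholder to replace with the
version reported by your cluster:

  KUBE_VERSION=v1.27.5   # replace with the version reported by your MKE cluster
  curl -LO "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubectl"
  chmod +x kubectl
  sudo mv kubectl /usr/local/bin/kubectl
  kubectl version --client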
Helm recommends that you specify a Role and RoleBinding to limit
the scope of Tiller to a particular namespace. Refer to the
official Helm documentation
for more information.
With MKE, you can add labels to your nodes. Labels are metadata
that describe the node, such as:
node role (development, QA, production)
node region (US, EU, APAC)
disk type (HDD, SSD)
Once you apply a label to a node, you can specify constraints when deploying a
service to ensure that the service only runs on nodes that meet particular
criteria.
Hint
Use resource sets (MKE collections or Kubernetes namespaces) to organize access to your cluster, rather than creating labels
for authorization and permissions to resources.
The following example procedure applies the ssd label to a node.
Log in to the MKE web UI with administrator credentials.
Click Shared Resources in the navigation menu to expand the
selections.
Click Nodes. The details pane will display the full list of
nodes.
Click the node on the list that you want to attach labels to. The details
pane will transition, presenting the Overview information
for the selected node.
Click the settings icon in the upper-right corner to open the
Edit Node page.
Navigate to the Labels section and click Add
Label.
Add a label, entering disk into the Key field and ssd
into the Value field.
Click Save to dismiss the Edit Node page and return
to the node Overview.
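With a client bundle sourced, the same label can also be applied from the
command line; the node name is a placeholder:

  docker node update --label-add disk=ssd <node-name>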
The following example procedure deploys a service with a constraint that
ensures that the service only runs on nodes with SSD storage
node.labels.disk==ssd.
To deploy an application stack with service constraints:
Log in to the MKE web UI with administrator credentials.
Verify that the target node orchestrator is set to Swarm.
Click Shared Resources in the left-side navigation panel to
expand the selections.
Click Stacks. The details pane will display the full list of
stacks.
Click the Create Stack button to open the Create
Application page.
Under 1. Configure Application, enter “wordpress” into the
Name field.
Under ORCHESTRATOR NODE, select Swarm Services.
Under 2. Add Application File, paste the following stack file in
the docker-compose.yml editor:
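A minimal sketch of such a stack file, assuming the public wordpress image and
an arbitrary published port; the deploy constraint is the part that ties the
service to the ssd label:

  version: "3.7"
  services:
    wordpress:
      image: wordpress:latest
      ports:
        - "8080:80"
      deploy:
        placement:
          constraints:
            - node.labels.disk == ssd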
If a node is set to use Kubernetes as its orchestrator while simultaneously
running Swarm services, you must deploy placement constraints to prevent those
services from being scheduled on the node.
The necessary service constraints will be automatically adopted by any new
MKE-created Swarm services, as well as by older Swarm services that you have
updated. MKE does not automatically add placement constraints, however, to
Swarm services that were created using older versions of MKE, as to do so would
restart the service tasks.
To add placement constraints to older Swarm services:
Identify the Swarm services that do not have placement constraints:
services=$(docker service ls -q)
for service in $services; do
  if docker service inspect $service --format '{{.Spec.TaskTemplate.Placement.Constraints}}' | grep -q -v 'node.labels.com.docker.ucp.orchestrator.swarm==true'; then
    name=$(docker service inspect $service --format '{{.Spec.Name}}')
    if [ "$name" = "ucp-agent" ] || [ "$name" = "ucp-agent-win" ] || [ "$name" = "ucp-agent-s390x" ]; then
      continue
    fi
    echo "Service $name (ID: $service) is missing the node.labels.com.docker.ucp.orchestrator.swarm=true placement constraint"
  fi
done
Add placement constraints to the Swarm services you identified:
Note
All service tasks will restart, thus causing some amount of service
downtime.
services=$(docker service ls -q)
for service in $services; do
  if docker service inspect $service --format '{{.Spec.TaskTemplate.Placement.Constraints}}' | grep -q -v 'node.labels.com.docker.ucp.orchestrator.swarm=true'; then
    name=$(docker service inspect $service --format '{{.Spec.Name}}')
    if [ "$name" = "ucp-agent" ] || [ "$name" = "ucp-agent-win" ]; then
      continue
    fi
    echo "Updating service $name (ID: $service)"
    docker service update --detach=true --constraint-add node.labels.com.docker.ucp.orchestrator.swarm==true $service
  fi
done
Add or remove a service constraint using the MKE web UI
You can declare the deployment constraints in your docker-compose.yml
file or when you create a stack. Also, you can apply constraints when
you create a service.
To add or remove a service constraint:
Verify whether a service has deployment constraints:
Navigate to the Services page and select that service.
In the details pane, click Constraints to list the constraint
labels.
Edit the constraints on the service:
Click Configure and select Details to open the
Update Service page.
A SAN (Subject Alternative Name) is a structured means for associating various
values (such as domain names, IP addresses, email addresses, URIs, and so on)
with a security certificate.
MKE always runs with HTTPS enabled. As such, whenever you connect to MKE, you
must ensure that the MKE certificates recognize the host name in use. For
example, if MKE is behind a load balancer that forwards traffic to your MKE
instance, your requests will not be for the MKE host name or IP address but for
the host name of the load balancer. Thus, MKE will reject the requests, unless
you include the address of the load balancer as a SAN in the MKE certificates.
Note
To use your own TLS certificates, confirm first that these certificates
have the correct SAN values.
To use the self-signed certificate that MKE offers out-of-the-box, you can
use the --san argument to set up the SANs during MKE
deployment.
To add new SANs using the MKE web UI:
Log in to the MKE web UI using administrator credentials.
Navigate to the Nodes page.
Click on a manager node to display the details pane for that node.
Click Configure and select Details.
In the SANs section, click Add SAN and enter one or
more SANs for the cluster.
Click Save.
Repeat for every existing manager node in the cluster.
Note
Thereafter, the SANs are automatically applied to any new manager nodes
that join the cluster.
To add new SANs using the MKE CLI:
Get the current set of SANs for the given manager node:
docker node inspect --format '{{ index .Spec.Labels "com.docker.ucp.SANs"}}' <node-id>
Example of system response:
default-cs,127.0.0.1,172.17.0.1
Append the desired SAN to the list (for example,
default-cs,127.0.0.1,172.17.0.1,example.com) and run:
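A sketch of the update, which writes the extended list back to the same node
label; the node ID is a placeholder:

  docker node update --label-add com.docker.ucp.SANs=default-cs,127.0.0.1,172.17.0.1,example.com <node-id>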
Prometheus is an open-source systems monitoring and alerting toolkit to which
you can configure MKE as a target.
Prometheus runs in MKE as a Kubernetes DaemonSet that, by default, is scheduled
on every manager node. A key benefit of this is that you can set the
DaemonSet to not schedule on any nodes, which effectively disables Prometheus
if you do not use the MKE web interface.
Along with events and logs, metrics are data sources that provide a view into
your cluster, presenting numerical data values that have a time-series
component. There are several sources from which you can derive metrics, each
providing different meanings for a business and its applications.
As the metrics data is stored locally on disk for each Prometheus server, it
does not replicate on new managers or if you schedule Prometheus to run
on a new node. The metrics are kept no longer than 24 hours.
MKE provides a base set of metrics that gets you into production without having
to rely on external or third-party tools. Mirantis strongly encourages, though,
the use of additional monitoring to provide more comprehensive visibility into
your specific MKE environment.
Business
High-level aggregate metrics that typically combine technical,
financial, and organizational data to create IT infrastructure
information for business leaders. Examples of business metrics
include:
Company or division-level application downtime
Aggregation resource utilization
Application resource demand growth
Application
Metrics on APM tools domains (such as AppDynamics and
DynaTrace) that supply information on the state or performance of the
application itself.
Service state
Container platform
Host infrastructure
Service
Metrics on the state of services that are running on the container
platform. Such metrics have very low cardinality, meaning the
values are typically from a small fixed set of possibilities (commonly
binary).
Application health
Convergence of Kubernetes deployments and Swarm services
Cluster load by number of services or containers or pods
Note
Web UI disk usage (including free space) reflects only the MKE
managed portion of the file system: /var/lib/docker. To monitor
the total space available on each filesystem of an MKE worker or
manager, deploy a third-party monitoring solution to oversee
the operating system.
Indicates whether the container is healthy, according to its healthcheck.
The 0 value indicates that the container is not reporting as
healthy, which is likely because it either does not have a healthcheck
defined or because healthcheck results have not yet been returned.
Total disk space on the Docker root directory on this node in bytes.
Note that the ucp_engine_disk_free_bytes metric is not available
for Windows nodes
In addition to the core metrics that MKE
exposes, you can use Prometheus to scrape a variety of metrics associated
with MKE middleware components.
Herein, Mirantis outlines the components that expose Prometheus metrics, as
well as offering detail on various key metrics. You should note, however, that
this information is not exhaustive, but is rather a guideline to metrics that
you may find especially useful in determining the overall health of your MKE
deployment.
For specific key metrics, refer to the Usage information, which
offers valuable insights on interpreting the data and using it to troubleshoot
your MKE deployment.
MKE deploys Kube State Metrics to expose metrics on
the state of Kubernetes objects, such as Deployments, nodes, and Pods. These
metrics are exposed in MKE on the ucp-kube-state-metrics service and can
be scraped at ucp-kube-state-metrics.kube-system.svc.cluster.local:8080.
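For example, a Prometheus server running inside the cluster could scrape this
endpoint with a static configuration along the following lines; the job name
is arbitrary:

  scrape_configs:
    - job_name: ucp-kube-state-metrics
      static_configs:
        - targets:
            - ucp-kube-state-metrics.kube-system.svc.cluster.local:8080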
You can use workqueue metrics to learn how long it takes for various
components to fulfill different actions and to check the level of work queue
activity.
The metrics offered below are based on kube-controller-manager,
however the same metrics are available for other Kubernetes components.
Usage
Abnormal workqueue metrics can be symptomatic of issues in the specific
component. For example, an increase in workqueue_depth for the
Kubernetes Controller Manager can indicate that the component is being
oversaturated. In such cases, review the logs of the affected component.
Relates to the size of the workqueue. The larger the workqueue, the more
material there is to process. A growing trend in the size of the
workqueue can be indicative of issues in the cluster.
The kubelet agent runs on every node in an MKE cluster. Once you have set up
the MKE client bundle you can view the available kubelet metrics for each node
in an MKE cluster using the commands detailed below:
Obtain the name of the first available node in your MKE cluster:
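A sketch of the commands; the jsonpath expression simply selects the first
node, and the kubelet metrics are then read through the API server proxy:

  NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
  kubectl get --raw "/api/v1/nodes/${NODE}/proxy/metrics"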
Indicates the total number of running Pods, which you can use to verify
whether the number of Pods is in the expected range for your cluster.
Usage
If the number of Pods is unexpected on a node, review your Node Affinity
or Node Selector rules to verify the scheduling of Pods for the
appropriate nodes.
Indicates the number of containers per node. You can query for a
specific container state (running, created, exited). A high
number of exited containers on a node can indicate issues on that node.
If the number of containers is unexpected on a node, check your
Node Affinity or Node Selector rules to verify the scheduling of Pods
for the appropriate nodes.
Kube Proxy runs on each node in an MKE cluster. Once you have set up the MKE
client bundle, you can view the available Kube Proxy metrics for each node in
an MKE cluster using the commands detailed below:
Note
The Kube Proxy metrics are only available when Kube Proxy is enabled in the
MKE configuration and is running in either ipvs or iptables mode.
Obtain the name of the first available node in your MKE cluster:
Reflects the latency of client requests, in seconds. Such information
can be useful in determining whether your cluster is experiencing
performance degradation.
Example query
The following query illustrates the latency for all POST requests.
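A sketch of such a query, assuming the client latency histogram is exposed
under the standard client-go name rest_client_request_duration_seconds:

  histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{verb="POST"}[5m])) by (le))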
Displays the latency in seconds between Kube Proxy network rules, which
are consistently synchronized between nodes. If the measurement is
increasing consistently it can result in Kube Proxy being out of sync
across the nodes.
Kube Controller Manager is a collection of different Kubernetes controllers
whose primary task is to monitor changes in the state of various Kubernetes
objects. It runs on all manager nodes in an MKE cluster.
Key Kube Controller Manager metrics are detailed as follows:
Reflects the latency of calls to the API server, in seconds. Such
information can be useful in determining whether your cluster is
experiencing slower cluster performance.
Example query
The following query displays the 99th percentile latencies on requests
to the API server.
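A sketch, again assuming the rest_client_request_duration_seconds histogram:

  histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket[5m])) by (le, verb))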
Presents the total number of HTTP requests to Kube Controller Manager,
segmented by HTTP response code. A sudden increase in requests or an
increase in requests with error response codes can indicate issues with
the cluster.
Example query
The following query displays the rate of successful HTTP requests (those
offering 2xx response codes).
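A sketch, assuming the request counter is exposed as rest_client_requests_total
with a code label:

  sum(rate(rest_client_requests_total{code=~"2.."}[5m]))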
The Kube API server is the core of the Kubernetes control plane. It provides a
means for obtaining information on Kubernetes objects and is also used to
modify the state of API objects. MKE runs an instance of the Kube API server on
each manager node.
The following are a number of key Kube Apiserver metrics:
MKE deploys RethinkDB Exporter on all manager nodes, to
allow metrics scraping from RethinkDB. The RethinkDB Exporter exports most of
the statistics from the RethinkDB stats table.
You can monitor the read and write throughput for each RethinkDB replica by
reviewing the following metrics:
Current number of document reads and writes per second from the server.
These metrics are organized into read/write categories and by replica. For
example, to view all the table read metrics on a specific node you can run the
following query:
MKE deploys NodeLocalDNS on every node, with the Prometheus plugin enabled. You
can scrape NodeLocalDNS metrics on port 9253, which provides regular
CoreDNS metrics that include the standard RED (Rate, Errors, Duration) metrics:
queries
durations
error counts
The metrics path is fixed to /metrics.
Metric
Description
coredns_build_info
Information to build CoreDNS.
coredns_cache_entries
Number of entries in the cache.
coredns_cache_size
Cache size.
coredns_cache_hits_total
Counter of cache hits by cache type.
coredns_cache_misses_total
Counter of cache misses.
coredns_cache_requests_total
Total number of DNS resolution requests in different dimensions.
coredns_dns_request_duration_seconds_bucket
Histogram of DNS request duration (bucket).
coredns_dns_request_duration_seconds_count
Histogram of DNS request duration (count).
coredns_dns_request_duration_seconds_sum
Histogram of DNS request duration (sum).
coredns_dns_request_size_bytes_bucket
Histogram of the size of DNS request (bucket).
coredns_dns_request_size_bytes_count
Histogram of the size of DNS request (count).
coredns_dns_request_size_bytes_sum
Histogram of the size of DNS request (sum).
coredns_dns_requests_total
Number of DNS requests.
coredns_dns_response_size_bytes_bucket
Histogram of the size of DNS response (bucket).
coredns_dns_response_size_bytes_count
Histogram of the size of DNS response (count).
coredns_dns_response_size_bytes_sum
Histogram of the size of DNS response (sum).
coredns_dns_responses_total
DNS response codes and number of DNS response codes.
coredns_forward_conn_cache_hits_total
Number of cache hits for each protocol and data flow.
coredns_forward_conn_cache_misses_total
Number of cache misses for each protocol and data flow.
coredns_forward_healthcheck_broken_total
Unhealthy upstream count.
coredns_forward_healthcheck_failures_total
Count of failed health checks per upstream.
coredns_forward_max_concurrent_rejects_total
Number of requests rejected due to excessive concurrent requests.
MKE deploys Prometheus by default on the manager nodes to provide a built-in
metrics backend. For cluster sizes over 100 nodes, or if you need to scrape
metrics from Prometheus instances, Mirantis recommends that you deploy
Prometheus on dedicated worker nodes in the cluster.
To deploy Prometheus on worker nodes:
Source an admin bundle.
Verify that ucp-metrics pods are running on all managers:
If you use SELinux, label your ucp-node-certs
directories properly on the worker nodes before you move the
ucp-metrics workload to them. To run ucp-metrics on a worker
node, update the ucp-node-certs label by running:
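A sketch of the relabeling command, assuming the default Docker volume path for
ucp-node-certs:

  sudo chcon -R system_u:object_r:container_file_t:s0 /var/lib/docker/volumes/ucp-node-certs/_data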
Patch the ucp-metrics DaemonSet’s nodeSelector with the same key and
value in use for the node label. This example shows the key
ucp-metrics and the value "".
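A sketch of the label and patch commands; the worker node name is a
placeholder:

  kubectl label node <worker-node-name> ucp-metrics=""
  kubectl -n kube-system patch daemonset ucp-metrics -p '{"spec":{"template":{"spec":{"nodeSelector":{"ucp-metrics":""}}}}}'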
Create a Prometheus deployment and ClusterIP service using YAML.
Note
On bare metal clusters, enable MetalLB so that
you can create a service of the load balancer type, and then perform the
following steps:
Replace ClusterIP with LoadBalancer in the service YAML.
Access the service through the load balancer.
If you run Prometheus external to MKE, change the domain for the
inventory container in the Prometheus deployment from
ucp-controller.kube-system.svc.cluster.local to an external
domain, to access MKE from the Prometheus node.
Forward port 9090 on the local host to the ClusterIP. The tunnel
you create does not need to be kept alive as its only purpose is to expose
the Prometheus UI.
ssh -L 9090:10.96.254.107:9090 ANY_NODE
Visit http://127.0.0.1:9090 to explore the MKE metrics that Prometheus
is collecting.
The information offered herein on how to set up a Grafana instance connected
to MKE Prometheus is derived from the official Deploy Grafana on
Kubernetes documentation
and modified to work with MKE. As it deploys Grafana with default
credentials, Mirantis strongly recommends that you adjust the configuration
detail to meet your specific needs prior to deploying Grafana with MKE in a
production environment.
You can now navigate to the Grafana UI, which has the MKE Prometheus data
source installed at http://localhost:3000/. Log in initially using admin
for both the user name and password, taking care to change your credentials
after successful log in.
MKE uses native Kubernetes RBAC, which is active by default for Kubernetes
clusters. The YAML files of many ecosystem applications and integrations use
Kubernetes RBAC to access service accounts. Also, organizations looking to run
MKE both on-premises and in hosted cloud services want to run Kubernetes
applications in both environments without having to manually change RBAC in
their YAML file.
Note
Kubernetes and Swarm roles have separate views. Using the MKE web UI, you
can view all the roles for a particular cluster:
Click Access Control in the navigation menu at the left.
Click Roles.
Select the Kubernetes tab or the Swarm tab to view the specific roles for each.
You can create Kubernetes roles either through the CLI, using the Kubernetes
kubectl tool, or through the MKE web UI.
To create a Kubernetes role using the MKE web UI:
Log in to the MKE web UI.
In the navigation menu at the left, click Access Control to
display the available options.
Click Roles.
At the top of the details pane, click the Kubernetes tab.
Click Create to open the Create Kubernetes
Object page.
Click Namespace to select a namespace for the role from one of
the available options.
Provide the YAML file for the role. To do this, either enter it in the
Object YAML editor, or upload an existing .yml file using the
Click to upload a .yml file selection link at the right.
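For illustration, a minimal Role that grants read-only access to Pods in the
default namespace; the role name is arbitrary:

  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: pod-reader
    namespace: default
  rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "list", "watch"]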
To create a grant for a Kubernetes role in the MKE web UI:
Log in to the MKE web UI.
In the navigation menu at the left, click Access Control to
display the available options.
Click the Grants option.
At the top of the details pane, click the Kubernetes tab. All
existing grants to Kubernetes roles are present in the details pane.
Click Create Role Binding to open the Create Role
Binding page.
Select the subject type at the top of the 1. Subject section
(Users, Organizations, or Service
Account).
Create a role binding for the selected subject type:
Users: Select a type from the User drop-down list.
Organizations: Select a type from the
Organization drop-down list. Optionally, you can also select
a team using the Team(optional) drop-down list, if any have
been established.
Service Account: Select a NAMESPACE from the
Namespace drop-down list, then a type from the
Service Account drop-down list.
Click Next to activate the 2. Resource Set section.
Select a resource set for the subject.
By default, the default namespace is indicated. To use a
different namespace, select the Select Namespace button
associated with the desired namespace.
For ClusterRoleBinding, slide the Apply Role Binding to
all namespace (Cluster Role Binding) selector to the right.
Click Next to activate the 3. Role section.
Select the role type.
Role
Cluster Role
Note
Cluster Role type is the only role type available if you enabled Apply Role Binding to all namespace (Cluster
Role Binding) in the 2. Resource Set section.
Audit logs are a chronological record of security-relevant activities by
individual users, administrators, or software components that have had an
effect on an MKE system. They focus on external user/agent actions and
security, rather than attempting to understand state or events of the system
itself.
Audit logs capture all HTTP actions (GET, PUT, POST, PATCH, DELETE) to all MKE
API, Swarm API, and Kubernetes API endpoints (with the exception of the ignored
list) that are invoked and sent to Mirantis Container Runtime via stdout.
The benefits that audit logs provide include:
Historical troubleshooting
You can use audit logs to determine a sequence of past events that can help
explain why an issue occurred.
Security analysis and auditing
A full record of all user interactions with the container infrastructure
can provide your security team with the visibility necessary to root out
questionable or unauthorized access attempts.
Chargeback
Use audit log data about resources to generate chargeback information.
Alerting
With a watch on an event stream or a notification the event creates, you can
build alerting features on top of event tools that generate alerts for ops
teams (PagerDuty, OpsGenie, Slack, or custom solutions).
Enabling auditing in MKE does not automatically enable auditing of
Kubernetes objects. To do this, you must set the
kube_api_server_auditing parameter in the MKE configuration file to
true.
Once you have set the kube_api_server_auditing parameter to true,
the following default auditing values are configured on the Kubernetes API
server:
--audit-log-maxage: 30
--audit-log-maxbackup: 10
--audit-log-maxsize: 10
For information on how to enable and configure the Kubernetes API server
audit values, refer to cluster_config table detail in the MKE
configuration file.
You can enable MKE audit logging using the MKE web user interface, the MKE API,
and the MKE configuration file.
The level setting supports the following variables:
""
"metadata"
"request"
Caution
The support_dump_include_audit_logs flag specifies whether user
identification information from the ucp-controller container logs is
included in the support bundle. To prevent this information from being sent
with the support bundle, verify that support_dump_include_audit_logs
is set to false. When disabled, the support bundle collection tool
filters out any lines from the ucp-controller container logs that
contain the substring auditID.
With regard to audit logging, for reasons of system security, a number of MKE
API endpoints are either ignored or have their information redacted.
You can set MKE to automatically record and transmit data to Mirantis through
an encrypted channel for monitoring and analysis purposes. The data collected
provides the Mirantis Customer Success Organization with information that helps
us to better understand the operational use of MKE by our customers. It also
provides key feedback in the form of product usage statistics, which enable our
product teams to enhance Mirantis products and services.
Specifically, with MKE you can send hourly usage reports, as well as
information on API and UI usage.
Caution
To send the telemetry, verify that dockerd and the MKE application container
can resolve api.segment.io and create a TCP (HTTPS) connection on port
443.
To enable telemetry in MKE:
Log in to the MKE web UI as an administrator.
At the top of the navigation menu at the left, click the user name
drop-down to display the available options.
Click Admin Settings to display the available
options.
Click Usage to open the Usage Reporting screen.
Toggle the Enable API and UI tracking slider to the right.
(Optional) Enter a unique label to identify the cluster in the usage
reporting.
Security Assertion Markup Language (SAML) is an open standard for the exchange
of authentication and authorization data between parties. It is commonly
supported by enterprise authentication systems. SAML-based single sign-on (SSO)
gives you access to MKE through a SAML 2.0-compliant identity provider.
MKE supports the Okta and ADFS
identity providers.
To integrate SAML authentication into MKE:
Configure the Identity Provider (IdP).
In the left-side navigation panel, navigate to
user name > Admin Settings > Authentication & Authorization.
Create (Edit) Teams to link with the Group memberships. This updates
team membership information when a user signs in with SAML.
Identity providers require certain values to successfully integrate
with MKE. As these values vary depending on the identity provider,
consult your identity provider documentation for instructions on
how to best provide the needed information.
Assertion consumer service (ACS) URL
URL for MKE, qualified with /enzi/v0/saml/acs. For example,
https://111.111.111.111/enzi/v0/saml/acs.
Service provider audience URI
URL for MKE, qualified with /enzi/v0/saml/metadata. For example,
https://111.111.111.111/enzi/v0/saml/metadata.
NameID format
Select Unspecified.
Application user name
Email. For example, a custom ${f:substringBefore(user.email,"@")}
specifies the user name portion of the email address.
Attribute Statements
Name: fullname
Value: user.displayName
Group Attribute Statement
Name: member-of
Filter: (user defined) for associate group membership.
The group name is returned with the assertion.
Name: is-admin
Filter: (user defined) for identifying whether the user is an admin.
Okta configuration
When two or more group names are expected to return with the assertion,
use the regex filter. For example, use the value apple|orange to
return groups apple and orange.
The service provider metadata URI value is the URL for
MKE, qualified with /enzi/v0/saml/metadata. For example,
https://111.111.111.111/enzi/v0/saml/metadata.
SAML configuration requires that you know the metadata URL for your chosen
identity provider, as well as the URL for the MKE host that contains the IP
address or domain of your MKE installation.
To configure SAML integration on MKE:
Log in to the MKE web UI.
In the navigation menu at the left, click the user name
drop-down to display the available options.
Click Admin Settings to display the available
options.
Click Authentication & Authorization.
In the Identity Provider section in the details pane, move the
slider next to SAML to enable the SAML settings.
In the SAML idP Server subsection, enter the URL for the
identity provider metadata in the IdP Metadata URL field.
Note
If the metadata URL is publicly certified, you can continue with the
default settings:
Skip TLS Verification unchecked
Root Certificates Bundle blank
Mirantis recommends TLS verification in production environments. If the
metadata URL cannot be certified by the default certificate authority
store, you must provide the certificates from the identity provider in
the Root Certificates Bundle field.
In the SAML Service Provider subsection, in the MKE
Host field, enter the URL that includes the IP address or
domain of your MKE installation.
The port number is optional. The current IP address or domain displays by
default.
(Optional) Customize the text of the sign-in button by entering the text for
the button in the Customize Sign In Button Text field. By
default, the button text is Sign in with SAML.
Copy the SERVICE PROVIDER METADATA URL, the
ASSERTION CONSUMER SERVICE (ACS) URL, and the SINGLE
LOGOUT (SLO) URL to paste into the identity provider workflow.
Click Save.
Note
To configure a service provider, enter the Identity Provider’s metadata
URL to obtain its metadata. To access the URL, you may need to provide the
CA certificate that can verify the remote server.
To link group membership with users, use the Edit or
Create team dialog to associate a SAML group assertion with the
MKE team, so that user team membership is synchronized when the user logs in.
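For reference, SAML settings can also be managed through the MKE configuration file. The following is a minimal sketch only; the [auth.saml] table and the key names shown are assumptions based on common MKE configuration file conventions, so verify them against the example-config output for your MKE version before applying.
[auth.saml]
  # Key names below are assumptions; confirm against your example-config output.
  enabled = true
  idpMetadataURL = "https://idp.example.com/app/metadata"
  spHost = "https://111.111.111.111"
  tlsSkipVerify = false
  rootCerts = ""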
From the MKE web UI you can download a client bundle with which you can access
MKE using the CLI and the API.
A client bundle is a group of certificates that enable command-line access and
API access to the software. It lets you authorize a remote Docker engine to
access specific user accounts that are managed in MKE, absorbing all associated
RBAC controls in the process. Once you obtain the client bundle, you can
execute Docker Swarm commands from your remote machine to take effect on the
remote cluster.
Previously-authorized client bundle users can still access MKE, regardless
of the newly configured SAML access controls.
Mirantis recommends that you take the following steps to ensure that access
from the client bundle is in sync with the identity provider, and thus to
prevent any previously-authorized users from accessing MKE through their
existing client bundles:
Remove the user account from MKE that grants the client bundle
access.
If group membership in the identity provider changes, replicate
the change in MKE.
Continue using LDAP to sync group membership.
To download the client bundle:
Log in to the MKE web UI.
In the navigation menu at the left, click the user name
drop-down to display the available options.
Click your account name to display the available options.
Click My Profile.
Click the New Client Bundle drop-down in the details pane and
select Generate Client Bundle.
(Optional) Enter a name for the bundle into the Label field.
You can enhance the security and flexibility of MKE by implementing a SAML
proxy. With such a proxy, you can lock down your MKE deployment and still
benefit from the use of SAML authentication. The proxy, which sits between MKE
and Identity Providers (IdPs), forwards metadata requests between these two
entities, using designated ports during the configuration process.
To set up a SAML proxy in MKE:
Use the MKE web UI to add a proxy service.
Kubernetes
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
Kubernetes > Pods and click the Create
button to call the Create Kubernetes Object pane.
In the Namespace dropdown, select default.
In the Object YAML editor, paste the following
Deployment object YAML:
Be aware that the log entry can take up to five minutes to register.
Configure the SAML proxy.
MKE web UI
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user-name> > Admin Settings > Authentication &
Authorization to display the Authentication & Authorization pane.
Toggle the SAML control to enable SAML and expand the
SAML settings.
Enable the SAML Proxy setting to reveal the
Proxy URL, Proxy Username, and
Proxy Password fields.
Insert the pertinent field information and click Save.
CLI
Note
If upgrading from a previous version of MKE, you will need to add the
[auth.samlProxy] section to the MKE configuration file.
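If you configure the proxy through the configuration file, the [auth.samlProxy] section might look like the following sketch. The key names are assumptions that mirror the Proxy URL, Proxy Username, and Proxy Password web UI fields; verify them against the example-config output for your MKE version.
[auth.samlProxy]
  proxyURL = "https://saml-proxy.example.com:8443"   # assumed key name
  proxyUsername = "proxy-user"                       # assumed key name
  proxyPassword = "proxy-password"                   # assumed key name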
System for Cross-domain Identity Management (SCIM) is a standard for automating
the exchange of user identity information between identity domains or IT
systems. It offers an LDAP alternative for provisioning and managing users
and groups in MKE, as well as for syncing users and groups with an upstream
identity provider. Using SCIM schema and API, you can utilize Single
sign-on services (SSO) across various tools.
Mirantis certifies the use of Okta 3.2.0, however MKE offers the discovery
endpoints necessary to provide any system or application with the product
SCIM configuration.
In the SCIM configuration subsection, either enter the API
token in the API Token field or click Generate
to have MKE generate a UUID.
The base URL for all SCIM API calls is
https://<HostIP>/enzi/v0/scim/v2/. All SCIM methods are exposed as
API endpoints under this base URL.
Bearer Auth is the API authentication method. When configured, you access SCIM
API endpoints through the Bearer <token> HTTP Authorization request header.
Note
SCIM API endpoints are not accessible by any other user (or their
token), including the MKE administrator and MKE admin Bearer token.
The only authentication method that MKE supports for SCIM is an HTTP
Authorization request header that contains a Bearer token.
Returns a list of SCIM users (by default, 200 users per page).
Use the startIndex and count query parameters to paginate long lists of
users. For example, to retrieve the first 20 users, set startIndex to 1
and count to 20, as in the following request:
GET /enzi/v0/scim/v2/Users?startIndex=1&count=20
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
The response to the previous query returns paging metadata that is similar to
the following example:
Reactivate inactive users by specifying "active":true. To deactivate
active users, specify "active":false. The value of the {id} should be
the user’s ID.
All attribute values are overwritten, including attributes for which empty
values or no values have been provided. If a previously set attribute value is
left blank during a PUT operation, the value is updated with a blank value
in accordance with the attribute data type and storage provider. The value of
the {id} should be the user’s ID.
Updates an existing group resource, allowing the addition or removal of
individual (or groups of) users from the group with a single operation. Add
is the default operation.
To remove members from a group, set the operation attribute of a member object
to delete.
Updates an existing group resource, overwriting all values for a group
even if an attribute is empty or is not provided.
PUT replaces all members of a group with members that are provided by way
of the members attribute. If a previously set attribute is left blank
during a PUT operation, the new value is set to blank in accordance with
the data type of the attribute and the storage provider.
Returns a JSON structure that describes the SCIM specification features
that are available on a service provider using a schemas attribute of
urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig.
MKE integrates with LDAP directory services, thus allowing you to
manage users and groups from your organization directory and to
automatically propagate the information to MKE and MSR.
Once you enable LDAP, MKE uses a remote directory server to create users
automatically, and all logins are forwarded thereafter to the directory server.
When you switch from built-in authentication to LDAP authentication, all
manually created users whose usernames fail to match any LDAP search
results remain available.
When you enable LDAP authentication, you configure MKE to create
user accounts only when users log in for the first time.
To control the integration of MKE with LDAP, you create user searches. For
these user searches, you use the MKE web UI to specify multiple search
configurations and specify multiple LDAP servers
with which to integrate. Searches start with the BaseDN, the Distinguished
Name of the node in the LDAP directory tree in which the search looks
for users.
MKE to LDAP synchronization workflow
The following occurs when MKE synchronizes with LDAP:
MKE creates a set of search results by iterating over each of the
user search configurations, in an order that you specify.
MKE chooses an LDAP server from the list of domain servers by considering the
BaseDN from the user search configuration and selecting the domain
server with the longest domain suffix match.
Note
If no domain server has a domain suffix that matches the BaseDN from
the search configuration, MKE uses the default domain server.
MKE creates a list of users from the search and creates MKE
accounts for each one.
Note
If you select the Just-In-Time User Provisioning option, user
accounts are created only when users first log in.
Example workflow:
Consider an example with three LDAP domain servers and three user search
configurations.
The example LDAP domain servers:
LDAP domain server name: URL
default: ldaps://ldap.example.com
dc=subsidiary1,dc=com: ldaps://ldap.subsidiary1.com
dc=subsidiary2,dc=subsidiary1,dc=com: ldaps://ldap.subsidiary2.com
The example user search configurations:
baseDN=ou=people,dc=subsidiary1,dc=com
For this search configuration, dc=subsidiary1,dc=com is the only
server with a domain that is a suffix, so MKE uses the server
ldaps://ldap.subsidiary1.com for the search request.
baseDN ending in dc=subsidiary2,dc=subsidiary1,dc=com
For this search configuration, two of the domain servers have a
domain that is a suffix of this BaseDN. As
dc=subsidiary2,dc=subsidiary1,dc=com is the longer of the two,
MKE uses the server ldaps://ldap.subsidiary2.com for the
search request.
baseDN=ou=eng,dc=example,dc=com
For this search configuration, no server with a domain specified is a
suffix of this BaseDN, so MKE uses the default server,
ldaps://ldap.example.com, for the search request.
Whenever user search results contain username collisions between the
domains, MKE uses only the first search result, and thus the ordering of the
user search configurations can be important. For example, if both the first and
third user search configurations result in a record with the username
jane.doe, the first takes precedence and the third is ignored. As
such, it is important to implement a username attribute that is unique for
your users across all domains. As a best practice, choose something that is
specific to the subsidiary, such as the email address for each user.
MKE saves a minimum amount of user data required to operate, including any
user name and full name attributes that you specify in the configuration, as
well as the Distinguished Name (DN) of each synced user. MKE does not store
any other data from the directory server.
Use the MKE web UI to configure MKE to create and authenticate users using
an LDAP directory.
To configure an LDAP server, perform the following steps:
To set up a new LDAP server, configure the settings in the LDAP
Server subsection:
Control
Description
LDAP Server URL
The URL for the LDAP server.
Reader DN
The DN of the LDAP account that is used to search entries in the LDAP
server. As a best practice, this should be an LDAP read-only user.
Reader Password
The password of the account used to search entries in the LDAP
server.
Skip TLS verification
Sets whether to verify the LDAP server certificate when
TLS is in use. If verification is skipped, the connection remains
encrypted but is vulnerable to man-in-the-middle attacks.
Use Start TLS
Sets whether to authenticate and encrypt the connection after it is
made to the LDAP server over TCP. The setting is ignored if the
LDAP Server URL field begins with ldaps://.
No Simple Pagination (RFC 2696)
Indicates that your LDAP server does not support pagination.
Just-In-Time User Provisioning
Sets whether to create user accounts only when users log in for the
first time. Mirantis recommends using the default true value.
Note
Available as of MKE 3.6.4, the
disableReferralChasing setting, which is currently only available
by way of the MKE API, allows you to disable the default
behavior that occurs when a referral URL is received as a result of an
LDAP search request. Refer to LDAP Configuration through API for
more information.
In the LDAP Additional Domains subsection, click Add
LDAP Domain +. A set of input tools for configuring the additional domain
displays.
Configure the settings for the new LDAP domain:
Control
Description
LDAP Domain
Text field in which to enter the root domain component of this
server. A longest-suffix match of the BaseDN for LDAP searches
is used to select which LDAP server to use for search requests. If no
matching domain is found, the default LDAP server configuration is
put to use.
LDAP Server URL
Text field in which to enter the URL for the LDAP server.
Reader DN
Text field in which to enter the DN of the LDAP account that is used
to search entries in the LDAP server. As a best practice, this should
be an LDAP read-only user.
Reader Password
The password of the account used to search entries in the LDAP
server.
Skip TLS verification
Sets whether to verify the LDAP server certificate when
TLS is in use. If verification is skipped, the connection remains
encrypted but is vulnerable to man-in-the-middle attacks.
Use Start TLS
Sets whether to authenticate and encrypt the connection after it is
made to the LDAP server over TCP. The setting is ignored if the
LDAP Server URL field begins with ldaps://.
No Simple Pagination (RFC 2696)
Select if your LDAP server does not support pagination.
Note
Available as of MKE 3.6.4, the
disableReferralChasing setting, which is currently only available
by way of the MKE API, allows you to disable the default
behavior that occurs when a referral URL is received as a result of an
LDAP search request. Refer to LDAP Configuration through API for
more information.
Click Confirm to add the new LDAP domain.
Repeat the procedure to add any additional LDAP domains.
To add LDAP user search configurations to your LDAP integration:
In the LDAP User Search Configurations subsection, click
Add LDAP User Search Configuration +. A set of input tools for
configuring the LDAP user search configurations displays.
Field
Description
Base DN
Text field in which to enter the DN of the node in the directory tree,
where the search should begin seeking out users.
Username Attribute
Text field in which to enter the LDAP attribute that serves as
username on MKE. Only user entries with a valid username will be
created.
A valid username must not be longer than 100 characters and must not
contain any unprintable characters, whitespace characters, or any of
the following characters: /\[]:;|=,+*?<>'".
Full Name Attribute
Text field in which to enter the LDAP attribute that serves as the
user’s full name, for display purposes. If the field is left empty, MKE
does not create new users with a full name value.
Filter
Text field in which to enter an LDAP search filter to use to find
users. If the field is left empty, all directory entries in the
search scope with valid username attributes are created as users.
Search subtree instead of just one level
Whether to perform the LDAP search on a single level of the LDAP tree,
or search through the full LDAP tree starting at the BaseDN.
Match Group Members
Sets whether to filter users further, by selecting those who are also
members of a specific group on the directory server. The feature is
helpful when the LDAP server does not support memberOf search
filters.
Iterate through group members
Sets whether, when the Match Group Members option is
enabled to sync users, the sync is done by iterating over the target
group’s membership and making a separate LDAP query for each member,
rather than through the use of a broad user search filter. This
option can increase efficiency in situations where the number of
members of the target group is significantly smaller than the number
of users that would match the above search filter, or if your
directory server does not support simple pagination of search
results.
Group DN
Text field in which to enter the DN of the LDAP group from which to
select users, when the Match Group Members option is
enabled.
Group Member Attribute
Text field in which to enter the name of the LDAP group entry
attribute that corresponds to the DN of each of the group members.
Click Confirm to add the new LDAP user search configurations.
Repeat the procedure to add any additional user search configurations. More
than one such configuration can be useful in cases where users may be found
in multiple distinct subtrees of your organization directory. Any user entry
that matches at least one of the search configurations will be synced as a
user.
Prior to saving your configuration changes, you can use the dedicated
LDAP Test login tool to test the integration using the login
credentials of an LDAP user.
Input the credentials for the test user into the provided
Username and Password fields:
Field
Description
Username
An LDAP user name for testing authentication to MKE. The value
corresponds to the Username Attribute that is specified
in the Add LDAP user search configurations section.
Password
The password used to authenticate (BIND) to the directory
server.
Click Test. A search is made against the directory using the
provided search BaseDN, scope, and filter. Once the user entry is found
in the directory, a BIND request is made using the input user DN and the
given password value.
Following LDAP integration, MKE synchronizes users at the top of the hour,
based on an interval that is defined in hours.
To set LDAP synchronization, configure the following settings in the
LDAP Sync Configuration section:
Field
Description
Sync interval
The interval, in hours, to synchronize users between MKE and the LDAP
server. When the synchronization job runs, new users found in the LDAP
server are created in MKE with the default permission level. MKE users
that do not exist in the LDAP server become inactive.
Enable sync of admin users
This option specifies that system admins should be synced directly with
members of a group in your organization’s LDAP directory. The admins
will be synced to match the membership of the group. The configured
recovery admin user will also remain a system admin.
In addition to configuring MKE LDAP synchronization, you can also perform a hot
synchronization by clicking the Sync Now button in the
LDAP Sync Jobs subsection. Here you can also view the logs for each
sync job by clicking the View Logs link associated with a particular
job.
Whenever a user is removed from LDAP, the effect on their MKE account
is determined by the Just-In-Time User Provisioning setting:
false: Users deleted from LDAP become inactive in MKE following the next
LDAP synchronization run.
true: A user deleted from LDAP cannot authenticate. Their MKE accounts
remain active, however, and thus they can use their client bundles to run
commands. To prevent this, deactivate the user’s MKE user account.
MKE enables the syncing of teams within Organizations with LDAP, using either a
search query or by matching a group that is established in your LDAP directory.
Log in to the MKE web UI as an administrator.
Navigate to Access Control > Orgs & Teams to display the
Organizations that exist within your MKE instance.
Locate the name of the Organization that contains the MKE team that you want
to sync to LDAP and click it to display all of the MKE teams for that
Organization.
Hover your cursor over the MKE team that you want
to sync with LDAP to reveal its vertical ellipsis, at the far right.
Click the vertical ellipsis and select Edit to call the
Details screen for the team.
Toggle ENABLE SYNC TEAM MEMBERS to Yes to reveal
the LDAP sync controls.
Toggle LDAP MATCH METHOD to set the LDAP match method you want
to use to make the sync, Match Search Results (default) or
Match Group Members.
For Match Search Results:
Enter a Base DN into the Search Base DN field, as it is
established in LDAP.
Enter a search filter based on one or more attributes into the
Search filter field.
Optional. Check Search subtree instead of just one level to
enable search down through any sub-groups that exist within the group
you entered into the Search Base DN field.
For Match Group Members:
Enter the group Distinguished Name (DN) into the Group DN
field.
Enter a member attribute into the Group Member field.
Toggle IMMEDIATELY SYNC TEAM MEMBERS as appropriate.
For identity providers that require a client redirect URI, use
https://<MKE_HOST>/login. For identity providers that do not permit the use
of an IP address for the host, use https://<mke-cluster-domain>/login.
The requested scopes for all identity providers are "openid email". Claims
are read solely from the ID token that your identity provider returns. MKE does
not use the UserInfo URL to obtain user information. The default username
claim is sub. To use a different username claim, you must specify that
value with the usernameClaim setting in the MKE configuration file.
The following example details the MKE configuration file settings for using an
external identity provider.
For the signInCriteria array, term is set to hosted domain
("hd") and value is set to the domain from which the user is
permitted to sign in.
For the adminRoleCriteria array, matchType is set to "contains",
in case any administrators are assigned to multiple roles that include
admin.
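A minimal sketch of such an example follows. The table paths shown are assumptions, as are the admin claim name and the example values; only the term, value, and matchType fields and the "hd" and "contains" settings come from the description above, so verify the actual structure against the example-config output for your MKE version.
# Table paths below are assumptions; verify them against your
# example-config output before use.
[[auth.external_identity_provider.signInCriteria]]
  term = "hd"                 # hosted domain
  value = "example.com"       # domain permitted to sign in
  matchType = "must"          # assumed match type

[[auth.external_identity_provider.adminRoleCriteria]]
  term = "roles"              # assumed claim name
  value = "admin"
  matchType = "contains"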
Using an external identity provider to sign in to the MKE web UI creates a
new user session, and thus users who sign in this way will not be signed out
when their ID token expires. Instead, the session lifetime is set using the
auth.sessions parameters in the MKE configuration
file.
Verify that LDAP and SAML teams are both enabled for syncing.
In the left-side navigation panel, navigate to
Access Control > Orgs & Teams
Select the required organization and then select the required team.
Click the gear icon in the upper right corner.
On the Details tab, select ENABLE SYNC TEAM MEMBERS.
Select ALLOW NON-LDAP MEMBERS.
To determine a user’s authentication protocol:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
Access Control > Users and select the target user.
If an LDAP DN attribute is present next to
Full Name and Admin, the user is managed by LDAP.
If, however, the LDAP DN attribute is not present, the user is
not managed by LDAP.
Unexpected behavior can result from having the same user name in both SAML and
LDAP.
If just-in-time (JIT) provisioning is enabled in LDAP, MKE only allows
login attempts from the identity provider through which the user first logs
in. MKE then blocks all login attempts from the second identity provider.
If JIT provisioning is disabled in LDAP, the LDAP synchronization, which occurs
at regular intervals, always overrides the ability of the SAML user account
to log in.
To allow overlapping user names:
At times a user may have the same name in both LDAP and SAML, and you may
want that user to be able to sign in using either protocol.
Define a custom SAML attribute with a name of dn and a
value that is equivalent to the user account distinguished name (DN) with
the LDAP provider. Refer to Define a custom SAML attribute
in the Okta documentation for more information.
Note
MKE considers such users to be LDAP users. As such, should their LDAP DN
change, the custom SAML attribute must be updated to match.
Log in to the MKE web UI.
From the left-side navigation panel, navigate to
<user name> > Admin Settings > Authentication & Authorization
and scroll down to the LDAP section.
Under SAML integration, select
Allow LDAP users to sign in using SAML.
You can configure MKE to allow users to deploy and run services in worker
nodes only, to ensure that all cluster management functionality remains
performant and to enhance cluster security.
Important
If a user deploys a malicious service that can affect
the node on which it is running, that service will not be able to affect any
other nodes in the cluster or have any impact on cluster management
functionality.
Restrict services deployment to Swarm worker nodes
To keep manager nodes performant, it is necessary at times to restrict
service deployment to Swarm worker nodes.
To restrict services deployment to Swarm worker nodes:
Log in to the MKE web UI with administrator credentials.
Click the user name at the top of the navigation menu.
Navigate to Admin Settings > Orchestration.
Under Container Scheduling, toggle all of the sliders to the
left to restrict the deployment only to worker nodes.
Note
Creating a grant with the Scheduler role against the / collection
takes precedence over any other grants with NodeSchedule on
subcollections.
Restrict services deployment to Kubernetes worker nodes
By default, MKE clusters use Kubernetes taints and tolerations
to prevent user workloads from deploying to MKE manager or MSR nodes.
Note
Workloads deployed by an administrator in the kube-system namespace do
not follow scheduling constraints. If an administrator deploys a
workload in the kube-system namespace, a toleration is applied to bypass
the taint, and the workload is scheduled on all node types.
Schedule services deployment on manager and MSR nodes
Log in to the MKE web UI with administrator credentials.
Click the user name at the top of the navigation menu.
Navigate to Admin Settings > Orchestration.
Select from the following options:
Under Container Scheduling, toggle to the right the slider
for Allow administrators to deploy containers on MKE managers
or nodes running MSR.
Under Container Scheduling, toggle to the right the slider
for Allow all authenticated users, including service accounts,
to schedule on all nodes, including MKE managers and MSR nodes..
Following any scheduling action, MKE applies a toleration to new workloads, to
allow the Pods to be scheduled on all node types. For existing workloads,
however, it is necessary to manually add the toleration to the Pod
specification.
Add a toleration to the Pod specification for existing workloads
Add the following toleration to the Pod specification, either through the
MKE web UI or using the kubectl edit <object> <workload> command:
A NoSchedule taint is present on MKE manager and MSR nodes, and if you
disable scheduling on managers and/or workers, a toleration for that taint
is not applied to the deployments. As a result, those deployments are not
scheduled on these nodes, except when the Kubernetes workload is deployed in
the kube-system namespace.
With MKE you can force applications to use only Docker images that are signed
by MKE users you trust. Every time a user attempts to deploy an application to
the cluster, MKE verifies that the application is using a trusted Docker
image. If a trusted Docker image is not in use, MKE halts the deployment.
By signing and verifying the Docker images, you ensure that the images in use
in your cluster are trusted and have not been altered, either in
the image registry or on their way from the image registry to your MKE cluster.
Example workflow
A developer makes changes to a service and pushes their changes to a
version control system.
A CI system creates a build, runs tests, and pushes an image to the Mirantis
Secure Registry (MSR) with the new changes.
The quality engineering team pulls the image, runs more tests, and signs
and pushes the image if the image is verified.
IT operations deploys the service, but only if the image in use is signed
by the QA team. Otherwise, MKE will not deploy.
To configure MKE to only allow running services that use Docker trusted
images:
Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display
the available options.
Click Admin Settings > Docker Content Trust to reveal the
Content Trust Settings page.
Enable Run only signed images.
Important
At this point, MKE allows the deployment of any signed image, regardless
of signee.
(Optional) Make it necessary for the image to be signed by a particular
team or group of teams:
Click Add Team+ to reveal the two-part tool.
From the drop-down at the left, select an organization.
From the drop-down at the right, select a team belonging to the
organization you selected.
Repeat the procedure to configure additional teams.
Note
If you specify multiple teams, the image must be signed by
a member of each team, or someone who is a member of all of the
teams.
Click Save.
MKE immediately begins enforcing the image trust policy. Existing services
continue to run and you can restart them as necessary. From this point,
however, MKE only allows the deployment of new services that use a
trusted image.
MKE enables the setting of various user sessions properties, such as
session timeout and the permitted number of concurrent sessions.
To configure MKE login session properties:
Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display
the available options.
Click Admin Settings > Authentication & Authorization to reveal
the MKE login session controls.
The following table offers information on the MKE login session controls:
Field
Description
Lifetime Minutes
The set duration of a login session in minutes, starting from the moment
MKE generates the session. MKE invalidates the active session once this
period expires and the user must re-authenticate to establish a
new session.
Default: 60
Minimum: 10
Renewal Threshold Minutes
The period of time, in minutes, before session expiration during which
MKE extends an active session. MKE extends the session by the amount
specified in Lifetime Minutes. The threshold value cannot be
greater than that set in Lifetime Minutes.
To specify that sessions not be extended, set the threshold value
to 0. Be aware, though, that this may cause MKE web
UI users to be unexpectedly logged out.
Default: 20
Maximum: 5 minutes less than Lifetime Minutes
Per User Limit
The maximum number of sessions a user can have running
simultaneously. If the creation of a new session results in the
exceeding of this limit, MKE will delete the session least recently put
to use. Specifically, every time you use a session token, the server
marks it with the current time (lastUsed metadata). When the creation of
a new session would exceed the per-user limit, the session
with the oldest lastUsed time is deleted, which is not necessarily
the oldest session.
To disable the Per User Limit setting, set the value to
0.
The MKE configuration file documentation is up-to-date for the latest MKE
release. As such, if you are running an earlier version of MKE, you may
encounter detail for configuration options and parameters that are not
applicable to the version of MKE you are currently running.
Refer to the MKE Release Notes for specific
version-by-version information on MKE configuration file additions and
changes.
You configure an MKE cluster by applying a TOML
file. You use this file, the MKE configuration file, to import and export MKE
configurations, both to create new MKE instances and to modify existing ones.
Refer to example-config in the MKE CLI reference documentation
to learn how to download an example MKE configuration file.
Put the MKE configuration file to work for the following use cases:
Set the configuration file to run at the install time of new MKE clusters
Use the API to import the file back into the same cluster
Use the API to import the file into multiple clusters
To make use of an MKE configuration file, you edit the file using either the
MKE web UI or the command line interface (CLI). Using the CLI, you can either
export the existing configuration file for editing, or use the
example-config command to view and edit an example TOML MKE
configuration file.
Working as an MKE admin, use the config-toml API from within the directory
of your client certificate bundle to export the current MKE settings to a TOML
file.
As detailed herein, the command set exports the current configuration for the
MKE hostname MKE_HOST to a file named mke-config.toml:
To customize a new MKE instance using a configuration file, you must create the
file prior to installation. Then, once the new configuration file is ready, you
can configure MKE to import it during the installation process using Docker
Swarm.
To import a configuration file at installation:
Create a Docker Swarm Config object named com.docker.mke.config whose
value is the TOML contents of your MKE configuration file.
When installing MKE on the cluster, specify the --existing-config flag
to force the installer to use the new Docker Swarm Config object for its
initial configuration.
Following the installation, delete the com.docker.mke.config object.
The length of time, in minutes, before session expiration during which,
if the session is used, MKE extends the session by the currently
configured lifetime. A value of 0 disables session extension.
Default: 20
per_user_limit
no
The maximum number of sessions that a user can have simultaneously
active. If creating a new session will put a user over this
limit, the least recently used session is deleted.
A value of 0 disables session limiting.
Default: 10
store_token_per_session
no
If set, the user token is stored in sessionStorage instead of
localStorage. Setting this option logs the user out and
requires that they log back in, as they are actively changing the manner
in which their authentication is stored.
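The corresponding MKE configuration file settings might look like the following sketch. The auth.sessions table is referenced earlier in this guide; the lifetime_minutes and renewal_threshold_minutes key names are assumptions that mirror the Lifetime Minutes and Renewal Threshold Minutes web UI controls, so verify them against the example-config output for your MKE version.
[auth.sessions]
  lifetime_minutes = 60            # assumed key name for the Lifetime Minutes control
  renewal_threshold_minutes = 20   # assumed key name for the Renewal Threshold Minutes control
  per_user_limit = 10
  store_token_per_session = false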
An array of claims that admin user ID tokens require for use with MKE. Creating
a new account using a token that satisfies the criteria determined by this
array automatically produces an administrator account.
Parameter
Required
Description
term
yes
Sets the name of the claim.
value
yes
Sets the value for the claim in the form of a string.
matchType
yes
Sets how the JWT claim is evaluated.
Valid values:
must - the JWT claim value must be the same as the configuration
value.
contains - the JWT claim value must contain the configuration
value.
The hardening_enabled option must be set to true to enable all
other hardening_configuration options.
Parameter
Required
Description
hardening_enabled
no
Parent option that when set to true enables security hardening
configuration options: limit_kernel_capabilities,
pid_limit, pid_limit_unspecified, and
use_strong_tls_ciphers.
Default: false
limit_kernel_capabilities
no
The option can only be enabled when hardening_enabled is set
to true.
Limits kernel capabilities to the minimum required by each container.
Components run using Docker default capabilities by default. When you
enable limit_kernel_capabilities all capabilities are
dropped, except those that are specifically in use by the component.
Several components run as privileged, with capabilities that cannot be
disabled.
Default: false
pid_limit
no
The option can only be enabled when hardening_enabled is set
to true.
Sets the maximum number of PIDs MKE can allow for their respective
orchestrators.
The pid_limit option must be set to the default 0 when it is
not in use.
Default: 0
pid_limit_unspecified
no
The option can only be enabled when hardening_enabled is set
to true.
When set to false, enables PID limiting, using the pid_limit
option value for the associated orchestrator.
Default: true
use_strong_tls_ciphers
no
The option can only be enabled when hardening_enabled is set
to true.
When set to true, in line with control 4.2.12 of the CIS Kubernetes
Benchmark 1.7.0, the use_strong_tls_ciphers parameter restricts the
allowed ciphers for the cipher_suites_for_kube_api_server,
cipher_suites_for_kubelet, and cipher_suites_for_etcd_server
parameters in the cluster_config table to a limited set of strong cipher
suites.
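The following is a minimal sketch of these options as they might appear in the MKE configuration file, assuming a top-level [hardening_configuration] table as the option names above suggest; verify the table name against the example-config output for your MKE version.
[hardening_configuration]
  hardening_enabled = true
  limit_kernel_capabilities = true
  pid_limit = 1000               # example value; 0 means the option is not in use
  pid_limit_unspecified = false
  use_strong_tls_ciphers = true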
An array of tables that specifies the MSR instances that are managed by the
current MKE instance.
Parameter
Required
Description
host_address
yes
Sets the address for connecting to the MSR instance tied to the MKE
cluster.
service_id
yes
Sets the MSR instance’s OpenID Connect Client ID, as registered with the
Docker authentication provider.
ca_bundle
no
Specifies the root CA bundle for the MSR instance if you are using a
custom certificate authority (CA). The value is a string with the
contents of a ca.pem file.
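For illustration, an entry for a managed MSR instance might look like the following sketch. The [[registries]] table name is an assumption consistent with the array-of-tables description above, and all values are placeholders.
[[registries]]
  host_address = "msr.example.com"                 # address for connecting to the MSR instance
  service_id = "576cf666-example-client-id"        # placeholder OpenID Connect Client ID
  ca_bundle = ""                                   # contents of ca.pem when using a custom CA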
Specifies scheduling options and the default orchestrator for new nodes.
Note
If you run a kubectl command, such as kubectl describe
nodes, to view scheduling rules on Kubernetes nodes, the results do
not reflect the MKE admin settings configuration. MKE uses taints
to control container scheduling on nodes, which is unrelated to the
kubectl Unschedulable boolean flag.
Parameter
Required
Description
enable_admin_ucp_scheduling
no
Determines whether administrators can schedule containers on
manager nodes.
Valid values: true, false.
Default: false
You can also set the parameter using the MKE web UI:
Log in to the MKE web UI as an administrator.
Click the user name drop-down in the left-side navigation panel.
Click Admin Settings > Orchestration to view the
Orchestration screen.
Scroll down to the Container Scheduling section and
toggle on the Allow administrators to deploy containers on
MKE managers or nodes running MSR slider.
default_node_orchestrator
no
Sets the type of orchestrator to use for new nodes that join
the cluster.
require_content_trust
no
Set to require the signing of images by content trust.
Valid values: true, false.
Default: false
You can also set the parameter using the MKE web UI:
Log in to the MKE web UI as an administrator.
Click the user name drop-down in the left-side navigation panel.
Click Admin Settings > Docker Content Trust to open the
Content Trust Settings screen.
Toggle on the Run only signed images slider.
require_signature_from
no
A string array that specifies which users or teams must sign images.
allow_repos
no
A string array that specifies repos that are to bypass content trust
check, for example, ["docker.io/mirantis/dtr-rethink","docker.io/mirantis/dtr-registry"....].
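The following minimal sketch shows how these scheduling and content trust parameters might appear in the MKE configuration file. The [scheduling_configuration] and [trust_configuration] table names are assumptions, as are the example values; verify both against the example-config output for your MKE version.
# Table names below are assumptions; confirm them before applying.
[scheduling_configuration]
  enable_admin_ucp_scheduling = false
  default_node_orchestrator = "swarm"              # example value

[trust_configuration]
  require_content_trust = true
  require_signature_from = ["docker/security-team"]               # example team
  allow_repos = ["docker.io/mirantis/dtr-rethink", "docker.io/mirantis/dtr-registry"]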
Configures the logging options for MKE components.
Parameter
Required
Description
protocol
no
The protocol to use for remote logging.
Valid values: tcp, udp.
Default: tcp
host
no
Specifies a remote syslog server to receive sent MKE controller logs. If
omitted, controller logs are sent through the default Docker daemon
logging driver from the ucp-controller container.
Set to enable attempted automatic license renewal when the
license nears expiration. If disabled, you must manually upload a renewed
license after expiration.
Included when you need to set custom API headers. You can repeat this
section multiple times to specify multiple separate headers. If you
include custom headers, you must specify both name and value.
[[custom_api_server_headers]]
Item
Description
name
Set to specify the name of the custom header with name =
“X-Custom-Header-Name”.
value
Set to specify the value of the custom header with value = “Custom
Header Value”.
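For example, a single custom header defined with the name and value items described above would look like the following.
[[custom_api_server_headers]]
  name = "X-Custom-Header-Name"
  value = "Custom Header Value"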
Configures the cluster that the current MKE instance manages.
The dns, dns_opt, and dns_search settings configure the DNS
settings for MKE components. These values, when assigned, override the
settings in a container /etc/resolv.conf file.
Parameter
Required
Description
controller_port
yes
Sets the port that the ucp-controller monitors.
Default: 443
kube_apiserver_port
yes
Sets the port the Kubernetes API server monitors.
kube_protect_kernel_defaults
no
Protects kernel parameters from being overridden by kubelet.
Default: false.
Important
When enabled, kubelet can fail to start if the following kernel
parameters are not properly set on the nodes before you install MKE
or before adding a new node to an existing cluster:
swarm_port
yes
Sets the port that the ucp-swarm-manager monitors.
Default: 2376
swarm_strategy
no
Sets placement strategy for container scheduling. Be aware that this
does not affect swarm-mode services.
Valid values: spread, binpack, random.
dns
yes
Array of IP addresses that serve as nameservers.
dns_opt
yes
Array of options in use by DNS resolvers.
dns_search
yes
Array of domain names to search whenever a bare unqualified host name is
used inside of a container.
profiling_enabled
no
Determines whether specialized debugging endpoints are enabled for
profiling MKE performance.
Valid values: true, false.
Default: false
authz_cache_timeout
no
Sets the timeout in seconds for the RBAC information cache of MKE
non-Kubernetes resource listing APIs. Setting changes take immediate
effect and do not require a restart of the MKE controller.
Default: 0 (cache is not enabled)
Once you enable the cache, the result of non-Kubernetes resource listing
APIs only reflects the latest RBAC changes for the user when the
cached RBAC info times out.
kv_timeout
no
Sets the key-value store timeout setting, in milliseconds.
Default: 5000
kv_snapshot_count
Required
Sets the key-value store snapshot count.
Default: 20000
external_service_lb
no
Specifies an optional external load balancer for default links to
services with exposed ports in the MKE web interface.
cni_installer_url
no
Specifies the URL of a Kubernetes YAML file to use to install a
CNI plugin. Only applicable during initial installation. If left empty,
the default CNI plugin is put to use.
metrics_retention_time
no
Sets the metrics retention time.
metrics_scrape_interval
no
Sets the interval for how frequently managers gather metrics from nodes
in the cluster.
metrics_disk_usage_interval
no
Sets the interval for the gathering of storage metrics, an
operation that can become expensive when large volumes are present.
nvidia_device_plugin
no
Enables the nvidia-gpu-device-plugin, which is disabled by default.
rethinkdb_cache_size
no
Sets the size of the cache for MKE RethinkDB servers.
Default: 1GB
Leaving the field empty or specifying auto instructs
RethinkDB to automatically determine the cache size.
exclude_server_identity_headers
no
Determines whether the X-Server-Ip and X-Server-Name
headers are disabled.
Valid values: true, false.
Default: false
cloud_provider
no
Sets the cloud provider for the Kubernetes cluster.
pod_cidr
yes
Sets the subnet pool from which the CNI IPAM plugin allocates Pod IP
addresses.
Default: 192.168.0.0/16
ipip_mtu
no
Sets the IPIP MTU size for the Calico IPIP tunnel interface.
azure_ip_count
yes
Sets the IP count for Azure allocator to allocate IPs per Azure virtual
machine.
service_cluster_ip_range
yes
Sets the subnet pool from which the IP for Services should be allocated.
Default: 10.96.0.0/16
nodeport_range
yes
Sets the port range for Kubernetes services within which the type
NodePort can be exposed.
Default: 32768-35535
custom_kube_api_server_flags
no
Sets the configuration options for the Kubernetes API server.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
custom_kube_controller_manager_flags
no
Sets the configuration options for the Kubernetes controller manager.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
custom_kubelet_flags
no
Sets the configuration options for kubelet.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
custom_kubelet_flags_profiles
Available since MKE 3.7.10
no
Sets a profile that can be applied to the kubelet agent on any node.
custom_kube_scheduler_flags
no
Sets the configuration options for the Kubernetes scheduler.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
local_volume_collection_mapping
no
Set to store data about collections for volumes in the MKE local KV
store instead of on the volume labels. The parameter is used to enforce
access control on volumes.
manager_kube_reserved_resources
no
Reserves resources for MKE and Kubernetes components that are
running on manager nodes.
worker_kube_reserved_resources
no
Reserves resources for MKE and Kubernetes components that are
running on worker nodes.
kubelet_max_pods
yes
Sets the number of Pods that can run on a node.
Maximum: 250
Default: 110
kubelet_pods_per_core
no
Sets the maximum number of Pods per core.
0 indicates that there is no limit on the number of Pods per core.
The number cannot exceed the kubelet_max_pods setting.
Recommended: 10
Default: 0
secure_overlay
no
Enables IPSec network encryption in Kubernetes.
Valid values: true, false.
Default: false
image_scan_aggregation_enabled
no
Enables image scan result aggregation. The feature displays image
vulnerabilities in shared resource/containers and shared
resources/images pages.
Valid values: true, false.
Default: false
swarm_polling_disabled
no
Determines whether resource polling is disabled for both Swarm and
Kubernetes resources, which is recommended for production instances.
Valid values: true, false.
Default: false
oidc_client_id
no
Sets the OIDC client ID, using the eNZi service ID that is in the OIDC
authorization flow.
hide_swarm_ui
no
Determines whether the UI is hidden for all Swarm-only object types (has
no effect on Admin Settings).
Valid values: true, false.
Default: false
You can also set the parameter using the MKE web UI:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, click the user name
drop-down.
Click Admin Settings > Tuning to open the
Tuning screen.
Toggle on the Hide Swarm Navigation slider located under
the Configure MKE UI heading.
unmanaged_cni
yes
Determines whether MKE manages the CNI provider. By default, MKE deploys
and manages Calico, the default CNI provider; set this option to true to
install and manage your own CNI plugin.
calico_ebpf_enabled
yes
Enables Calico eBPF mode.
kube_default_drop_masq_bits
yes
Sets the use of Kubernetes default values for iptables drop and
masquerade bits.
kube_proxy_mode
yes
Sets the operational mode for kube-proxy.
Valid values: iptables, ipvs, disabled.
Default: iptables
cipher_suites_for_kube_api_server
no
Sets the value for the kube-apiserver --tls-cipher-suites
parameter.
cipher_suites_for_kubelet
no
Sets the value for the kubelet --tls-cipher-suites parameter.
cipher_suites_for_etcd_server
no
Sets the value for the etcd server --cipher-suites
parameter.
image_prune_schedule
no
Sets the cron expression used for the scheduling of image pruning. The
parameter accepts either full crontab specifications or descriptors, but
not both.
Full crontab specifications, which include <seconds> <minutes> <hours>
<day of month> <month> <day of week>. For example,
"0 0 0 * * *".
Descriptors, which are textual in nature, with a preceding @ symbol.
For example: "@midnight" or "@every 1h30m".
pubkey_auth_cache_enabled
no
Enables the public key authentication cache.
Implement pubkey_auth_cache_enabled only in cases in which there are
certain performance issues in high-load clusters, and only under the
guidance of Mirantis Support personnel.
Note
ucp-controller must be restarted for setting changes to take
effect.
Default: false
prometheus_memory_limit
no
The maximum amount of memory that can be used by the Prometheus
container.
Default: 2Gi.
prometheus_memory_request
no
The minimum amount of memory reserved for the Prometheus container.
Default: 1Gi.
shared_sans
no
Subject alternative names for manager nodes.
kube_manager_terminated_pod_gc_threshold
no
Allows users to set the threshold for the terminated Pod garbage
collector in Kube Controller Manager according to their cluster-specific
requirement.
Default: 12500
kube_api_server_request_timeout
no
Timeout for Kube API server requests.
Default: 1m
cadvisor_enabled
no
Enables the ucp-cadvisor component, which runs a standalone
cAdvisor instance on each node to provide additional container-level
metrics with all expected labels.
Default: false
calico_controller_probes_tuning
no
Enables the user to specify values for the Calico controller liveness
and readiness probes.
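As a brief illustration, a [cluster_config] table that sets a few of the parameters described above, using their documented defaults, might look like the following sketch.
[cluster_config]
  controller_port = 443
  pod_cidr = "192.168.0.0/16"
  service_cluster_ip_range = "10.96.0.0/16"
  nodeport_range = "32768-35535"
  kube_proxy_mode = "iptables"
  kubelet_max_pods = 110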
Set the configuration for the NGINX Ingress Controller to manage traffic that
originates outside of your cluster (ingress traffic).
Note
Prior versions of MKE use Istio Ingress to manage traffic that originates
from outside of the cluster, which employs many of the same parameters as
NGINX Ingress Controller.
Parameter
Required
Description
enabled
No
Enables HTTP ingress for Kubernetes.
Valid values: true, false.
Default: false
ingress_num_replicas
No
Sets the number of NGINX Ingress Controller deployment replicas.
Default: 2
ingress_external_ips
No
Sets the list of external IPs for Ingress service.
Default: [] (empty)
ingress_enable_lb
No
Enables an external load balancer.
Valid values: true, false.
Default: false
ingress_preserve_client_ip
No
Enables preserving inbound traffic source IP.
Valid values: true, false.
Default: false
ingress_exposed_ports
No
Sets ports to expose.
For each port, provide arrays that contain the following port
information (defaults as displayed):
name = http2
port = 80
target_port = 0
node_port = 33000
name = https
port = 443
target_port = 0
node_port = 33001
name = tcp
port = 31400
target_port = 0
node_port = 33002
ingress_node_affinity
No
Sets node affinity.
key = com.docker.ucp.manager
value = ""
target_port = 0
node_port = 0
ingress_node_toleration
No
Sets node toleration.
For each node, provide an array that contains the following information
(defaults as displayed):
key = com.docker.ucp.manager
value = ""
operator = Exists
effect = NoSchedule
config_map
No
Sets advanced options for the NGINX proxy.
NGINX Ingress Controller uses ConfigMap to configure the NGINX
proxy. For the complete list of available options, refer to the NGINX
Ingress Controller documentation ConfigMap: configuration options.
Examples:
map-hash-bucket-size="128"
ssl-protocols="SSLv2"
ingress_extra_args.http_port
No
Sets the container port for servicing HTTP traffic.
Default: 80
ingress_extra_args.https_port
No
Sets the container port for servicing HTTPS traffic.
Default: 443
ingress_extra_args.enable_ssl_passthrough
No
Enables SSL passthrough.
Default: false
ingress_extra_args.default_ssl_certificate
No
Sets the Secret that contains an SSL certificate to be used as a default
TLS certificate.
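A minimal sketch of these ingress options in the MKE configuration file follows. The [cluster_config.ingress_controller] table name is an assumption and may differ in your MKE version; the parameter names and defaults come from the table above.
# Table name is an assumption; verify it against your example-config output.
[cluster_config.ingress_controller]
  enabled = true
  ingress_num_replicas = 2
  ingress_external_ips = []
  ingress_enable_lb = false
  ingress_preserve_client_ip = false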
Enable and disable OPA Gatekeeper for policy enforcement.
Note
By design, when the OPA Gatekeeper is disabled using the configuration file,
the Pods are deleted but the policies are not cleaned up. Thus, when the OPA
Gatekeeper is re-enabled, the cluster can immediately adopt the existing
policies.
The retention of the policies poses no risk, as they are just data on the
API server and have no value outside of a OPA Gatekeeper deployment.
Parameter
Required
Description
enabled
No
Enables the Gatekeeper function.
Valid values: true, false.
Default: false.
excluded_namespaces
No
Excludes from the Gatekeeper admission webhook all of the resources that
are contained in a list of namespaces. Specify as a comma-separated
list.
Length of time during which lameduck will run, expressed with integers
and time suffixes, such as s for seconds and m for minutes.
Note
The configured value for duration must be greater than 0s.
Default values are applied for any fields that are left blank.
Default: 7s.
Caution
Editing the CoreDNS config map outside of MKE to configure the lameduck
function is not supported. Any such attempts will be superseded by the
values that are configured in the MKE configuration file.
Configures backup scheduling and notifications for MKE.
Parameter
Required
Description
notification-delay
yes
Sets the number of days that elapse before a user is notified that they
have not performed a recent backup. Set to -1 to disable
notifications.
Default: 7
enabled
yes
Enables backup scheduling.
Valid values: true, false.
Default: false
path
yes
Sets the storage path for scheduled backups. Use
chmod o+w /<path> to ensure that other users have write privileges.
no_passphrase
yes
Sets whether a passphrase is necessary to encrypt the TAR file. A value
of true negates the use of a passphrase. A non-empty value in the
passphrase parameter requires that no-passphrase be set to
false.
Default: false
passphrase
yes
Encrypts the TAR file with a passphrase for all scheduled backups. Must
remain empty if no_passphrase is set to true.
Do not share the configuration file if a passphrase is used, as the
passphrase displays in plain text.
cron_spec
yes
Sets the cron expression in use for scheduling backups. The parameter
accepts either full crontab specifications or descriptors, but not both.
Full crontab specifications include <seconds> <minutes> <hours>
<day of month> <month> <day of week>. For example:
"0 0 0 * * *".
Descriptors, which are textual in nature, have a preceding @
symbol. For example: "@midnight" or "@every 1h30m".
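A minimal sketch of a backup schedule in the MKE configuration file follows. The [backup_schedule_config] table name is an assumption; the parameter names are those documented above, so verify the table name against the example-config output before applying.
# Table name is an assumption; parameter names are documented above.
[backup_schedule_config]
  notification-delay = 7
  enabled = true
  path = "/var/backups/mke"     # example path; ensure other users have write access
  no_passphrase = true
  passphrase = ""
  cron_spec = "@midnight"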
Minimum Time To Live (TTL) for retaining certain events in etcd.
Default: 0
cron_expression
yes
Sets the cron expression to use for scheduling backups.
cron_expression accepts either full crontab specifications or
descriptors. It does not, however, accept both concurrently.
Full crontab specifications include <seconds> <minutes> <hours>
<day-of-month> <month> <day-of-week>.
For example, "0 0 0 * * MON".
Descriptors, which are textual in nature, have a preceding @
symbol. For example: “@weekly”, “@monthly” or “@every 72h”.
The etcd cleanup operation starts with the deletion of the events, which
is followed by the compacting of the etcd revisions. The cleanup
scheduling interval must be set for a minimum of 72 hours.
Enables defragmentation of the etcd cluster after successful cleanup.
Warning
The etcd cluster defragmentation process can cause temporary
performance degradation. To minimize possible impact, schedule
cron_expression to occur during off-peak hours or periods of low
activity.
Valid values: true, false.
Default: false
defrag_pause_seconds
no
Sets the period of time, in seconds, to pause between issuing defrag
commands to etcd members.
Default: 60
defrag_timeout_seconds
no
Sets the period of time, in seconds, that each etcd member is allotted
to complete defragmentation. If the defragmentation of a member times
out before the process is successfully completed, the entire cluster
defragmentation is aborted.
Configures use of Windows GMSA credential specifications.
Parameter
Required
Description
windows_gmsa
no
Allows creation of GMSA credential specifications for the Kubernetes
cluster, as well as automatic population of full credential
specifications for any Pod on which the GMSA credential specification is
referenced in the security context of that Pod.
For information on how to enable GMSA and how to obtain different
components of the GMSA specification for one or more GMSA accounts in
your domain, refer to the official Windows documentation.
For detail on how to use the MKE web UI to scale your cluster, refer to
Join Linux nodes or Join Windows worker nodes, depending on which
operating system you use. In particular, these topics offer information on
adding nodes to a cluster and configuring node availability.
You can also use the command line to perform all scaling operations.
Scale operation
Command
Obtain the join token
Run the following command on a manager node to obtain the join token
that is required for cluster scaling. Use either worker or manager for the
<node-type>:
docker swarm join-token <node-type>
Configure a custom listen address
Specify the address and port where the new node listens for inbound
cluster management traffic:
Mirantis Kubernetes Engine (MKE) offers support for a Key
Management Service (KMS) plugin that allows access to third-party secrets
management solutions, such as Vault. MKE uses this plugin to facilitate
access from Kubernetes clusters.
MKE will not health check, clean up, or otherwise manage the KMS plugin. Thus,
you must deploy KMS before a machine becomes an MKE manager, or else the
node may be considered unhealthy.
Use MKE to configure the KMS plugin. MKE maintains
ownership of the Kubernetes EncryptionConfig file, in which the KMS plugin is
configured for Kubernetes. MKE does not check the file contents following
deployment.
MKE adds new configuration options to the cluster configuration table.
Configuration of these options takes place through the API and not the MKE web
UI.
The following table presents the configuration options for the KMS plugin, all
of which are optional.
Parameter
Type
Description
kms_enabled
bool
Sets MKE to configure a KMS plugin.
kms_name
string
Name of the KMS plugin resource (for example, vault).
kms_endpoint
string
Path of the KMS plugin socket. The path must refer to a UNIX socket on
the host (for example, /tmp/socketfile.sock). MKE bind mounts this
file to make it accessible to the API server.
kms_cachesize
int
Number of data encryption keys (DEKs) to cache in the clear.
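As an illustration, the following sketch reviews and updates these options
through the MKE configuration API. It assumes an admin bearer token in
AUTHTOKEN and the config-toml endpoint; adjust the host, authentication, and
values to your environment:
# Download the current MKE configuration.
curl -ks -H "Authorization: Bearer $AUTHTOKEN" https://<mke-host>/api/ucp/config-toml -o mke-config.toml
# Edit mke-config.toml to set kms_enabled, kms_name, kms_endpoint, and
# kms_cachesize, then upload the modified configuration.
curl -ks -H "Authorization: Bearer $AUTHTOKEN" --upload-file mke-config.toml https://<mke-host>/api/ucp/config-toml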
Mirantis Kubernetes Engine (MKE) can use local network drivers to orchestrate
your cluster. You can create a config network with a driver such as macvlan,
and use this network in the same way as any other named network in MKE. In
addition, if the network is set up as attachable, you can attach containers to it.
Warning
Encrypting communication between containers on different nodes only works
with overlay networks.
To create a node-specific network for use with MKE, always do so through MKE,
using either the MKE web UI or the CLI with an admin bundle. If you create such
a network without MKE, it will not have the correct access label and
it will not be available in MKE.
In the left-side navigation menu, click Swarm > Networks.
Click Create to call the Create Network screen.
Select macvlan from the Drivers dropdown.
Enter macvlan into the Name field.
Select the type of network to create, Network or
Local Config.
If you select Local Config, the SCOPE is
automatically set to Local. You subsequently select the nodes
for which to create the Local Config from those listed. MKE will prefix
the network with the node name for each selected node to ensure consistent
application of access labels, and you then select a Collection
for the Local Configs to reside in. All Local Configs with
the same name must be in the same collection, or MKE returns an error. If
you do not select a Collection, the network is placed in
your default collection, which is / in a new MKE installation.
If you select Network, the SCOPE is automatically
set to Swarm. Choose an existing Local Config from
which to create the network. The network and its labels and collection
placement are inherited from the related Local Configs.
The self-deployed MKE Cluster Root CA server issues certificates for MKE
cluster nodes and internal components that enable the components to communicate
with each other. The server also issues certificates that are used in admin
client bundles.
To rotate the certificate material of the MKE Cluster Root CA or provide your
own certificate and private key:
Caution
If there are unhealthy nodes in the cluster, CA rotation will be unable to
complete. If rotation seems to be hanging, run docker node ls
--format "{{.ID}} {{.Hostname}} {{.Status}} {{.TLSStatus}}" to determine
whether any nodes are down or are otherwise unable to rotate TLS
certificates.
MKE Cluster Root CA server is coupled with Docker Swarm Root CA, as MKE
nodes are also swarm nodes. Thus, if users want to rotate the Docker Swarm
Root CA certificate, they must not use the docker swarm ca
command in any form as it may break their MKE cluster.
Rotating MKE Cluster Root CA causes several MKE components to restart,
which can result in cluster downtime. As such, Mirantis recommends
performing such rotations outside of peak business hours.
You should only rotate the MKE Cluster Root CA certificate for reasons of
security, a good example being if the certificate has been compromised.
The MKE Cluster Root CA certificate is valid for 20 years, thus rotation is
typically not necessary.
You must use the MKE CLI to rotate the existing root CA certificate or to
provide your own root CA certificate and private key:
SSH into one of the manager nodes of your cluster.
Make a backup prior to making changes to MKE
Cluster Root CA.
The self-deployed MKE etcd Root CA server issues certificates for MKE
components that enable the components to communicate with etcd cluster.
Important
If you upgraded your cluster from any version of MKE prior to MKE 3.7.2, the
etcd root CA will not be unique. To ensure the uniqueness of the etcd root
CA, rotate the etcd CA material using the instructions herein.
To rotate the certificate material of the MKE etcd Root CA or provide your
own certificate and private key:
Caution
Rotating MKE etcd Root CA causes several MKE components to restart,
which can result in cluster downtime. As such, Mirantis recommends
performing such rotations outside of peak business hours.
Other than the aforementioned purpose of ensuring the uniqueness of the
etcd root CA, you should only rotate the MKE etcd Root CA certificate for
reasons of security, a good example being if the certificate has been
compromised. The MKE etcd Root CA certificate is valid for 20 years, thus
rotation is typically not necessary.
You must use the MKE CLI to rotate the existing root CA certificate and private
key:
SSH into one of the manager nodes of your cluster.
Make a backup prior to making changes to MKE
etcd Root CA.
MKE deploys the MKE Client Root CA server to act as the default signer of the
Kubernetes Controller Manager, while also signing TLS certificates for
non-admin client bundles. In addition, this CA server is used by default when
accessing MKE API using HTTPS.
Note
To replace the MKE Client Root CA server with an external CA for MKE API use
only, refer to Use your own TLS certificates.
To rotate the existing root CA certificate or provide your own certificate
and private key:
Caution
As rotating the MKE Client Root CA invalidates all previously created
non-admin client bundles, you will need to recreate these bundles
following the rotation.
You should only rotate the MKE Client Root CA certificate for reasons of
security, a good example being if the certificate has been compromised.
The MKE Client Root CA certificate is valid for 20 years, thus rotation is
typically not necessary.
You must use the MKE CLI to rotate the existing root CA certificate or to
provide your own root CA certificate and private key:
SSH into one of the manager nodes of your cluster.
Make a backup prior to making changes to MKE
Client Root CA.
To ensure all communications between clients and MKE are encrypted, all MKE
services are exposed using HTTPS. By default, this is done using
self-signed TLS certificates that are not trusted by client tools such as
web browsers. Thus, when you try to access MKE, your browser warns that it
does not trust MKE or that MKE has an invalid certificate.
You can configure MKE to use your own TLS certificates. As a result, your
browser and other client tools will trust your MKE installation.
Mirantis recommends that you make this change outside of peak business hours.
Your applications will continue to run normally, but existing MKE client
certificates will become invalid, and thus users will have to download new
certificates to access MKE from the CLI.
To configure MKE to use your own TLS certificates and keys:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Certificates.
Upload your certificates and keys based on the following table.
Note
All keys and certificates must be uploaded in PEM format.
Type
Description
Private key
The unencrypted private key for MKE. This key must correspond to the
public key used in the server certificate. This key does not use a
password.
Click Upload Key to upload a PEM file.
Server certificate
The MKE public key certificate, which establishes a chain of trust up
to the root CA certificate. It is followed by the certificates of any
intermediate certificate authorities.
Click Upload Certificate to upload a PEM file.
CA certificate
The public key certificate of the root certificate authority that
issued the MKE server certificate. If you do not have a CA
certificate, use the top-most intermediate certificate instead.
Click Upload CA Certificate to upload a PEM file.
Client CA
This field may contain one or more Root CA certificates that the MKE
controller uses to verify that client certificates are issued by a
trusted entity.
Click Upload CA Certificate to upload a PEM file.
Click Download MKE Server CA Certificate to download the
certificate as a PEM file.
Note
MKE is automatically configured to trust its internal CAs, which
issue client certificates as part of generated client bundles.
However, you may supply MKE with additional custom root CA
certificates using this field to enable MKE to trust the client
certificates issued by your corporate or trusted third-party
certificate authorities. Note that your custom root certificates
will be appended to MKE internal root CA certificates.
Click Save.
After replacing the TLS certificates, your users will not be able to
authenticate with their old client certificate bundles. Ask your users
to access the MKE web UI and download new client certificate
bundles.
Mirantis offers its own image registry, Mirantis Secure Registry (MSR), which
you can use to store and manage the images that you deploy to your cluster.
This topic describes how to use MKE to push the official WordPress image to MSR
and later deploy that image to your cluster.
To create an MSR image repository:
Log in to the MKE web UI.
From the left-side navigation panel, navigate to
<user name> > Admin Settings > Mirantis Secure Registry.
In the Installed MSRs section, capture the MSR URL for your
cluster.
In a new browser tab, navigate to the MSR URL captured in the previous step.
From the left-side navigation panel, click Repositories.
Click New repository.
In the namespace field under New Repository, select
the required namespace. The default namespace is your user name.
In the name field under New Repository, enter the
name wordpress.
To create the repository, click Save.
To push an image to MSR:
In this example, you will pull the official WordPress image from Docker Hub,
tag it, and push it to MSR. Once pushed to MSR, only authorized users will
be able to make changes to the image. Pushing to MSR requires CLI access to
a licensed MSR installation.
Pull the public WordPress image from Docker Hub:
docker pull wordpress
Tag the image, using the IP address or DNS name of your MSR instance. For
example:
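For illustration, assuming the admin namespace used elsewhere in this example
and an MSR instance reachable at <msr-url>:<port>, the commands might look as
follows:
docker login <msr-url>:<port>
docker tag wordpress:latest <msr-url>:<port>/admin/wordpress:latest
docker push <msr-url>:<port>/admin/wordpress:latest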
The Deployment object YAML specifies your MSR image in the Pod
template spec: image: <msr-url>:<port>/admin/wordpress:latest. Also,
the YAML file defines a NodePort service that exposes the WordPress
application so that it is accessible from outside the cluster.
Click Create. Creating the new Kubernetes objects will open the
Controllers page.
After a few seconds, verify that wordpress-deployment has a
green status icon and is thus successfully deployed.
When you add a node to your cluster, by default its workloads are managed
by Swarm. Changing the default orchestrator does not affect existing nodes
in the cluster. You can also change the orchestrator type for individual
nodes in the cluster.
The workloads on your cluster can be scheduled by Kubernetes, Swarm, or a
combination of the two. If you choose to run a mixed cluster, be aware that
different orchestrators are not aware of each other, and thus there is no
coordination between them.
Mirantis recommends that you decide which orchestrator you will use when
initially setting up your cluster. Once you start deploying workloads, avoid
changing the orchestrator setting. If you do change the node orchestrator,
your workloads will be evicted and you will need to deploy them again using the
new orchestrator.
Caution
When you promote a worker node to be a manager, its orchestrator type
automatically changes to Mixed. If you later demote that node to be
a worker, its orchestrator type remains as Mixed.
Note
The default behavior for Mirantis Secure Registry (MSR) nodes is to run in
the Mixed orchestration mode. If you change the MSR orchestrator type to
Swarm or Kubernetes only, reconciliation will revert the node back to the
Mixed mode.
When you change the node orchestrator, existing workloads are
evicted and they are not automatically migrated to the new orchestrator.
You must manually migrate them to the new orchestrator. For example, if you
deploy WordPress on Swarm, and you change the node orchestrator to
Kubernetes, MKE does not migrate the workload, and WordPress continues
running on Swarm. You must manually migrate your WordPress deployment to
Kubernetes.
The following table summarizes the results of changing a node
orchestrator.
Workload
Orchestrator-related change
Containers
Containers continue running on the node.
Docker service
The node is drained and tasks are rescheduled to another node.
Pods and other imperative resources
Imperative resources continue running on the node.
Deployments and other declarative resources
New declarative resources will not be scheduled on the node and
existing ones will be rescheduled at a time that can vary based on
resource details.
If a node is running containers and you change the node to Kubernetes,
the containers will continue running and Kubernetes will not be aware of
them. This is functionally the same as running the node in the Mixed mode.
Warning
The Mixed mode is not intended for production use and it may impact
the existing workloads on the node.
This is because the two orchestrator types have different views of
the node resources and they are not aware of the other orchestrator
resources. One orchestrator can schedule a workload without knowing
that the node resources are already committed to another workload
that was scheduled by the other orchestrator. When this happens, the
node can run out of memory or other resources.
Mirantis strongly recommends against using the Mixed mode in production
environments.
To change the node orchestrator using the MKE web UI:
Log in to the MKE web UI as an administrator.
From the left-side navigation panel, navigate to
Shared Resources > Nodes.
Click the node that you want to assign to a different orchestrator.
In the upper right, click the Edit Node icon.
In the Details pane, in the Role section under
ORCHESTRATOR TYPE, select either Swarm,
Kubernetes, or Mixed.
Warning
Mirantis strongly recommends against using the Mixed mode in
production environments.
Click Save to assign the node to the selected orchestrator.
To change the node orchestrator using the CLI:
Set the orchestrator on a node by assigning the orchestrator labels,
com.docker.ucp.orchestrator.swarm or
com.docker.ucp.orchestrator.kubernetes to true.
Change the node orchestrator. Select from the following options:
You must first add the target orchestrator label and then remove
the old orchestrator label. Doing this in the reverse order can
fail to change the orchestrator.
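For example, the following commands move a node to the Kubernetes
orchestrator by adding the target label before removing the old one; replace
<node-id> with the node ID or hostname:
docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
docker node update --label-rm com.docker.ucp.orchestrator.swarm <node-id>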
Verify the value of the orchestrator label by inspecting the node:
docker node inspect <node-id> | grep -i orchestrator
Example output:
"com.docker.ucp.orchestrator.kubernetes":"true"
Important
The com.docker.ucp.orchestrator label is not displayed in the MKE web UI
Labels list, which presents in the Overview pane for
each node.
MKE administrators can filter the view of Kubernetes objects by the
namespace that the objects are assigned to, specifying a single namespace
or all available namespaces. This topic describes how to deploy services to two
newly created namespaces and then view those services, filtered by namespace.
To create two namespaces:
Log in to the MKE web UI as an administrator.
From the left-side navigation panel, click Kubernetes.
Click Create to open the Create Kubernetes Object
page.
Leave the Namespace drop-down blank.
In the Object YAML editor, paste the following YAML code:
Click Create to deploy the service in the green
namespace.
To view the newly created services:
In the left-side navigation panel, click Namespaces.
In the upper-right corner, click the Set context for all
namespaces toggle. The indicator in the left-side navigation panel under
Namespaces changes to All Namespaces.
Click Services to view your services.
Filter the view by namespace:
In the left-side navigation panel, click Namespaces.
Hover over the blue namespace and click Set Context.
The indicator in the left-side navigation panel under
Namespaces changes to blue.
Click Services to view the app-service-blue service.
Note that the app-service-green service does not display.
Perform the foregoing steps on the green namespace to view only the
services deployed in the green namespace.
MKE is designed to facilitate high availability (HA). You can join multiple
manager nodes to the cluster, so that if one manager node fails, another one
can automatically take its place without impacting the cluster.
Including multiple manager nodes in your cluster allows you to handle manager
node failures and load-balance user requests across all manager nodes.
The following table exhibits the relationship between the number of manager
nodes used and the number of faults that your cluster can tolerate:
Manager nodes
Failures tolerated
1
0
3
1
5
2
For deployment into production environments, follow these best practices:
For HA with minimal network overhead, Mirantis recommends using three manager
nodes and a maximum of five. Adding more manager nodes than this can lead to
performance degradation, as configuration changes must be replicated across
all manager nodes.
You should bring failed manager nodes back online as soon as possible, as
each failed manager node decreases the number of failures that your cluster
can tolerate.
You should distribute your manager nodes across different availability
zones. This way your cluster can continue working even if an entire
availability zone goes down.
MKE allows you to add or remove nodes from your cluster as your needs change
over time.
Because MKE leverages the clustering functionality provided by
Mirantis Container Runtime (MCR), you use the docker swarm join
command to add more nodes to your cluster. When you join a new node, MCR
services start running on the node automatically.
You can add both Linux manager and
worker nodes to your cluster.
Prior to adding a node that was previously a part of the same MKE cluster or
a different one, you must run the following command to remove any stale MKE
volumes:
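A sketch of such a cleanup follows; it assumes that the stale volume names
carry the ucp prefix, so review the listed volumes before removing anything:
docker volume ls --filter name=ucp
docker volume rm $(docker volume ls --quiet --filter name=ucp)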
You can promote worker nodes to managers to make MKE fault tolerant. You can
also demote a manager node into a worker node.
Log in to the MKE web UI.
In the left-side navigation panel, navigate to
Shared Resources > Nodes and select the required node.
In the upper right, select the Edit Node icon.
In the Role section, click Manager or
Worker.
Click Save and wait until the operation completes.
Navigate to Shared Resources > Nodes and verify the new node
role.
Note
If you are load balancing user requests to MKE across multiple manager
nodes, you must remove these nodes from the load-balancing pool when
demoting them to workers.
MKE allows you to add or remove nodes from your cluster as your needs change
over time.
Because MKE leverages the clustering functionality provided by
Mirantis Container Runtime (MCR), you use the docker swarm join
command to add more nodes to your cluster. When you join a new node, MCR
services start running on the node automatically.
The following features are not yet supported using Windows Server:
Category
Feature
Networking
Encrypted networks are not supported. If you have upgraded from a
previous version of MKE, you will need to recreate an unencrypted
version of the ucp-hrm network.
Secrets
When using secrets with Windows services, Windows stores temporary
secret files on your disk. You can use BitLocker on the volume
containing the Docker root directory to encrypt the secret data at
rest.
When creating a service that uses Windows containers, the options
to specify UID, GID, and mode are not supported for secrets.
Secrets are only accessible by administrators and users with system
access within the container.
Mounts
On Windows, Docker cannot listen on a Unix socket. Use TCP or a
named pipe instead.
If the cluster is deployed in a site that is offline, sideload MKE images
onto the Windows Server nodes. For more information, refer to
Install MKE offline.
On a manager node, list the images that are required on Windows nodes:
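The exact command depends on your MKE version. The following sketch assumes
the MKE bootstrapper image and its images --list option; verify the syntax
against the MKE CLI reference for your release:
docker container run --rm mirantis/ucp:<version> images --list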
After joining multiple manager nodes for high availability (HA), you can
configure your own load balancer to balance user requests across all
manager nodes.
Use of a load balancer allows users to access MKE using a centralized domain
name. The load balancer can detect when a manager node fails and stop
forwarding requests to that node, so that users are unaffected by the failure.
By default, both MKE and Mirantis Secure Registry (MSR) use port 443. If you
plan to deploy MKE and MSR together, your load balancer must
distinguish traffic between the two by IP address or port number.
If you want MKE and MSR both to use port 443, then you must either use separate
load balancers for each or use two virtual IPs. Otherwise, you must configure
your load balancer to expose MKE or MSR on a port other than 443.
Two-factor authentication (2FA) adds an extra layer of security when logging
in to the MKE web UI. Once enabled, 2FA requires the user to submit an
additional authentication code generated on a separate mobile device along
with their user name and password at login.
MKE 2FA requires the use of a time-based one-time password (TOTP)
application installed on a mobile device to generate a time-based
authentication code for each login to the MKE web UI. Examples of such
applications include 1Password,
Authy, and
LastPass Authenticator.
To configure 2FA:
Install a TOTP application to your mobile device.
In the MKE web UI, navigate to My Profile > Security.
Toggle the Two-factor authentication control to
enabled.
Open the TOTP application and scan the offered QR code. The device will
display a six-digit code.
Enter the six-digit code in the offered field and click
Register. The TOTP application will save your MKE account.
Important
A set of recovery codes displays in the MKE web UI when two-factor
authentication is enabled. Save these codes in a safe location, as they
can be used to access the MKE web UI if for any reason the
configured mobile device becomes unavailable. Refer to
Recover 2FA for details.
Once 2FA is enabled, you will need to provide an authentication code each time
you log in to the MKE web UI. Typically, the TOTP application installed on your
mobile device generates the code and refreshes it every 30 seconds.
Access the MKE web UI with 2FA enabled:
In the MKE web UI, click Sign in. The Sign in page
will display.
Enter a valid user name and password.
Access the MKE code in the TOTP application on your mobile device.
Enter the current code in the 2FA Code field in the MKE web UI.
Note
Multiple authentication failures may indicate a lack of synchronization between the mobile device clock and the mobile provider.
If the mobile device with authentication codes is unavailable, you can
re-access MKE using any of the recovery codes that display in the MKE web UI
when 2FA is first enabled.
To recover 2FA:
Enter one of the recovery codes when prompted for the two-factor
authentication code upon login to the MKE web UI.
Navigate to My Profile > Security.
Disable 2FA and then re-enable it.
Open the TOTP application and scan the offered QR code. The device will
display a six-digit code.
Enter the six-digit code in the offered field and click
Register. The TOTP application will save your MKE account.
If there are no recovery codes to draw from, ask your system administrator to
disable 2FA in order to regain access to the MKE web UI. Once done, repeat the
Configure 2FA procedure to reinstate 2FA
protection.
MKE administrators are not able to re-enable 2FA for users.
You can configure MKE so that a user account is temporarily blocked from
logging in following a series of unsuccessful login attempts. The account
lockout feature only prevents login attempts that are made using basic
authorization or LDAP. Login attempts using either SAML or OIDC do not trigger
the account lockout feature. Admin accounts are never locked.
Account lockouts expire after a set amount of time, after which the affected
user can log in as normal. Subsequent login attempts on a locked
account do not extend the lockout period. Login attempts against a locked
account always cause a standard incorrect credentials error, providing no
indication to the user that the account is locked. Only MKE admins can see
account lockout status.
Set the following parameters in the auth.account_lock section of the MKE
configuration file:
Set the value of enabled to true.
Set the value of failureTriggers to the number of failed log in
attempts that can be made before an account is locked.
Set the value of durationSeconds to the desired lockout duration. A
value of 0 indicates that the account will remain locked until it is
unlocked by an administrator.
The account remains locked until the specified amount of time has elapsed.
Otherwise, you must either have an administrator unlock the account or globally
disable the account lockout feature.
To unlock a locked account:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
Access Control > Users and select the user who is locked out of
their account.
Click the gear icon in the upper right corner.
Navigate to the Security tab.
Note
An expired account lock only resets once a new log in attempt is made.
Until such time, the account will present as locked to administrators.
Using kubelet node profiles, you can customize your kubelet settings at a
node-by-node level, rather than setting cluster-wide flags that apply to
all of your kubelet agents.
Note
MKE does not currently support kubelet node profiles on Windows nodes.
Once you have added the new kubelet node profiles to the MKE configuration file and uploaded the
file to MKE, you can apply the profiles to your nodes.
Any changes you make to a kubelet node profile will instantly affect the nodes
on which the profile is in use. As such, Mirantis strongly recommends that you
first test any modifications in a limited scope, by creating a new profile with
the modifications and applying it to a small number of nodes.
Warning
Misconfigured modifications made to a kubelet node profile that is in use by
a large number of cluster nodes can result in those nodes becoming
nonoperational.
Example scenario:
You have defined the following kubelet node profiles in the MKE configuration
file:
Apply the new lowtest label to a
small set of test nodes.
Once the profile is verified on your
test nodes, remove lowtest from the profile definition and update
low to use the updated --kube-reserved=cpu value.
Configure Graceful Node Shutdown with kubelet node profiles
Available since MKE 3.7.12
To configure Graceful Node Shutdown grace periods in MKE cluster, set the
following flags in the [cluster_config.custom_kubelet_flags_profiles]
section of the MKE configuration file:
--shutdown-grace-period=0s
--shutdown-grace-period-critical-pods=0s
The GracefulNodeShutdown feature gate is enabled by default, with
shutdown grace period parameters both set to 0s.
When you add your custom kubelet profiles, insert and set the GracefulNodeShutdown
flags in the MKE configuration file. For example:
The Graceful Node Shutdown feature may present various issues.
Missing kubelet inhibitors and ucp-kubelet errors
A Graceful Node Shutdown configuration of --shutdown-grace-period=60s --shutdown-grace-period-critical-pods=50s can result in the following error
message:
The error message indicates missing kubelet inhibitors and ucp-kubelet errors,
due to the current default InhibitDelayMaxSec setting of 30s in
the operating system.
You can resolve the issue either by changing the InhibitDelayMaxSec
parameter setting to a larger value or by removing it.
The configuration file that contains the InhibitDelayMaxSec parameter
setting can be located in any one of a number of locations:
Graceful node drain does not occur and the pods are not terminated
Due to the systemd PrepareForShutdown signal not being sent to dbus, in
some operating system distributions graceful node drain does not occur and the
pods are not terminated.
Currently, in the following cases, the PrepareForShutdown signal is
triggered and the Graceful Node Shutdown feature works as intended:
If you delete a kubelet node profile from the MKE configuration file, the nodes
that are using that profile will enter an erroneous state. For this reason, MKE
prevents users from deleting any kubelet node profile that is in use by a
cluster node. It is a best practice, though, to verify before deleting any
profile that it is not in use.
Example scenario:
To check whether any nodes are using a previously defined low profile, run:
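Because profiles are applied to nodes as labels, as in the example scenario
above, one way to perform the check is to list the labels of every node and
search for the profile name. The command below is a sketch only and simply
greps the node label sets for low:
docker node ls --quiet | xargs docker node inspect --format '{{ .Description.Hostname }} {{ .Spec.Labels }}' | grep low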
The result indicates that one node is using the low kubelet node profile.
That node should be cleared of the profile before the profile is deleted.
Any time there is an issue with your cluster, OpsCare routes notifications from
your MKE deployment to Mirantis support engineers. These company personnel will
then either directly resolve the problem or arrange to troubleshoot the matter
with you.
To configure OpsCare you must first obtain a Salesforce username, password, and
environment ID from your Mirantis Customer Success Manager. You then store
these credentials as Swarm secrets using the following naming convention:
User name: sfdc_opscare_api_username
Password: sfdc_opscare_api_password
Environment ID: sfdc_environment_id
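For illustration, the following sketch stores the credentials as Swarm
secrets on a manager node. The placeholder values are assumptions, and
echo -n prevents a trailing newline from being stored:
echo -n '<salesforce-username>' | docker secret create sfdc_opscare_api_username -
echo -n '<salesforce-password>' | docker secret create sfdc_opscare_api_password -
echo -n '<environment-id>' | docker secret create sfdc_environment_id -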
Note
Every cluster that uses OpsCare must have its own unique
sfdc_environment_id.
OpsCare requires that MKE has access to mirantis.my.salesforce.com
on port 443.
Any custom certificates in use must contain all of the manager node
private IP addresses.
The provided Salesforce credentials are not associated with the Mirantis
support portal login, but are for OpsCare alerting only.
OpsCare uses a predefined group of MKE alerts
to notify your Customer Success Manager of problems with your deployment. This
alerts group is identical to those seen in any MKE cluster that is provisioned
by Mirantis Container Cloud. A single watchdog alert serves to verify the
proper function of the OpsCare alert pipeline as a whole.
To verify that the OpsCare alerts are functioning properly:
You must disable OpsCare before you can delete the three secrets in use.
To disable OpsCare:
Log in to the MKE web UI.
Using the left-side navigation panel, navigate to
<username> > Admin Settings > Usage.
Toggle the Enable Ops Care slider to the left.
Alternatively, you can disable OpsCare by changing the ops_care entry in
the MKE configuration file to false.
Configure cluster and service networking in an existing cluster
On systems that use the managed CNI, you can switch existing clusters to either
kube-proxy with ipvs proxier or eBPF mode.
MKE does not support switching kube-proxy in an existing cluster from ipvs
proxier to iptables proxier, nor does it support disabling eBPF mode after it
has been enabled. Using a CNI that supports both cluster and service networking
requires that you disable kube-proxy.
Refer to Cluster and service networking options in the MKE Installation
Guide for information on how to configure cluster and service networking at
install time.
Caution
The configuration changes described here cannot be reversed. As such,
Mirantis recommends that you make a cluster backup, drain your workloads,
and take your cluster offline prior to performing any of these changes.
Caution
Swarm workloads that require the use of encrypted overlay networks must use
iptables proxier. Be aware that the other networking options detailed here
automatically disable Docker Swarm encrypted overlay networks.
To switch an existing cluster to kube-proxy with ipvs proxier while using the
managed CNI:
Optional. Configure the following ipvs-related parameters in the MKE
configuration file (otherwise, MKE will use the Kubernetes default parameter
settings):
ipvs_exclude_cidrs=""
ipvs_min_sync_period=""
ipvs_scheduler=""
ipvs_strict_arp=false
ipvs_sync_period=""
ipvs_tcp_timeout=""
ipvs_tcpfin_timeout=""
ipvs_udp_timeout=""
For more information on using these parameters, refer to kube-proxy
in the Kubernetes documentation.
To switch an existing cluster to eBPF mode while using the managed CNI:
Verify that the prerequisites for eBPF use have been met, including kernel
compatibility, for all Linux manager and worker nodes. Refer to the Calico
documentation Enable the eBPF dataplane for
more information.
If the count returned by the command does not quickly converge at 0,
check the ucp-kube-proxy logs on the nodes where either of the
following took place:
Verify that the ucp-kube-proxy container started on all nodes, that
the kube-proxy cleanup took place, and that ucp-kube-proxy did not
launch kube-proxy.
If the count returned by the command does not quickly converge at 0,
check the ucp-kube-proxy logs on the nodes where either of the
following took place:
MKE administrators can schedule the cleanup of unused images, whitelisting
which images to keep. To determine which images will be removed, they can
perform a dry run prior to setting the image-pruning schedule.
etcd is a consistent, distributed key-value store that provides a
reliable way to store data that needs to be accessed by a distributed system or
cluster of machines. It handles leader elections during network
partitions and can tolerate machine failure, even in the leader node.
For MKE, etcd serves as the Kubernetes backing store for all cluster data, with
an etcd replica deployed on each MKE manager node. This is a primary reason why
Mirantis recommends that you deploy an odd number of MKE manager nodes, as etcd
uses the Raft consensus algorithm and thus requires that a quorum of
nodes agree on any updates to the cluster state.
You can control the etcd distributed key-value storage quota using the
etcd_storage_quota parameter in the MKE configuration file. By default, the value of the parameter is 2GB.
For information on how to adjust the parameter, refer to
Configure an MKE cluster.
If you choose to increase the etcd quota, be aware that this quota has a limit
and such action should be used in conjunction with other strategies, such as
decreasing events TTL to ensure that the etcd database does not run out of
space.
Important
If a manager node virtual machine runs out of disk space, or if all of its
system memory is depleted, etcd can cause the MKE cluster to move into an
irrecoverable state. To prevent this from happening, configure the disk
space and the memory of the manager node VMs to levels that are well in
excess of the set etcd storage quota. Be aware, though, that warning banners
will display in the MKE web UI if the etcd storage quota is set at an amount
in excess of 40% of system memory.
Kubernetes events are generated in response to changes within Kubernetes
resources, such as nodes, Pods, or containers. These events are created with a
time to live (TTL), after which they are automatically cleaned up. Should it
happen, however, that a large amount of Kubernetes events are generated or
other cluster issues arise, it may be necessary to manually clean up the
Kubernetes events to prevent etcd from exceeding its quota. MKE offers an API
that you can use to directly clean up event objects within your cluster, with
which you can specify whether all events should be deleted or only those that
have a certain TTL.
Note
The etcd cleanup API is a preventative measure only. If etcd already
exceeds the established quota MKE may no longer be operational, and as a
result the API will not work.
To trigger etcd cleanup:
Issue a POST to the https://MKE_HOST/api/ucp/etcd/cleanup endpoint.
You can specify two parameters:
dryRun
Sets whether to issue a dry cleanup run instead of the production run. A dry
run returns a list of etcd keys (Kubernetes events) that will be
deleted without actually deleting them. Defaults to false.
MinTTLToKeepSeconds
Sets the minimum TTL to retain, meaning that only events with a lower TTL
are deleted. By default, all events are deleted regardless of TTL.
Mirantis recommends that you adjust these parameters based on the size of
the etcd database and the amount of time that has elapsed since the last
cleanup.
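A hedged sketch of such a request follows; it assumes an admin bearer token
in AUTHTOKEN and that the parameters are accepted as query parameters, so
check the MKE API reference for the exact request format:
curl -ks -H "Authorization: Bearer $AUTHTOKEN" -X POST "https://MKE_HOST/api/ucp/etcd/cleanup?dryRun=true&MinTTLToKeepSeconds=3600"
A response resembling the following reports the cleanup status: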
{"CleanupInProgress":false,
"CleanupResult":"Cluster Cleanup finished & Revisions Compacted. Issue a cluster defrag to permanently clear up space.",
"DefragInProgress":false,
"DefragResult":"",
"MemberInfo":[{"MemberID":16494148364752423721,
"Endpoint":"<https://172.31.47.35:12379",>
"EtcdVersion":"3.5.6",
"DbSize":"1 MB",
"IsLeader":true,
"Alarms":null
}]}
The CleanupResult field in the response indicates any issues that arise.
It also indicates when the cleanup is finished.
Note
Although the etcd cleanup process deletes the keys, you must run an
etcd defragmentation to release the storage
space used by those keys. The defragmentation is a blocking operation, and
as such it is not run automatically but must be run in order for the cleanup
to release space back to the filesystem.
The etcd distributed key-value store retains a history of its keyspace. That
history is set for compaction following a specified number of revisions,
however it only releases the used space back to the host filesystem following
defragmentation. For more information, refer to the
etcd documentation.
With MKE you can defragment the etcd cluster while avoiding cluster outages. To
do this, you apply defragmentation to etcd members one at a time. MKE will
defragment the current etcd leader last, to prevent the triggering of multiple
leader elections.
Important
In a High Availability (HA) cluster, the defragmentation process subtly
affects cluster dynamics, because when a node undergoes defragmentation it
temporarily leaves the pool of active nodes. This subsequent reduction in
the active node count results in a proportional increase of the load on the
remaining nodes, which can lead to performance degradation if the remaining
nodes do not have the capacity to handle the additional load. In addition,
at the end of the process, when the leader node is undergoing
defragmentation, there is a brief period during which cluster write
operations do not take place. This pause occurs when the system initiates
and completes the leader election process, and though it is automated and
brief it does result in a momentary write block on the cluster.
Taking the described factors into account, Mirantis recommends taking a
cautious scheduling approach in defragmenting HA clusters. Ideally, the
defragmentation should occur during planned maintenance windows rather than
relying on a recurring cron job, as during such periods you can closely
monitor potential impacts on performance and availability and mitigate as
necessary.
To defragment the etcd cluster:
Trigger the etcd cluster defragmentation by issuing a POST to the
https://MKE_HOST/api/ucp/etcd/defrag endpoint.
You can specify two parameters:
timeoutSeconds
Sets how long MKE waits for each member to finish defragmentation.
Default: 60 seconds. MKE will cancel the defragmentation if the timeout
occurs before the member defragmentation completes.
pauseSeconds
Sets how long MKE waits between each member defragmentation. Default: 60
seconds.
Mirantis recommends that you adjust these parameters based on the size of
the etcd database and the amount of time that has elapsed since the last
defragmentation.
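A hedged sketch of such a request follows, again assuming an admin bearer
token in AUTHTOKEN and query-parameter form; check the MKE API reference for
the exact request format:
curl -ks -H "Authorization: Bearer $AUTHTOKEN" -X POST "https://MKE_HOST/api/ucp/etcd/defrag?timeoutSeconds=60&pauseSeconds=60"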
You can monitor this endpoint until the defragmentation is complete. The
information is also available in the ucp-controller logs.
To manually remove the etcd defragmentation lock file:
To maintain etcd cluster availability, MKE uses a lock file that prevents
multiple defragmentations from being simultaneously implemented. MKE removes
the lock file at the conclusion of defragmentation, however you can manually
remove it as necessary.
Manually remove the lock file by running the following command:
A NOSPACE alarm is issued in the event that etcd runs low on storage space, to
protect the cluster from further writes. Once this low storage space state is
reached, etcd will respond to all write requests with the
mvcc: database space exceeded error message until the issue is rectified.
When MKE detects the NOSPACE alarm condition, it displays a critical banner to
inform administrators. In addition, MKE restarts etcd with an increased value
for the etcd datastore quota, thus allowing administrators to resolve the
NOSPACE alarm without interference.
The CORRUPT alarm is issued when a cluster corruption is detected by etcd. MKE
cluster administrators are informed of the condition by way of a critical
banner. To resolve such an issue, contact Mirantis Support
and refer to the official etcd documentation regarding data corruption
recovery.
Hybrid Windows clusters concurrently run two versions of Windows Server, with
one version deployed on one set of nodes and the second version deployed on a
different set of nodes. The Windows versions that MKE supports are
Windows Server 2019 and Windows Server 2022.
A Windows Server 2019 node cannot run a container that uses a Windows Server
2022 image.
For a Windows Server 2022 node to run a container that uses a Windows Server
2019 image, you must run the container with Hyper-V isolation. Refer to
the Microsoft documentation Hyper-V isolation for containers
for more information.
Mirantis recommends that you use the same version of Windows Server for both
your container images and for the node on which the containers run. For
reference purposes, in both Kubernetes and Swarm clusters, MKE assigns a label
to Windows nodes that includes the Windows Server version.
To run Windows workloads in a hybrid Windows Kubernetes cluster, you must
target your workloads to nodes that are running the correct Windows version.
Failure to correctly target your workloads may result in an error
when Kubernetes schedules the Pod on an incompatible node:
Create a deployment with the appropriate node selectors. Use 10.0.17763
for Windows Server 2019 workloads and 10.0.20348 for Windows Server
2022 workloads.
For example purposes, paste the following content into a file called
win2019-deployment.yaml:
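A minimal sketch of such a file follows, written here as a shell heredoc. The
Deployment name, image, and replica count are assumptions; the node selector
uses the standard kubernetes.io/os and node.kubernetes.io/windows-build
labels:
cat > win2019-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: win2019-webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: win2019-webserver
  template:
    metadata:
      labels:
        app: win2019-webserver
    spec:
      nodeSelector:
        kubernetes.io/os: windows
        node.kubernetes.io/windows-build: '10.0.17763'
      containers:
      - name: webserver
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
        ports:
        - containerPort: 80
EOF
You can then create the Deployment with kubectl apply -f win2019-deployment.yaml
from an admin client bundle.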
To run Windows workloads in a hybrid Windows Swarm cluster, you must target
your workloads to nodes that are running the correct Windows version.
Failure to correctly target your workloads may result in an operating system
mismatch error.
Verify that nodes running the appropriate Windows version are present in the
cluster. Use an OsVersion label of 10.0.17763 for Windows Server
2019 and 10.0.20348 for Windows Server 2022. For example:
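One way to check, assuming the version is stored in the node label set, is to
inspect each Windows node and search for the OsVersion value:
docker node inspect <node-name> | grep -i osversion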
Create a service that runs the required version of Windows
Server, in this case Windows Server 2022. The service requires the inclusion
of various constraints, to ensure that it is scheduled on the correct node.
For example:
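The following sketch creates such a service. The service name, image, and
command are assumptions; node.platform.os is a standard Swarm placement
constraint, and the node.labels.OsVersion constraint assumes that the
OsVersion value is stored as a node label in your cluster:
docker service create \
  --name win2022-demo \
  --constraint 'node.platform.os == windows' \
  --constraint 'node.labels.OsVersion == 10.0.20348' \
  mcr.microsoft.com/windows/servercore:ltsc2022 \
  ping -t localhost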
With NodeLocalDNS you can run a local instance of the DNS caching agent on each
node in the MKE cluster. This can significantly improve cluster performance,
compared to relying on a centralized CoreDNS instance to resolve external DNS
records, as a local NodeLocalDNS instance can cache DNS results and eliminate
the network latency factor.
The NodeLocalDNS feature is enabled and disabled through the MKE configuration
file, comprehensive information for which is available at
Use an MKE configuration file.
Before installing NodeLocalDNS in an MKE cluster, you must verify the
following settings in MKE:
The unmanaged CNI plugin is not enabled.
kube-proxy is running in iptables mode.
Note
If you are running MKE on RHEL, CentOS, or Rocky Linux, review
Troubleshoot NodeLocalDNS to learn of issues that NodeLocalDNS has with
these operating systems and their corresponding fixes.
Result
NAME                   READY   STATUS    RESTARTS   AGE    IP              NODE                 NOMINATED NODE   READINESS GATES
node-local-dns-k9gns   1/1     Running   0          116s   172.31.44.29    ubuntu-18-ubuntu-1   <none>           <none>
node-local-dns-zskp9   1/1     Running   0          116s   172.31.32.242   ubuntu-18-ubuntu-0   <none>           <none>
curl http://localhost:9253/metrics | grep coredns_cache_hits_total
# HELP coredns_cache_hits_total The count of cache hits.
# TYPE coredns_cache_hits_total counter
coredns_cache_hits_total{server="dns://10.96.0.10:53",type="denial",view="",zones="."} 8
coredns_cache_hits_total{server="dns://10.96.0.10:53",type="denial",view="",zones="cluster.local."} 71
coredns_cache_hits_total{server="dns://10.96.0.10:53",type="success",view="",zones="."} 8
coredns_cache_hits_total{server="dns://10.96.0.10:53",type="success",view="",zones="cluster.local."} 53
coredns_cache_hits_total{server="dns://169.254.0.10:53",type="success",view="",zones="."} 6
MKE allows administrators to authorize users to view, edit, and use cluster
resources by granting role-based permissions for specific resource sets. This
section describes how to configure all the relevant components of role-based
access control (RBAC).
This topic describes how to create organizations, teams, and users.
Note
Individual users can belong to multiple teams but a team can belong to
only one organization.
New users have a default permission level that you can extend by adding
the user to a team and creating grants. Alternatively, you can make the
user an administrator to extend their permission level.
In addition to integrating with LDAP services, MKE provides built-in
authentication. You must manually create users to use MKE built-in
authentication.
Once you enable LDAP you can sync your LDAP directory to the teams and users
that are present in MKE.
To enable LDAP:
Log in to the MKE web UI as an MKE administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Authentication & Authorization.
Scroll down to the Identity Provider Integration section.
Toggle LDAP to Enabled. A list of LDAP settings
displays.
Enter the values that correspond with your LDAP server installation.
Use the built-in MKE LDAP Test login tool to confirm that your
LDAP settings are correctly configured.
To synchronize LDAP users into MKE teams:
In the left-side navigation panel, navigate to
Access Control > Orgs & Teams and select an organization.
Click + to create a team.
Enter a team name and description.
Toggle ENABLE SYNC TEAM MEMBERS to Yes.
Choose between the following two methods for matching group members from an
LDAP directory. Refer to the table below for more information.
Keep the default Match Search Results method and fill out
Search Base DN, Search filter, and
Search subtree instead of just one level as required.
Toggle LDAP MATCH METHOD to change the method for matching
group members in the LDAP directory to Match Group Members.
Optional. Select Immediately Sync Team Members to run an LDAP
sync operation after saving the configuration for the team.
Optional. To allow non-LDAP team members to sync the LDAP directory, select
Allow non-LDAP members.
Note
If you do not select Allow non-LDAP members, manually-added
and SAML users are removed during the LDAP sync.
Click Create.
Repeat the preceding steps to synchronize LDAP users into additional teams.
There are two methods for matching group members from an LDAP directory:
Bind method
Description
Match Search Results (search bind)
Specifies that team members are synced using a search
query against the LDAP directory of your organization. The team
membership is synced to match the users in the search results.
Search Base DN
The distinguished name of the node in the directory tree where the
search starts looking for users.
Search filter
Filter to find users. If empty, existing users in the search scope are
added as members of the team.
Search subtree instead of just one level
Defines search through the full LDAP tree, not just one level, starting
at the base DN.
Match Group Members (direct bind)
Specifies that team members are synced directly with
members of a group in your LDAP directory. The team
membership syncs to match the membership of the group.
Group DN
The distinguished name of the group from which you select users.
Group Member Attribute
The value of this attribute corresponds to the distinguished
names of the members of the group.
Roles define a set of API operations permitted for a resource set. You apply
roles to users and teams by creating grants. Roles have the following important
characteristics:
Roles are always enabled.
Roles cannot be edited. To change a role, you must delete it and create a
new role with the changes you want to implement.
To delete roles used within a grant, you must first delete the grant.
Only administrators can create and delete roles.
This topic explains how to create custom Swarm roles and describes default and
Swarm operations roles.
None
Users have no access to Swarm or Kubernetes resources. Maps to the
No Access role in UCP 2.1.x.
View Only
Users can view resources but cannot create them.
Restricted Control
Users can view and edit resources but cannot run a service or container
in a way that affects the node where it is running. Users cannot mount a
node directory, exec into containers, or run containers in
privileged mode or with additional kernel capabilities.
Scheduler
Users can view worker and manager nodes and schedule, but not view,
workloads on these nodes. By default, all users are granted the
Scheduler role for the Shared collection. To
view workloads, users need Container View permissions.
Full Control
Users can view and edit all granted resources. They can create
containers without any restriction, but cannot see the containers of
other users.
To learn how to apply a default role using a grant, refer to
Create grants.
The following describes the set of operations (calls) that you can
execute to the Swarm resources. Each permission corresponds to a CLI command
and enables the user to execute that command. Refer to the Docker CLI documentation
for a complete list of commands and examples.
Operation
Command
Description
Config
docker config
Manage Docker configurations.
Container
docker container
Manage Docker containers.
Container
docker container create
Create a new container.
Container
docker create [OPTIONS] IMAGE [COMMAND] [ARG...]
Create new containers.
Container
docker update [OPTIONS] CONTAINER [CONTAINER...]
Update configuration of one or more containers. Using this command can
also prevent containers from consuming too many resources from their
Docker host.
Container
docker rm [OPTIONS] CONTAINER [CONTAINER...]
Remove one or more containers.
Image
docker image COMMAND
Manage Docker images.
Image
docker image remove
Remove one or more images.
Network
docker network
Manage networks. You can use child commands to create, inspect, list,
remove, prune, connect, and disconnect networks.
Node
docker node COMMAND
Manage Swarm nodes.
Secret
docker secret COMMAND
Manage Docker secrets.
Service
docker service COMMAND
Manage services.
Volume
docker volume create [OPTIONS] [VOLUME]
Create a new volume that containers can consume and store data in.
Volume
docker volume rm [OPTIONS] VOLUME [VOLUME...]
Remove one or more volumes. Users cannot remove a volume that is in use
by a container.
MKE enables access control to cluster resources by grouping them into two types
of resource sets: Swarm collections (for Swarm workloads) and Kubernetes
namespaces (for Kubernetes workloads). Refer to Role-based access control for
a description of the difference between Swarm collections and Kubernetes
namespaces. Administrators use grants to combine resource sets, giving users
permission to access specific cluster resources.
Users assign resources to collections with labels. The following resource types
have editable labels and thus you can assign them to collections: services,
nodes, secrets, and configs. For these resources types, change
com.docker.ucp.access.label to move a resource to a different collection.
Collections have generic names by default, but you can assign them meaningful
names as required (such as dev, test, and prod).
Note
The following resource types do not have editable labels and thus you cannot
assign them to collections: containers, networks, and volumes.
Groups of resources identified by a shared label are called stacks. You can
place one stack of resources in multiple collections. MKE automatically places
resources in the default collection. Users can change this using a specific
com.docker.ucp.access.label in the stack/compose file.
The system uses com.docker.ucp.collection.* to enable efficient
resource lookup. You do not need to manage these labels, as MKE controls them
automatically. Nodes have the following labels set to true by
default:
Each user has a default collection, which can be changed in the MKE
preferences.
To deploy resources, they must belong to a collection. When a user deploys
a resource without using an access label to specify its collection, MKE
automatically places the resource in the default collection.
Default collections are useful for the following types of users:
Users who work only on a well-defined portion of the system
Users who deploy stacks but do not want to edit the contents of their
compose files
Custom collections are appropriate for users with more complex roles in the
system, such as administrators.
Note
For those using Docker Compose, the system applies default collection labels
across all resources in the stack unless you explicitly set
com.docker.ucp.access.label.
This topic describes how to group and isolate cluster resources into swarm
collections and Kubernetes namespaces.
Log in to the MKE web UI as an administrator and complete the following steps:
To create a Swarm collection:
Navigate to Shared Resources > Collections.
Click View Children next to Swarm.
Click Create Collection.
Enter a collection name and click Create.
To move a resource to a different collection:
In the left-side navigation panel, navigate to the resource type you want to
move and click it. As an example, navigate to and click on Shared
Resources > Nodes.
Click the node you want to move to display the information window for that
node.
Click the slider icon at the top right of the information window to
display the edit dialog for the node.
Scroll down to Labels and change the
com.docker.ucp.access.label swarm label to the name of your collection.
Note
Optionally, you can navigate to Collection in the left-side
navigation panel and select the collection to which you want to move the
resource.
To create a Kubernetes namespace:
Navigate to Kubernetes > Namespaces and click
Create.
MKE administrators create grants to control how users and organizations access
resource sets. A grant defines user permissions to access resources. Each grant
associates one subject with one role and one resource set. For example, you can
grant the Prod Team Restricted Control over services in the
/Production collection.
The following is a common workflow for creating grants:
Create the grant subjects, such as users, teams, organizations, or service accounts.
Define custom roles (or use defaults) by adding
permitted API operations per type of resource.
Create grants by combining subject, role, and resource set.
Note
This section assumes that you have created the relevant objects for the
grant, including the subject, role, and resource set (Kubernetes namespace
or Swarm collection).
To create a Kubernetes grant:
Log in to the MKE web UI.
Navigate to Access Control > Grants.
Select the Kubernetes tab and click
Create Role Binding.
Under Subject, select Users,
Organizations, or Service Account.
For Users, select the user from the pull-down menu.
For Organizations, select the organization and, optionally,
the team from the pull-down menu.
For Service Account, select the namespace and service account
from the pull-down menu.
Click Next to save your selections.
Under Resource Set, toggle the switch labeled Apply
Role Binding to all namespaces (Cluster Role Binding).
Click Next.
Under Role, select a cluster role.
Click Create.
To create a Swarm grant:
Log in to the MKE web UI.
Navigate to Access Control > Grants.
Select the Swarm tab and click Create Grant.
Under Subject, select Users or
Organizations.
For Users, select a user from the pull-down menu.
For Organizations, select the organization and, optionally,
the team from the pull-down menu.
Click Next to save your selections.
Under Resource Set, click View Children until the
required collection displays.
Click Select Collection next to the required collection.
Click Next.
Under Role, select a role type from the drop-down menu.
Click Create.
Note
MKE places new users in the docker-datacenter organization by default.
To apply permissions to all MKE users, create a grant with the
docker-datacenter organization as a subject.
By default, only administrators can pull images into a cluster managed by
MKE. This topic describes how to give non-administrator users permission
to pull images.
Images are always in the swarm collection, as they are a shared resource.
Grant users the Image Create permission for the Swarm collection to
allow them to pull images.
To grant a user permission to pull images:
Log in to the MKE web UI as an administrator.
Navigate to Access Control > Roles.
Select the Swarm tab and click Create.
On the Details tab, enter Pull images for the role name.
On the Operations tab, select Image Create from the
IMAGE OPERATIONS drop-down.
Click Create.
Navigate to Access Control > Grants.
Select the Swarm tab and click Create Grant.
Under Subject, click Users and select the required
user from the drop-down.
Click Next.
Under Resource Set, select the Swarm collection and
click Next.
Under Role, select Pull images from the drop-down.
This topic describes how to reset passwords for users and administrators.
To change a user password in MKE:
Log in to the MKE web UI with administrator credentials.
Click Access Control > Users.
Select the user whose password you want to change.
Click the gear icon in the top right corner.
Select Security from the left navigation.
Enter the new password, confirm that it is correct, and click
Update Password.
Note
For users managed with an LDAP service, you must change user passwords on
the LDAP server.
To change an administrator password in MKE:
SSH to an MKE manager node and run:
docker run --net=host -v ucp-auth-api-certs:/tls -it \
  "$(docker inspect --format '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' ucp-auth-api)" \
  "$(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' ucp-auth-api)" \
  passwd -i
Optional. If you have DEBUG set as your global log level within MKE, running
$(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}')
returns --debug instead of --db-addr. Pass Args 1 to docker inspect instead
to reset your administrator password:
docker run --net=host -v ucp-auth-api-certs:/tls -it \
  "$(docker inspect --format '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' ucp-auth-api)" \
  "$(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 1 }}' ucp-auth-api)" \
  passwd -i
Note
Alternatively, ask another administrator to change your password.
This topic describes how to grant two teams access to separate volumes in two
different resource collections such that neither team can see the volumes of
the other team. MKE allows you to do this even if the volumes are on the same
nodes.
To create two teams:
Log in to the MKE web UI.
Navigate to Orgs & Teams.
Create two teams in the engineering
organization named Dev and Prod.
To create grants for controlling access to the new volumes:
Create a grant for the Dev team to access
the dev-volumes collection with the
Restricted Control built-in role.
Create a grant for the Prod team to
access the prod-volumes collection with the
Restricted Control built-in role.
To create a volume as a team member:
Log in as one of the users on the Dev team.
Navigate to Swarm > Volumes and click Create.
On the Details tab, name the new volume dev-data.
On the Collection tab, navigate to the dev-volumes
collection and click Create.
Log in as one of the users on the Prod team.
Navigate to Swarm > Volumes and click Create.
On the Details tab, name the new volume prod-data.
On the Collection tab, navigate to the prod-volumes
collection and click Create.
As a result, the user on the Prod team cannot see the
Dev team volumes, and the user on the Dev team cannot
see the Prod team volumes. MKE administrators can see all of the
volumes created by either team.
You can use MKE to physically isolate resources by organizing nodes into
collections and granting Scheduler access for different users. Control
access to nodes by moving them to dedicated collections where you can grant
access to specific users, teams, and organizations.
The following tutorials explain how to isolate nodes using Swarm and
Kubernetes.
This tutorial explains how to give a team access to a node collection and a
resource collection. MKE access control ensures that team members cannot view
or use Swarm resources that are not in their collection.
Note
You need an MKE license and at least two worker nodes to complete this
tutorial.
The following is a high-level overview of the steps you will take to isolate
cluster nodes:
Create an Ops team and assign a user to it.
Create a Prod collection for the team node.
Assign a worker node to the Prod collection.
Grant the Ops teams access to its collection.
To create a team:
Log in to the MKE web UI.
Create a team named Ops in your organization.
Add a user to the team who is not an administrator.
To create the team collections:
In this example, the Ops team uses a collection for its assigned
nodes and another for its resources.
The Prod collection is for the worker nodes and the
Webserver sub-collection is for an application that you will deploy
on the corresponding worker nodes.
To move a worker node to a different collection:
Note
MKE places worker nodes in the Shared collection by default, and
it places those running MSR in the System collection.
Navigate to Shared Resources > Nodes to view all of the nodes in
the swarm.
Find a node located in the Shared collection. You cannot move
worker nodes that are assigned to the System collection.
Click the slider icon on the node details page.
In the Labels section on the Details tab, change
com.docker.ucp.access.label from /Shared to /Prod.
Click Save to move the node to the Prod collection.
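You can likely make the same change from the command line by updating the node
label directly. The following is a minimal sketch, assuming <node-name> is the
name reported by docker node ls:

# Sketch: move a node to the /Prod collection by setting its access label
docker node update --label-add com.docker.ucp.access.label="/Prod" <node-name>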
To create two grants for team access to the two collections:
Create a grant for the Ops team to access
the Webserver collection with the built-in
Restricted Control role.
Create a grant for the Ops team to access
the Prod collection with the built-in Scheduler
role.
The cluster is now set up for node isolation. Users with access to nodes in the
Prod collection can deploy Swarm services and Kubernetes apps. They
cannot, however, schedule workloads on nodes that are not in the collection.
To deploy a Swarm service as a team member:
When a user deploys a Swarm service, MKE assigns its resources to the
default collection. As a user on the Ops team, set
Webserver to be your default collection.
Note
From the resource target collection, MKE walks up the ancestor
collections until it finds the highest ancestor that the user has
Scheduler access to. MKE schedules tasks on any nodes in the
tree below this ancestor. In this example, MKE assigns the user service to
the Webserver collection and schedules tasks on nodes in the
Prod collection.
Log in as a user on the Ops team.
Navigate to Shared Resources > Collections.
Navigate to the Webserver collection.
Under the vertical ellipsis menu, select Set to default.
Navigate to Swarm > Services and click Create to
create a Swarm service.
Name the service NGINX, enter nginx:latest in the Image*
field, and click Create.
Click the NGINX service when it turns green.
Scroll down to TASKS, click the NGINX container,
and confirm that it is in the Webserver collection.
Navigate to the Metrics tab on the container page, select the
node, and confirm that it is in the Prod collection.
Note
An alternative approach is to use a grant instead of changing the
default collection. An administrator can create a grant for a role that has
the Service Create permission for the Webserver
collection or a child collection. In this case, the user sets the value of
com.docker.ucp.access.label to the new collection or one of its
children that has a Service Create grant for the required user.
This topic describes how to use a Kubernetes namespace to deploy a Kubernetes
workload to worker nodes using the MKE web UI.
MKE uses the scheduler.alpha.kubernetes.io/node-selector annotation key to
assign node selectors to namespaces. Assigning the name of the node selector
to this annotation pins all applications deployed in the namespace to the nodes
that have the given node selector specified.
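The following sketch shows what such a namespace definition could look like when
applied with kubectl. The namespace name ops-namespace and the node label
zone=prod are hypothetical examples, not MKE defaults:

# Sketch: pin all workloads in the namespace to nodes labeled zone=prod
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: ops-namespace
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: zone=prod
EOF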
To isolate cluster nodes with Kubernetes:
Create a Kubernetes namespace.
Note
You can also associate nodes with a namespace by providing the namespace
definition information in a configuration file.
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to Kubernetes and
click Create to open the Create Kubernetes Object
page.
This tutorial explains how to set up a complete access architecture for a
fictitious company called OrcaBank.
OrcaBank is reorganizing their application teams by product with each team
providing shared services as necessary. Developers at OrcaBank perform their
own DevOps and deploy and manage the lifecycle of their applications.
OrcaBank has four teams with the following resource needs:
Security needs view-only access to all applications in the
cluster.
DB (database) needs full access to all database applications and
resources.
Mobile needs full access to their mobile applications and limited
access to shared DB services.
Payments needs full access to their payments applications and
limited access to shared DB services.
OrcaBank is taking advantage of the flexibility in the MKE grant model by
applying two grants to each application team. One grant allows each team to
fully manage the apps in their own collection, and the second grant gives them
the (limited) access they need to networks and secrets within the
db collection.
The resulting access architecture has applications connecting across collection
boundaries. By assigning multiple grants per team, the Mobile and Payments
applications teams can connect to dedicated database resources through a secure
and controlled interface, leveraging database networks and secrets.
Note
MKE deploys all resources across the same group of worker nodes while
providing the option to segment nodes.
To set up a complete access control architecture:
Set up LDAP/AD integration and create the required teams.
OrcaBank will standardize on LDAP for centralized authentication to help
their identity team scale across all the platforms they manage.
To implement LDAP authentication in MKE, OrcaBank is using the MKE native
LDAP/AD integration to map LDAP groups directly to MKE teams. You can add or
remove users from MKE teams via LDAP, which the OrcaBank identity team will
centrally manage.
Define an Ops role that allows users to perform
all operations against configs, containers, images, networks, nodes,
secrets, services, and volumes.
Define a View & Use Networks + Secrets role
that enables users to view and connect to networks and view and use
secrets used by DB containers, but that prevents them from seeing or
impacting the DB applications themselves.
Note
You will also use the built-in ViewOnly role that allows users to
see all resources, but not edit or use them.
Create the required Swarm collections.
All OrcaBank applications share the same physical resources, so all nodes
and applications are configured in collections that nest under the built-in
Shared collection.
/Shared/mobile to host all mobile applications and
resources.
/Shared/payments to host all payments applications and
resources.
/Shared/db to serve as a top-level collection for all db
resources.
/Shared/db/mobile to hold db resources for
mobile applications.
/Shared/db/payments to hold db resources for
payments applications.
Note
The OrcaBank grant composition will ensure that the Swarm collection
architecture gives the DB team access to all db resources and
restricts app teams to shared db resources.
Create the required grants:
For the Security team, create grants
to access the following collections with the View Only
built-in role: /Shared/mobile, /Shared/payments,
/Shared/db, /Shared/db/mobile, and
/Shared/db/payments.
For the DB team, create grants to
access the /Shared/db, /Shared/db/mobile, and
/Shared/db/payments collections with the Ops
custom role.
For the Mobile team, create a grant to
access the /Shared/mobile collection with the Ops
custom role.
For the Mobile team, create a grant to
access the /Shared/db/mobile collection with the
View & Use Networks + Secrets custom role.
For the Payments team, create a grant
to access the /Shared/payments collection with the
Ops custom role.
For the Payments team, create a grant
to access the /Shared/db/payments collection with the
View & Use Networks + Secrets custom role.
Set up access control architecture with additional security requirements
In the previous tutorial, you assigned multiple grants to resources across
collection boundaries on a single platform. In this tutorial, you will
implement the following stricter security requirements for the fictitious
company, OrcaBank:
OrcaBank is adding a staging zone to their deployment model, deploying
applications first from development, then from staging, and finally from
production.
OrcaBank will no longer permit production applications to share any physical
infrastructure with non-production infrastructure. They will use node access
control to segment application scheduling and access.
Note
Node access control is an MKE feature that provides secure multi-tenancy
with node-based isolation. Use it to place nodes in different collections
so that you can schedule and isolate resources on disparate physical or
virtual hardware. For more information, refer to Isolate nodes.
OrcaBank will still use its three application teams from the previous tutorial
(DB, Mobile, and Payments) but with
varying levels of segmentation between them. The new access architecture will
organize the MKE cluster into staging and production collections with separate
security zones on separate physical infrastructure.
The four OrcaBank teams now have the following production and staging needs:
Security needs view-only access to all applications in
production and no access to staging.
DB needs full access to all database applications and resources
in production and no access to staging.
In both production and staging, Mobile needs full access to their
applications and limited access to shared DB services.
In both production and staging, Payments needs full access to
their applications and limited access to shared DB services.
The resulting access architecture will provide physical segmentation between
production and staging using node access control.
Applications are scheduled only on MKE worker nodes in the dedicated
application collection. Applications use shared resources across
collection boundaries to access the databases in the /prod/db
collection.
To set up a complete access control architecture with additional security
requirements:
Verify LDAP, teams, and roles are set up properly:
Verify LDAP is enabled and syncing. If it is not, configure that now.
Verify the following teams are present in your organization:
Security, DB, Mobile, and
Payments, and if they are not, create them.
Verify that there is a View & Use Networks + Secrets role. If
there is not, define a
View & Use Networks + Secrets role that enables users to view and
connect to networks and view and use secrets used by DB
containers. Configure the role so that it prevents those who use it from
seeing or impacting the DB applications themselves.
Note
You will also use the following built-in roles:
View Only allows users to see but not edit all cluster
resources.
Full Control allows users complete control of all
collections granted to them. They can also create containers without
restriction but cannot see the containers of other users. This role
will replace the custom Ops role from the previous
tutorial.
Create the required Swarm collections.
In the previous tutorial, OrcaBank created separate collections for each
application team and nested them all under /Shared.
To meet their new security requirements for production, OrcaBank will add
top-level prod and staging collections with
mobile and payments application collections nested
underneath. The prod collection (but not the staging
collection) will also include a db collection with a second set
of mobile and payments collections nested
underneath.
OrcaBank will also segment their nodes such that the production and staging
zones will have dedicated nodes, and in production each application will be
on a dedicated node.
Create the required grants as described in Create grants:
For the Security team, create grants
to access the following collections with the View Only
built-in role: /prod, /prod/mobile,
/prod/payments, /prod/db,
/prod/db/mobile, and /prod/db/payments.
For the DB team, create grants to
access the following collections with the Full Control
built-in role: /prod/db, /prod/db/mobile, and
/prod/db/payments.
For the Mobile team, create grants to
access the /prod/mobile and /staging/mobile
collections with the Full Control built-in role.
For the Mobile team, create a grant to
access the /prod/db/mobile collection with the
View & Use Networks + Secrets custom role.
For the Payments team, create grants
to access the /prod/payments and
/staging/payments collections with the
Full Control built-in role.
For the Payments team, create a grant
to access the /prod/db/payments collection with the
View & Use Networks + Secrets custom role.
Prior to upgrading MKE, review the MKE release notes for information that may be relevant to the upgrade
process.
In line with your MKE upgrade, you should plan to upgrade the Mirantis
Container Runtime (MCR) instance on each cluster node to version 20.10.0 or
later. Mirantis recommends that you schedule the upgrade for non-business hours
to ensure minimal user impact.
Important
Do not make changes to your MKE configuration while upgrading, as doing so
can cause misconfiguration.
MKE uses semantic versioning. While downgrades are not supported, Mirantis
supports upgrades according to the following rules:
When you upgrade from one patch version to another, you can skip patch
versions as no data migration takes place between patch versions.
When you upgrade between minor releases, you cannot skip releases. You can,
however, upgrade from any patch version from the previous minor release to
any patch version of the subsequent minor release.
When you upgrade between major releases, you cannot skip releases.
Warning
Upgrading from one MKE minor version to another minor version can result in
a downgrading of MKE middleware components. For more information, refer to
the component listings in the release notes of both the source and target
MKE versions.
Available as of MKE 3.7.0
MKE supports automated rollbacks. As such, if an MKE upgrade fails for any
reason, the system automatically reverts to the previously running MKE version,
thus ensuring that the cluster remains in a usable state.
Note
Rollback will be automatically initiated in the event that any step of the
upgrade process does not progress within 20 minutes.
The automated rollbacks feature is enabled by default. To opt out of the
function, refer to the MKE upgrade CLI command
documentation.
Azure installations have additional prerequisites. Refer to
Install MKE on Azure for more information.
To perform storage verifications:
Verify that no more than 70% of /var/ storage is used. If more than 70%
is used, allocate enough storage to meet this requirement. Refer to
MKE hardware requirements for the minimum and recommended storage requirements.
Verify whether any node local file systems have disk storage issues,
including MSR backend storage, for example, NFS.
Verify that you are using Overlay2 storage drivers, as they are more stable.
If you are not, you should transition to Overlay2 at this time.
Transitioning from device mapper to Overlay2 is a destructive rebuild.
To perform operating system verifications:
Patch all relevant packages to the most recent cluster node operating
system version, including the kernel.
Perform rolling restart of each node to confirm in-memory settings are the
same as startup scripts.
After performing rolling restarts, run check-config.sh on each cluster
node to check for kernel compatibility issues.
To perform procedural verifications:
Perform Swarm, MKE, and MSR backups.
Gather Compose, service, and stack files.
Generate an MKE support bundle for this specific point in time.
Preinstall MKE, MSR, and MCR images. If your cluster does not have an
Internet connection, Mirantis provides tarballs containing all the
required container images. If your cluster does have an Internet connection,
pull the required container images onto your nodes:
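For example, on a node with Internet access you can likely list the required
images with the MKE bootstrapper and pull them. The image tag below is an
example and should match your target MKE version:

# Sketch: pull every image the MKE bootstrapper reports as required
docker run --rm mirantis/ucp:3.7.16 images --list | xargs -L 1 docker image pull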
Load troubleshooting packages, for example, netshoot.
To upgrade MCR:
The MKE upgrade requires MCR 20.10.0 or later to be running on every cluster
node. If it is not, perform the following steps first on manager and then on
worker nodes:
Log in to the node using SSH.
Upgrade MCR to version 20.10.0 or later.
Using the MKE web UI, verify that the node is in a healthy state:
Log in to the MKE web UI.
Navigate to Shared Resources > Nodes.
Verify that the node is healthy and a part of the cluster.
Caution
Mirantis recommends upgrading in the following order: MCR, MKE, MSR. This
topic is limited to the upgrade instructions for MKE.
To perform cluster verifications:
Verify that your cluster is in a healthy state, as it will be easier to
troubleshoot should a problem occur.
Create a backup of your cluster, thus allowing you to recover should
something go wrong during the upgrade process.
Verify that the Docker engine is running on all MKE cluster nodes.
Note
You cannot use the backup archive during the upgrade process, as it is
version specific. For example, if you create a backup archive for
an MKE 3.4.2 cluster, you cannot use the archive file after you upgrade
to MKE 3.4.4.
If the MKE Interlock configuration is customized, the Interlock
component is managed by the user and thus cannot be upgraded using the
upgrade command. In such cases, Interlock must be manually
upgraded using Docker, as follows:
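A possible sketch of such a manual upgrade, assuming the default Interlock
service names and an example image tag that you would replace with the tag
matching your target MKE version:

# Sketch: manually update the Interlock service images
docker service update --image mirantis/ucp-interlock:3.7.16 ucp-interlock
docker service update --image mirantis/ucp-interlock-extension:3.7.16 ucp-interlock-extension
docker service update --image mirantis/ucp-interlock-proxy:3.7.16 ucp-interlock-proxy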
To upgrade MKE on machines that are not connected to the Internet, refer
to Install MKE offline to learn how to download the MKE package for
offline installation.
To manually interrupt the upgrade process, enter Control-C on the
terminal upon which you have initiated the upgrade bootstrapper. Doing so
will trigger an automatic rollback to the previous MKE version.
If no upgrade progress is made within 20 minutes, MKE will initiate a
rollback to the original version.
With all of these upgrade methods, manager nodes are automatically upgraded in
place. You cannot control the order of manager node upgrades. For each worker
node that requires an upgrade, you can upgrade that node in place or you can
replace the node with a new worker node. The type of upgrade you perform
depends on what is needed for each node.
Automated rollbacks are only supported when MKE is in control of the upgrade
process, which is while the upgrade containers are running. As such, the
feature scope is limited in terms of any failures encountered
during the Phased in-place cluster upgrade and Replace existing worker nodes
using blue-green deployment upgrade methods.
Phased in-place cluster upgrade
Automatically upgrades manager nodes and allows you to control the
upgrade order of worker nodes. This type of upgrade is more advanced
than the automated in-place cluster upgrade.
Automated rollback applies only if the failure occurs before or during the
manager node upgrade.
Replace existing worker nodes using blue-green deployment
This type of upgrade allows you to stand up a new cluster in parallel to
the current one and switch over when the upgrade is complete. It
requires that you join new worker nodes, schedule workloads to run on
them, pause, drain, and remove old worker nodes in batches (rather than
one at a time), and shut down servers to remove worker nodes. This is
the most advanced upgrade method.
Automated rollback applies only if the failure occurs before or during the
manager node upgrade.
Automated in-place cluster upgrade is the standard method for upgrading
MKE. It updates all MKE components on all nodes within the MKE cluster
one-by-one until the upgrade is complete, and thus it is not ideal for those
who need to upgrade their worker nodes in a particular order.
Verify that all MCR instances have been upgraded to the corresponding new
version.
SSH into one MKE manager node and run the following command (do not run this
command on a workstation with a client bundle):
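A typical invocation likely looks like the following sketch, where the image
tag is an example that should match your target MKE version:

# Sketch: run the MKE upgrade from a manager node
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 \
  upgrade --interactive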
The Phased in-place cluster upgrade method allows for granular
control of the MKE upgrade process by first upgrading a manager node and
thereafter allowing you to upgrade worker nodes manually in your preferred
order. This allows you to migrate workloads and control traffic while
upgrading. You can temporarily run MKE worker nodes with different versions of
MKE and MCR.
This method allows you to handle failover by adding additional worker node
capacity during an upgrade. You can add worker nodes to a partially-upgraded
cluster, migrate workloads, and finish upgrading the remaining worker nodes.
Verify that all MCR instances have been upgraded to the corresponding new
version.
SSH into one MKE manager node and run the following command (do not run this
command on a workstation with a client bundle):
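A sketch of the phased invocation, assuming the same example image tag as above
and adding the --manual-worker-upgrade flag described below:

# Sketch: upgrade manager nodes only and hold back the worker nodes
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 \
  upgrade --manual-worker-upgrade --interactive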
The --manual-worker-upgrade flag allows MKE to upgrade only the manager
nodes. It adds an upgrade-hold label to all worker nodes, which prevents
MKE from upgrading each worker node until you remove the label.
Optional. Join additional worker nodes to your cluster:
Replace existing worker nodes using blue-green deployment
The Replace existing worker nodes using blue-green deployment upgrade method
creates a parallel environment for a new deployment, which reduces downtime,
upgrades worker nodes without disrupting workloads, and allows you to migrate
traffic to the new environment with worker node rollback capability.
Note
You do not have to replace all worker nodes in the cluster at one time, but
can instead replace them in groups.
Verify that all MCR instances have been upgraded to the corresponding new
version.
SSH into one MKE manager node and run the following command (do not run this
command on a workstation with a client bundle):
The --manual-worker-upgrade flag allows MKE to upgrade only the manager
nodes. It adds an upgrade-hold label to all worker nodes, which prevents
MKE from upgrading each worker node until the label is removed.
This topic describes common problems and errors that occur during the upgrade
process and how to identify and resolve them.
To check for multiple conflicting upgrades:
The upgrade command automatically checks for multiple ucp-worker-agents,
the existence of which can indicate that the cluster is still undergoing a
prior manual upgrade. You must resolve the conflicting node labels before
proceeding with the upgrade.
To check Kubernetes errors:
For more information on anything that might have gone wrong during the upgrade
process, check Kubernetes errors in node state messages after the upgrade is
complete.
To circumvent the SLES 12 SP5 Calico CNI error:
Beginning with MKE 3.7.12, MKE cluster upgrades on SUSE Linux Enterprise Server
12 SP5 result in a Calico CNI Plugin Pod is Unhealthy error. You can
bypass this error by manually starting cri-dockerd:
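Assuming cri-dockerd is installed as a systemd unit named cri-docker, a minimal
sketch of the workaround is:

# Sketch: start cri-dockerd on the affected node (unit name is an assumption)
sudo systemctl enable --now cri-docker.service
sudo systemctl status cri-docker.service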
You can upgrade your cluster to use Windows Server 2022 nodes in one of two
ways. The approach that Mirantis recommends is to join nodes that have a fresh
installation of Windows Server 2022, whereas the alternative is to perform an
in-place upgrade of existing Windows Server 2019 nodes.
Approach #1 (Recommended): Join new Windows Server 2022 nodes
The preferred method for upgrading to Windows Server 2022 is to first add new
nodes that are set to run the new operating system, and then remove the Windows
Server 2019 nodes that the new nodes are meant to replace. You can do this by
adding all of the new nodes prior to removing their original counterparts, or
you can perform the operation one node at a time, as shown in the following
procedure:
In the left-side navigation panel, navigate to
Shared Resources > Nodes and select the required Windows Server
2019 node.
In the upper right, select the Edit Node icon.
In the Availability section, click Drain.
Click Save to evict the workloads from the node.
In the upper right, select the vertical ellipsis and click
Remove.
Click Confirm.
Note
If you are planning to run only Windows Server 2022 nodes, you can
remove any added constraints or nodeSelectors. If, though, you plan to run a
combination of Windows Server 2022 and Windows Server 2019 nodes, keep your
constraints or nodeSelectors in place and add them to any future workloads.
Refer to Operate a hybrid Windows cluster for more information.
Approach #2: Upgrade existing Windows Server nodes
While it is not recommended, you can upgrade to Windows Server 2022 by
performing an in-place upgrade of the existing Windows Server 2019 nodes.
If you are using a physical server, insert a drive that has the Windows
Server 2022 installation media installed. Otherwise, upload the ISO to
the server and mount the image.
Note
Windows core version users can mount the ISO in PowerShell using
Mount-DiskImage -ImagePath "path".
Navigate to the drive where the ISO is mounted and run setup.exe to
launch the setup wizard.
Once the upgrade completes, remove all the MKE images on the node and
re-pull them. Docker will automatically pull the image versions that are
built for Windows Server 2022.
If ucp-worker-agent-win is not running on the node, use Docker Swarm to
rerun the service on the node:
docker service update ucp-worker-agent-win-x
If ucp-worker-agent-win is still not running on the node, it could be due to
operating system mismatches, which can occur after failing to update
registry keys during the Windows upgrade process.
Review the output of the following command, looking for references to
Windows Server 2019 or build number 17763:
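One way to review this information is with docker info, for example:

# The Operating System line should no longer reference Windows Server 2019
# or OS build 17763
docker info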
When migrating manager nodes, Mirantis recommends that you replace one manager
node at a time to preserve fault tolerance and minimize performance impact.
This topic describes how to use both the MKE web UI and the CLI to deploy a
multi-service application for voting on whether you prefer cats or dogs.
To deploy a multi-service application using the MKE web UI:
Log in to the MKE web UI.
Navigate to Shared Resources > Stacks and click
Create Stack.
In the Name field, enter voting-app.
Under ORCHESTRATOR MODE, select Swarm Services and
click Next.
In the Add Application File editor, paste the following
application definition written in the docker-compose.yml format:
version:"3"services:# A Redis key-value store to serve as message queueredis:image:redis:alpineports:-"6379"networks:-frontend# A PostgreSQL database for persistent storagedb:image:postgres:9.4volumes:-db-data:/var/lib/postgresql/datanetworks:-backend# Web UI for votingvote:image:dockersamples/examplevotingapp_vote:beforeports:-5000:80networks:-frontenddepends_on:-redis# Web UI to count voting resultsresult:image:dockersamples/examplevotingapp_result:beforeports:-5001:80networks:-backenddepends_on:-db# Worker service to read from message queueworker:image:dockersamples/examplevotingapp_workernetworks:-frontend-backendnetworks:frontend:backend:volumes:db-data:
Click Create to deploy the stack.
In the list on the Shared Resources > Stacks page, verify that
the application is deployed by looking for voting-app. If the
application is in the list, it is deployed.
To view the individual application services, click voting-app
and navigate to the Services tab.
Cast votes by accessing the service on port 5000.
Caution
MKE does not support referencing external files when using the MKE web UI
to deploy applications, and thus does not support the following keywords:
build
dockerfile
env_file
You must use a version control system to store the stack definition used
to deploy the stack, as MKE does not store the stack definition.
To deploy a multi-service application using the MKE CLI:
Create a file named docker-compose.yml with the following
content:
version:"3"services:# A Redis key-value store to serve as message queueredis:image:redis:alpineports:-"6379"networks:-frontend# A PostgreSQL database for persistent storagedb:image:postgres:9.4volumes:-db-data:/var/lib/postgresql/datanetworks:-backendenvironment:-POSTGRES_PASSWORD=<password># Web UI for votingvote:image:dockersamples/examplevotingapp_vote:beforeports:-5000:80networks:-frontenddepends_on:-redis# Web UI to count voting resultsresult:image:dockersamples/examplevotingapp_result:beforeports:-5001:80networks:-backenddepends_on:-db# Worker service to read from message queueworker:image:dockersamples/examplevotingapp_workernetworks:-frontend-backendnetworks:frontend:backend:volumes:db-data:
This topic describes how to use both the CLI and a Compose file to deploy
application resources to a particular Swarm collection. Attach the Swarm
collection path to the service access label to assign the service to the
required collection. MKE automatically assigns new services to the default
collection unless you use either of the methods presented here to assign a
different Swarm collection.
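As a minimal CLI sketch of the label-based approach, with the service name and
collection path used purely as examples:

# Sketch: assign a service to the /Shared/wordpress collection at creation time
docker service create \
  --name wordpress \
  --label com.docker.ucp.access.label="/Shared/wordpress" \
  wordpress:latest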
Navigate to the Shared Resources > Stacks and click
Create Stack.
Name the application wordpress.
Under ORCHESTRATOR MODE, select Swarm Services and
click Next.
In the Add Application File editor, paste the Compose file.
Click Create to deploy the application.
Click Done when the deployment completes.
Note
MKE reports an error if the /Shared/wordpress collection does not
exist or if you do not have a grant for accessing it.
To confirm that the service deployed to the correct Swarm collection:
Navigate to Shared Resources > Stacks and select your
application.
Navigate to the Services tab and select the required
service.
On the details pages, verify that the service is assigned to the correct
Swarm collection.
Note
MKE creates a default overlay network for your stack that attaches to
each container you deploy. This works well for administrators and those
assigned full control roles. If you have lesser permissions, define a custom
network with the same com.docker.ucp.access.label label as your services
and attach this network to each service. This correctly groups your network
with the other resources in your stack.
This topic describes how to create and use secrets with MKE by showing you
how to deploy a WordPress application that uses a secret for storing a
plaintext password. Other sensitive information you might use a secret to store
includes TLS certificates and private keys. MKE allows you to securely store
secrets and configure who can access and manage them using role-based access
control (RBAC).
The application you will create in this topic includes the following two
services:
wordpress
Apache, PHP, and WordPress
wordpress-db
MySQL database
The following example stores a password in a secret, and the secret is stored
in a file inside the container that runs the services you will deploy. The
services have access to the file, but no one else can see the plaintext
password. To make things simple, you will not configure the database to persist
data, and thus when the service stops, the data is lost.
To create a secret:
Log in to the MKE web UI.
Navigate to Swarm > Secrets and click Create.
Note
After you create the secret, you will not be able to edit or see the
secret again.
Name the secret wordpress-password-v1.
In the Content field, assign a value to the secret.
Optional. Define a permission label so that other users can be given
permission to use this secret.
Note
To use services and secrets together, they must either have the same
permission label or no label at all.
To create a network for your services:
Navigate to Swarm > Networks and click Create.
Create a network called wordpress-network with the default settings.
To create the MySQL service:
Navigate to Swarm > Services and click
Create.
Under Service Details, name the service wordpress-db.
Under Task Template, enter mysql:5.7.
In the left-side menu, navigate to Network, click
Attach Network +, and select wordpress-network from
the drop-down.
In the left-side menu, navigate to Environment, click
Use Secret +, and select wordpress-password-v1 from
the drop-down.
Click Confirm to associate the secret with the
service.
Scroll down to Environment variables and click
Add Environment Variable +.
Enter the following string to create an environment variable that contains
the path to the password file in the container:
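Based on the MYSQL_ROOT_PASSWORD_FILE variable described later in this
procedure, the string is:

MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v1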
If you specified a permission label on the secret, you must set the
same permission label on this service.
Click Create to deploy the MySQL service.
This creates a MySQL service that is attached to the wordpress-network
network and that uses the wordpress-password-v1 secret. By default, this
creates a file with the same name in /run/secrets/<secret-name> inside the
container running the service.
We also set the MYSQL_ROOT_PASSWORD_FILE environment variable to
configure MySQL to use the content of the
/run/secrets/wordpress-password-v1 file as the root password.
To create the WordPress service:
Navigate to Swarm > Services and click
Create.
Under Service Details, name the service wordpress.
Under Task Template, enter wordpress:latest.
In the left-side menu, navigate to Network, click
Attach Network +, and select wordpress-network from
the drop-down.
In the left-side menu, navigate to Environment, click
Use Secret +, and select wordpress-password-v1 from
the drop-down.
Click Confirm to associate the secret with the
service.
Scroll down to Environment variables and click
Add Environment Variable +.
Enter the following string to create an environment variable that contains
the path to the password file in the container:
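Assuming the standard _FILE convention of the official WordPress image and the
secret created earlier, the string is likely:

WORDPRESS_DB_PASSWORD_FILE=/run/secrets/wordpress-password-v1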
Add another environment variable and enter the following string:
WORDPRESS_DB_HOST=wordpress-db:3306
If you specified a permission label on the secret, you must set the
same permission label on this service.
Click Create to deploy the WordPress service.
This creates a WordPress service that is attached to the same network as the
MySQL service so that they can communicate, and maps the port 80 of the
service to port 8000 of the cluster routing mesh.
Once you deploy this service, you will be able to access it on port 8000 using
the IP address of any node in your MKE cluster.
To update a secret:
If the secret is compromised, you need to change it, update the services
that use it, and delete the old secret.
Create a new secret named wordpress-password-v2.
From Swarm > Secrets, select the
wordpress-password-v1 secret to view all the services that you
need to update. In this example, it is straightforward, but that will not
always be the case.
Update wordpress-db to use the new secret.
Update the MYSQL_ROOT_PASSWORD_FILE environment variable with either
of the following methods:
Update the environment variable directly with the following:
Mount the secret file in /run/secrets/wordpress-password-v1 by setting
the Target Name field with wordpress-password-v1. This
mounts the file with the wordpress-password-v2 content in
/run/secrets/wordpress-password-v1.
Delete the wordpress-password-v1 secret and click Update.
Repeat the foregoing steps for the WordPress service.
MKE includes a system for application-layer (layer 7) routing that offers both
application routing and load balancing (ingress routing) for Swarm
orchestration. The Interlock architecture leverages Swarm components to provide
scalable layer 7 routing and Layer 4 VIP mode functionality.
Swarm mode provides MCR with a routing mesh, which enables users to access
services using the IP address of any node in the swarm. Layer 7 routing enables
you to access services through any node in the swarm by using a domain name,
with Interlock routing the traffic to the node with the relevant container.
Interlock uses the Docker remote API to automatically configure extensions such
as NGINX and HAProxy for application traffic. Interlock is designed for:
Full integration with MCR, including Swarm services, secrets, and configs
Enhanced configuration, including context roots, TLS, zero downtime
deployment, and rollback
Support through extensions for external load balancers, such as NGINX,
HAProxy, and F5
Least privilege for extensions, such that they have no Docker API access
Note
Interlock and layer 7 routing are used for Swarm deployments. Refer to
NGINX Ingress Controller for information on routing traffic to your Kubernetes
applications.
Interlock
The central piece of the layer 7 routing solution. The core service is
responsible for interacting with the Docker remote API and building an
upstream configuration for the extensions. Interlock uses the Docker API to
monitor events and to manage the extension and proxy services, and it serves
the upstream configuration over a gRPC API that the extensions are configured
to access.
Interlock manages extension and proxy service updates for both configuration
changes and application service deployments. There is no operator intervention
required.
The Interlock service starts a single replica on a manager node. The Interlock
extension service runs a single replica on any available node, and the
Interlock proxy service starts two replicas on any available node. Interlock
prioritizes replica placement in the following order:
Replicas on the same worker node
Replicas on different worker nodes
Replicas on any available nodes, including managers
Interlock extension
A secondary service that queries the Interlock gRPC API for the
upstream configuration. The extension service configures the proxy service
according to the upstream configuration. For proxy services that use files
such as NGINX or HAProxy, the extension service generates the file and sends
it to Interlock using the gRPC API. Interlock then updates the corresponding
Docker configuration object for the proxy service.
Interlock proxy
A proxy and load-balancing service that handles requests for the
upstream application services. Interlock configures these using the data
created by the corresponding extension service. By default, this service is a
containerized NGINX deployment.
All layer 7 routing components are failure-tolerant and leverage Docker Swarm
for high availability.
Automatic configuration
Interlock uses the Docker API for automatic configuration, without needing you
to manually update or restart anything to make services available. MKE
monitors your services and automatically reconfigures proxy services.
Scalability
Interlock uses a modular design with a separate proxy service, allowing an
operator to individually customize and scale the proxy layer to handle user
requests and meet services demands, with transparency and no downtime for
users.
TLS
You can leverage Docker secrets to securely manage TLS certificates and keys
for your services. Interlock supports both TLS termination and TCP
passthrough.
Context-based routing
Interlock supports advanced application request routing by context or path.
Host mode networking
Layer 7 routing leverages the Docker Swarm routing mesh by default, but
Interlock also supports running proxy and application services in host mode
networking, allowing you to bypass the routing mesh completely, thus promoting
maximum application performance.
Security
The layer 7 routing components that are exposed to the outside world run on
worker nodes, thus your cluster will not be affected if they are compromised.
SSL
Interlock leverages Docker secrets to securely store and use SSL certificates
for services, supporting both SSL termination and TCP passthrough.
Blue-green and canary service deployment
Interlock supports blue-green service deployment allowing an operator to
deploy a new application while the current version is serving. Once the new
application verifies the traffic, the operator can scale the older version to
zero. If there is a problem, the operation is easy to reverse.
Service cluster support
Interlock supports multiple extension and proxy service combinations, thus
allowing for operators to partition load balancing resources to be used, for
example, in region- or organization-based load balancing.
Least privilege
Interlock supports being deployed where the load balancing proxies do not need
to be colocated with a Swarm manager. This is a more secure approach to
deployment as it ensures that the extension and proxy services do not have
access to the Docker API.
When an application image is updated, the following actions occur:
The service is updated with a new version of the application.
The default “stop-first” policy stops the first replica before
scheduling the second. The Interlock proxies remove ip1.0 from the
backend pool as the app.1 task is removed.
The first application task is rescheduled with the new image after
the first task stops.
The interlock proxy.1 is then rescheduled with the new
NGINX configuration that contains the update for the new app.1 task.
After proxy.1 is complete, proxy.2 redeploys with the updated NGINX
configuration for the app.1 task.
In this scenario, the amount of time that the service is unavailable is
less than 30 seconds.
Swarm provides control over the order in which old tasks are removed
while new ones are created. This is controlled on the service-level with
--update-order.
stop-first (default)- Configures the currently updating task to
stop before the new task is scheduled.
start-first - Configures the current task to stop after the new
task has scheduled. This guarantees that the new task is running
before the old task has shut down.
Use start-first if …
You have a single application replica and you cannot have service
interruption. Both the old and new tasks run simultaneously during
the update, but this ensures that there is no gap in service during
the update.
Use stop-first if …
Old and new tasks of your service cannot serve clients
simultaneously.
You do not have enough cluster resourcing to run old and new replicas
simultaneously.
In most cases, start-first is the best choice because it optimizes
for high availability during updates.
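For reference, the update order can be set per service from the CLI, where
my-service is a placeholder name:

docker service update --update-order start-first my-service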
Swarm services use update-delay to control the speed at which a
service is updated. This adds a timed delay between application tasks as
they are updated. The delay controls the time from when the first task
of a service transitions to healthy state and the time that the second
task begins its update. The default is 0 seconds, which means that a
replica task begins updating as soon as the previous updated task
transitions into a healthy state.
Use update-delay if …
You are optimizing for the least number of dropped connections and a
longer update cycle as an acceptable tradeoff.
Interlock update convergence takes a long time in your environment
(which can occur when there is a large number of overlay networks).
Do not use update-delay if …
Service updates must occur rapidly.
Old and new tasks of your service cannot serve clients
simultaneously.
Swarm uses application health checks extensively to ensure that its
updates do not cause service interruption. health-cmd can be
configured in a Dockerfile or compose file to define a method for health
checking an application. Without health checks, Swarm cannot determine
when an application is truly ready to service traffic and will mark it
as healthy as soon as the container process is running. This can
potentially send traffic to an application before it is capable of
serving clients, leading to dropped connections.
Use stop-grace-period to configure the maximum time period delay prior to
force killing of the task (default: 10 seconds). In short, under the default
setting a task can continue to run for no more than 10 seconds once its
shutdown cycle has been initiated. This benefits applications that require long
periods to process requests, allowing connections to terminate normally.
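The following sketch applies these settings to an existing service with
docker service update; the service name, timing values, and health command are
examples, and the health command assumes curl is available in the image:

# Sketch: health check, paced start-first updates, and a longer shutdown grace period
docker service update \
  --health-cmd "curl -f http://localhost/ || exit 1" \
  --health-interval 10s \
  --update-order start-first \
  --update-delay 10s \
  --stop-grace-period 30s \
  my-service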
Interlock service clusters allow Interlock to be segmented into multiple
logical instances called “service clusters”, which have independently
managed proxies. Application traffic only uses the proxies for a
specific service cluster, allowing the full segmentation of traffic.
Each service cluster only connects to the networks using that specific
service cluster, which reduces the number of overlay networks to which
proxies connect. Because service clusters also deploy separate proxies,
this also reduces the amount of churn in LB configs when there are
service updates.
Interlock proxy containers connect to the overlay network of every Swarm
service. Having many networks connected to Interlock adds incremental
delay when Interlock updates its load balancer configuration. Each
network connected to Interlock generally adds 1-2 seconds of update
delay. With many networks, the Interlock update delay causes the LB
config to be out of date for too long, which can cause traffic to be
dropped.
Minimizing the number of overlay networks that Interlock connects to can
be accomplished in two ways:
Reduce the number of networks. If the architecture permits it,
applications can be grouped together to use the same networks.
Use Interlock service clusters. By segmenting Interlock, service
clusters also segment which networks are connected to Interlock,
reducing the number of networks to which each proxy is connected.
Use admin-defined networks and limit the number of networks per
service cluster.
VIP Mode can be used to reduce the impact of application updates on the
Interlock proxies. It utilizes the Swarm L4 load balancing VIPs instead
of individual task IPs to load balance traffic to a more stable internal
endpoint. This prevents the proxy LB configs from changing for most
kinds of app service updates, reducing churn for Interlock. The following
features are not supported in VIP mode:
This topic describes how to route traffic to Swarm services by deploying
a layer 7 routing solution into a Swarm-orchestrated cluster. It has the
following prerequisites:
Enabling layer 7 routing causes the following to occur:
MKE creates the ucp-interlock overlay network.
MKE deploys the ucp-interlock service and attaches it both to the
Docker socket and the overlay network that was created. This allows
the Interlock service to use the Docker API, which is why this service needs
to run on a manager node.
The ucp-interlock service starts the ucp-interlock-extension
service and attaches it to the ucp-interlock network, allowing both
services to communicate.
The ucp-interlock-extension generates a configuration for the proxy
service to use. By default the proxy service is NGINX, so this
service generates a standard NGINX configuration. MKE creates the
com.docker.ucp.interlock.conf-1 configuration file and uses it to
configure all the internal components of this service.
The ucp-interlock service takes the proxy configuration and uses
it to start the ucp-interlock-proxy service.
Note
Layer 7 routing is disabled by default.
To enable layer 7 routing using the MKE web UI:
Log in to the MKE web UI as an administrator.
Navigate to <user-name> > Admin Settings.
Click Ingress.
Toggle the Swarm HTTP ingress slider to the right.
Optional. By default, the routing mesh service listens on port 8080 for HTTP
and 8443 for HTTPS. Change these ports if you already have services using
them.
The three primary Interlock services include the core service, the extensions,
and the proxy. The following is the default MKE configuration, which is created
automatically when you enable Interlock as described in this topic.
The value of LargeClientHeaderBuffers indicates the number of buffers to
use to read a large client request header, as well as the size of those
buffers.
To enable layer 7 routing from the command line:
Interlock uses a TOML file for the core service configuration. The following
example uses Swarm deployment and recovery features by creating a Docker
config object.
The Interlock core service must have access to a Swarm manager
(--constraintnode.role==manager), however the extension and proxy
services are recommended to run on workers.
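The following is a rough sketch only; the TOML keys shown are a minimal subset,
and the image tag and command arguments are assumptions to check against your
MKE version:

# Sketch: store a minimal core configuration as a Docker config object
cat > config.toml <<'EOF'
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "3s"
EOF
docker config create service.interlock.conf config.toml

# Sketch: run the core service on a manager node with the config attached
docker service create \
  --name ucp-interlock \
  --constraint node.role==manager \
  --mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
  --config src=service.interlock.conf,target=/config.toml \
  mirantis/ucp-interlock:3.7.16 -D run -c /config.toml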
Verify that the three services are created, one for the Interlock service,
one for the extension service, and one for the proxy service:
This topic describes how to configure Interlock for a production environment
and builds upon the instructions in the previous topic,
Deploy a layer 7 routing solution. It does not describe infrastructure deployment,
and it assumes you are using a typical Swarm cluster created using
docker swarm init and docker swarm join from the nodes.
The layer 7 solution that ships with MKE is highly available, fault tolerant,
and designed to work independently of how many nodes you manage with MKE.
The following procedures require that you dedicate two worker nodes for running
the ucp-interlock-proxy service. This tuning ensures the following:
The proxy services have dedicated resources to handle user requests.
You can configure these nodes with higher performance network
interfaces.
No application traffic can be routed to a manager node, thus making
your deployment more secure.
If one of the two dedicated nodes fails, layer 7 routing continues
working.
To dedicate two nodes to running the proxy service:
Select two nodes that you will dedicate to running the proxy service.
Log in to one of the Swarm manager nodes.
Add labels to the two dedicated proxy service nodes, configuring them as
load balancer worker nodes, for example, lb-00 and lb-01:
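A minimal sketch of the labeling step, assuming the two dedicated nodes are
named lb-00 and lb-01:

# Tag the dedicated ingress nodes so the proxy service can be constrained to them
docker node update --label-add nodetype=loadbalancer lb-00
docker node update --label-add nodetype=loadbalancer lb-01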
Update the proxy service so that it runs two replicas constrained to the
workers with the label nodetype==loadbalancer, and configure the stop signal
for the tasks to be a SIGQUIT with a grace period of five seconds. This
ensures that NGINX does not exit before the client request is finished.
Inspect the service to verify that the replicas have started on the selected
nodes:
Optional. By default, the config service is global, scheduling one task on
every node in the cluster. To modify constraint scheduling, update the
ProxyConstraints variable in the Interlock configuration file. Refer
to Configure layer 7 routing service for more information.
Verify that the proxy service is running on the dedicated nodes:
docker service ps ucp-interlock-proxy
Update the settings in the upstream load balancer, such as ELB or F5, with
the addresses of the dedicated ingress workers, thus directing all traffic
to these two worker nodes.
To install Interlock on your cluster without an Internet connection, you must
have the required Docker images loaded on your computer. This topic describes
how to export the required images from a local instance of MCR and then load
them to your Swarm-orchestrated cluster.
To export Docker images from a local instance:
Using a local instance of MCR, save the required images:
interlock-extension-nginx.tar - the Interlock extension
for NGINX.
interlock-proxy-nginx.tar - the official NGINX image based
on Alpine.
Note
Replace
mirantis/ucp-interlock-extension:3.7.16
and mirantis/ucp-interlock-proxy:3.7.16
with the corresponding extension and proxy image if you are not using
NGINX.
Copy the three files you just saved to each node in the cluster and load
each image:
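The following sketch assumes the core Interlock image is
mirantis/ucp-interlock:3.7.16; the extension and proxy images are those named
in the note above:

# On a machine with Internet access: pull and save the three images
docker image pull mirantis/ucp-interlock:3.7.16
docker image pull mirantis/ucp-interlock-extension:3.7.16
docker image pull mirantis/ucp-interlock-proxy:3.7.16
docker image save mirantis/ucp-interlock:3.7.16 -o interlock.tar
docker image save mirantis/ucp-interlock-extension:3.7.16 -o interlock-extension-nginx.tar
docker image save mirantis/ucp-interlock-proxy:3.7.16 -o interlock-proxy-nginx.tar

# On each cluster node: load the copied archives
docker image load -i interlock.tar
docker image load -i interlock-extension-nginx.tar
docker image load -i interlock-proxy-nginx.tar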
This section describes how to customize layer 7 routing by updating the
ucp-interlock service with a new Docker configuration, including
configuration options and the procedure for creating a proxy service.
Optional. If you provide an invalid configuration, the ucp-interlock
service is configured to roll back to a previous stable configuration, by
default. Configure the service to pause instead of rolling back:
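A likely form of this command is the following sketch:

# Pause instead of rolling back when an invalid configuration is applied
docker service update --update-failure-action pause ucp-interlock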
The following options are available to configure the extensions. Interlock must
contain at least one extension to service traffic.
Option
Type
Description
Image
string
Name of the Docker image to use for the extension.
Args
[]string
Arguments to pass to the extension service.
Labels
map[string]string
Labels to add to the extension service.
Networks
[]string
Allows the administrator to cherry pick a list of networks
that Interlock can connect to. If this option is not specified, the
proxy service can connect to all networks.
ContainerLabels
map[string]string
Labels for the extension service tasks.
Constraints
[]string
One or more constraints to use when scheduling the extension service.
PlacementPreferences
[]string
One or more placement preferences.
ServiceName
string
Name of the extension service.
ProxyImage
string
Name of the Docker image to use for the proxy service.
ProxyArgs
[]string
Arguments to pass to the proxy service.
ProxyLabels
map[string]string
Labels to add to the proxy service.
ProxyContainerLabels
map[string]string
Labels to add to the proxy service tasks.
ProxyServiceName
string
Name of the proxy service.
ProxyConfigPath
string
Path in the service for the generated proxy configuration.
ProxyReplicas
unit
Number of proxy service replicas.
ProxyStopSignal
string
Stop signal for the proxy service. For example, SIGQUIT.
ProxyStopGracePeriod
string
Stop grace period for the proxy service in seconds. For example, 5s.
ProxyConstraints
[]string
One or more constraints to use when scheduling the proxy service. Set
the variable to false, as it is currently set to true by
default.
ProxyPlacementPreferences
[]string
One or more placement preferences to use when scheduling the proxy
service.
ProxyUpdateDelay
string
Delay between rolling proxy container updates.
ServiceCluster
string
Name of the cluster that this extension serves.
PublishMode
string (ingress or host)
Publish mode that the proxy service uses.
PublishedPort
int
Port on which the proxy service serves non-SSL traffic.
PublishedSSLPort
int
Port on which the proxy service serves SSL traffic.
Template
int
Docker configuration object that is used as the extension template.
Config
config
Proxy configuration used by the extensions as described in this section.
HitlessServiceUpdate
bool
When set to true, services can be updated without restarting the
proxy container.
ConfigImage
config
Name for the config service used by hitless service updates. For
example, mirantis/ucp-interlock-config:3.2.1.
ConfigServiceName
config
Name of the config service. This name is equivalent to
ProxyServiceName. For example, ucp-interlock-config.
Options are available to the extensions, and the extensions use the options
needed for proxy service configuration. This provides overrides to the
extension configuration.
Because Interlock passes the extension configuration directly to the
extension, each extension has different configuration options available.
The default proxy service used by MKE to provide layer 7 routing is
NGINX. If users try to access a route that has not been configured, they
will see the default NGINX 404 page.
You can customize this by labeling a service with
com.docker.lb.default_backend=true. If users try to access a route that is
not configured, they will be redirected to the custom service.
If you want to customize the default NGINX proxy service used by MKE to provide
layer 7 routing, follow the steps below to create an example proxy service
where users will be redirected if they try to access a route that is not
configured.
If users try to access a route that is not configured, they are directed
to this demo service.
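The following is a sketch of such a default backend service; the demo-net
network name, the 8080 port, and the image placeholder are illustrative
assumptions:

# Create (or reuse) an overlay network that the proxy can reach
docker network create -d overlay demo-net

# Any service labeled com.docker.lb.default_backend=true receives unmatched requests
docker service create \
  --name demo-default \
  --network demo-net \
  --label com.docker.lb.default_backend=true \
  --label com.docker.lb.port=8080 \
  <your-default-backend-image>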
Optional. To minimize forwarding interruption to the updating service while
updating a single replicated service, add the following line to the
labels section of the docker-compose.yml file:
Layer 7 routing components communicate with one another by default
using overlay networks, but Interlock also supports host mode networking
in a variety of ways, including proxy only, Interlock only, application only,
and hybrid.
When using host mode networking, you cannot use DNS service discovery,
since that functionality requires overlay networking. For services to
communicate, each service needs to know the IP address of the node where
the other service is running.
Note
Use an alternative to DNS service discovery such as Registrator if you
require this functionality.
The following is a high-level overview of how to use host mode instead of
overlay networking:
Update the ucp-interlock configuration.
Deploy your Swarm services.
Configure proxy services.
If you have not already done so, configure the layer 7 routing solution for
production with the ucp-interlock-proxy service replicas running
on their own dedicated nodes.
This section describes how to deploy an example Swarm service on an eight-node
cluster using host mode networking to route traffic without using overlay
networks. The cluster has three manager nodes and five worker nodes, with two
workers configured as dedicated ingress cluster load balancer nodes that will
receive all application traffic.
This example does not cover the actual infrastructure deployment, and assumes
you have a typical Swarm cluster created with docker swarm init and
docker swarm join run from the nodes.
By default, NGINX is used as a proxy. The following configuration options are
available for the NGINX extension.
Note
The ServerNamesHashBucketSize option, which allowed the user to manually
set the bucket size for the server names hash table, was removed in MKE
3.4.2 because MKE now adaptively calculates the setting and overrides any
manual input.
Option
Type
Description
Defaults
User
string
User name for the proxy
nginx
PidPath
string
Path to the PID file for the proxy service
/var/run/proxy.pid
MaxConnections
int
Maximum number of connections for the proxy service
1024
ConnectTimeout
int
Timeout in seconds for clients to connect
600
SendTimeout
int
Timeout in seconds for the service to send a request to the proxied
upstream
600
ReadTimeout
int
Timeout in seconds for the service to read a response from the proxied
upstream
600
SSLOpts
int
Options to be passed when configuring SSL
N/A
SSLDefaultDHParam
int
Size of DH parameters
1024
SSLDefaultDHParamPath
string
Path to DH parameters file
N/A
SSLVerify
string
SSL client verification
required
WorkerProcesses
string
Number of worker processes for the proxy service
1
RLimitNoFile
int
Maximum number of open files for the proxy service
65535
SSLCiphers
string
SSL ciphers to use for the proxy service
HIGH:!aNULL:!MD5
SSLProtocols
string
Enable the specified TLS protocols
TLSv1.2
HideInfoHeaders
bool
Hide proxy-related response headers
N/A
KeepaliveTimeout
string
Connection keep-alive timeout
75s
ClientMaxBodySize
string
Maximum allowed client request body size
1m
ClientBodyBufferSize
string
Buffer size for reading client request body
8k
ClientHeaderBufferSize
string
Buffer size for reading the client request header
1k
LargeClientHeaderBuffers
string
Maximum number and size of buffers used for reading large
client request header
4 8k
ClientBodyTimeout
string
Timeout for reading client request body
60s
UnderscoresInHeaders
bool
Enables or disables the use of underscores in client request header
fields
false
UpstreamZoneSize
int
Size of the shared memory zone (in KB)
64
GlobalOptions
[]string
List of options that are included in the global configuration
N/A
HTTPOptions
[]string
List of options that are included in the HTTP configuration
N/A
TCPOptions
[]string
List of options that are included in the stream (TCP) configuration
N/A
Change the action that Swarm takes when an update fails using
update-failure-action (the default is pause), for example, to
roll back to the previous configuration:
Change the amount of time between proxy updates using update-delay
(the default is to use rolling updates), for example, setting the delay to
thirty seconds:
This topic describes how to update Interlock services by first updating
the Interlock configuration to specify the new extension or proxy image
versions and then updating the Interlock services to use the new configuration
and image.
After Interlock is deployed, you can launch and publish services and
applications. This topic describes how to configure services to publish
themselves to the load balancer by using service labels.
Caution
The following procedures assume a DNS entry exists for each of the
applications (or local hosts entry for local testing).
To publish a demo service with four replicas to the host (demo.local):
Create a Docker Service using the following two labels:
com.docker.lb.hosts for Interlock to determine where the service is
available.
com.docker.lb.port for the proxy service to determine which port to
use to access the upstreams.
Create an overlay network so that service traffic is isolated and
secure:
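A sketch of the two steps above, assuming an illustrative demo-net network
name, an image placeholder that listens on port 8080, and the demo.local
hostname from this example:

docker network create -d overlay demo-net

docker service create \
  --name demo \
  --replicas 4 \
  --network demo-net \
  --label com.docker.lb.hosts=demo.local \
  --label com.docker.lb.port=8080 \
  <your-demo-image>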
com.docker.lb.hosts
Defines the hostname for the service. When the layer 7 routing solution
gets a request containing app.example.org in the host header, that
request is forwarded to the demo service.
com.docker.lb.network
Defines which network the ucp-interlock-proxy should attach to in
order to communicate with the demo service. To use layer 7 routing, you
must attach your services to at least one network. If your service is
attached to a single network, you do not need to add a label to specify
which network to use for routing. When using a common stack file for
multiple deployments leveraging MKE Interlock and layer 7 routing,
prefix com.docker.lb.network with the stack name to ensure traffic
is directed to the correct overlay network. In combination with
com.docker.lb.ssl_passthrough, the label is mandatory even if your
service is attached to only a single network.
com.docker.lb.port
Specifies which port the ucp-interlock-proxy service should use to
communicate with this demo service. Your service does not need to expose
a port in the Swarm routing mesh. All communications are done using the
network that you have specified.
The ucp-interlock service detects that your service is using these
labels and automatically reconfigures the ucp-interlock-proxy service.
Optional. Increase traffic to the new version by adding more replicas. For
example:
docker service scale demo-v2=4
Example output:
demo-v2
Complete the upgrade by scaling the demo-v1 service to zero replicas:
docker service scale demo-v1=0
Example output:
demo-v1
This routes all application traffic to the new version. If you need to roll
back your service, scale the v1 service back up and the v2 service back
down.
Interlock detects when the service is available and publishes it.
Note
Interlock only supports one path per host for each service cluster. When
a specific com.docker.lb.hosts label is applied, it cannot be
applied again in the same service cluster.
After the tasks are running and the proxy service is updated, the
application is available at http://demo.local:
Interlock uses backend task IPs to route traffic from the
proxy to each container. Traffic to the front-end route is layer 7 load
balanced directly to service tasks. This allows for routing
functionality such as sticky sessions for each container. Task routing
mode applies layer 7 routing and then sends packets directly to a
container.
Interlock uses the Swarm service VIP as the backend IP instead of
using container IPs. Traffic to the front-end route is layer 7 load
balanced to the Swarm service VIP, which Layer 4 load balances to
backend tasks. VIP mode is useful for reducing the amount of churn in
Interlock proxy service configurations, which can be an advantage in
highly dynamic environments.
VIP mode optimizes for fewer proxy updates with the tradeoff of a
reduced feature set. Most application updates do not require configuring
backends in VIP mode. In VIP routing mode, Interlock uses the service
VIP, which is a persistent endpoint that exists from service creation to
service deletion, as the proxy backend. VIP routing mode applies Layer
7 routing and then sends packets to the Swarm Layer 4 load balancer,
which routes traffic to service containers.
Canary deployments
In task mode, a canary service with one task next to an existing service
with four tasks represents one out of five total tasks, so the canary
will receive 20% of incoming requests.
Because VIP mode routes by service IP rather than by task IP, it affects
the behavior of canary deployments. In VIP mode, a canary service with
one task next to an existing service with four tasks will receive 50%
of incoming requests, as it represents one out of two total services.
You can set each service to use either the task or the VIP backend routing
mode. Task mode is the default and is used if a label is not specified or if it
is set to task.
Interlock detects when the service is available and publishes it. After
tasks are running and the proxy service is updated, the application is
available at any URL that is not configured.
In this example, Interlock configures a single upstream for the host using
IP 10.0.2.9. Interlock skips further proxy updates as long as
there is at least one replica for the service, as the only upstream is
the VIP.
Interlock uses service labels to configure how applications are
published, to define the host names that are routed to the service, to define
the applicable ports, and to define other routing configurations.
The following occurs when you deploy or update a Swarm service with service
labels:
The ucp-interlock service monitors the Docker API for events and
publishes the events to the ucp-interlock-extension service.
The ucp-interlock-extension service generates a new configuration for
the proxy service based on the labels you have added to your services.
The ucp-interlock service takes the new configuration and
reconfigures ucp-interlock-proxy to start using the new
configuration.
This process occurs in milliseconds and does not interrupt services.
The following table lists the service labels that Interlock uses:
Label
Description
Example
com.docker.lb.hosts
Comma-separated list of the hosts for the service to serve.
example.com,test.com
com.docker.lb.port
Port to use for internal upstream communication.
8080
com.docker.lb.network
Name of the network for the proxy service to attach to for upstream
connectivity.
app-network-a
com.docker.lb.context_root
Context or path to use for the application.
/app
com.docker.lb.context_root_rewrite
Changes the path from the value of label com.docker.lb.context_root
to / when set to true.
true
com.docker.lb.ssl_cert
Docker secret to use for the SSL certificate.
example.com.cert
com.docker.lb.ssl_key
Docker secret to use for the SSL key.
example.com.key
com.docker.lb.websocket_endpoints
Comma-separated list of endpoints to be upgraded for websockets.
/ws,/foo
com.docker.lb.service_cluster
Name of the service cluster to use for the application.
us-east
com.docker.lb.sticky_session_cookie
Cookie to use for sticky sessions.
app_session
com.docker.lb.redirects
Semicolon-separated list of redirects to add in the format of
<source>,<target>.
http://old.example.com,http://new.example.com
com.docker.lb.ssl_passthrough
Enables SSL passthrough when set to true.
false
com.docker.lb.backend_mode
Selects the backend mode that the proxy should use to access the
upstreams. The default is task.
Interlock detects when the service is available and publishes it.
After tasks are running and the proxy service is updated, the application is
available through http://new.local with a redirect configured that sends
http://old.local to http://new.local:
Reconfiguring the single proxy service that Interlock manages by default can
take one to two seconds for each overlay network that the proxy manages. You
can scale up to a larger number of Interlock-routed networks and services
by implementing a service cluster. Service clusters use Interlock to manage
multiple proxy services, each responsible for routing to a separate set of
services and their corresponding networks, thereby minimizing proxy
reconfiguration time.
The following instructions presume that certain prerequisites have been
met:
You have an operational MKE cluster with at least two worker nodes
(mke-node-0 and mke-node-1), to use as dedicated proxy
servers for two independent Interlock service clusters.
You have enabled Interlock with 80 as an HTTP port and 8443 as
an HTTPS port.
From a manager node, apply node labels to the MKE workers that you have
chosen to use as your proxy servers:
Change all instances of the MKE version and *.ucp.InstanceID in the
above to match your deployment.
Optional. Modify the configuration file that Interlock creates by default:
Replace [Extensions.default] with [Extensions.east].
Change ServiceName to "ucp-interlock-extension-east".
Change ConfigServiceName to "ucp-interlock-config-east".
Change ProxyServiceName to "ucp-interlock-proxy-east".
Add the "node.labels.region==east" constraint to the
ProxyConstraints list.
Add the ServiceCluster="east" key immediately below and inline
with ProxyServiceName.
Add the Networks=["eastnet"] key immediately below and inline
with ServiceCluster. This list can contain as many overlay
networks as you require. Interlock only connects to the specified
networks and connects to them all at startup.
Change PublishMode="ingress" to PublishMode="host".
Change the [Extensions.default.Labels] section title to
[Extensions.east.Labels].
Add the "ext_region"="east" key under the
[Extensions.east.Labels] section.
Change the [Extensions.default.ContainerLabels] section title to
[Extensions.east.ContainerLabels].
Change the [Extensions.default.ProxyLabels] section title to
[Extensions.east.ProxyLabels].
Add the "proxy_region"="east" key under the
[Extensions.east.ProxyLabels] section.
Change the [Extensions.default.ProxyContainerLabels] section title to
[Extensions.east.ProxyContainerLabels].
Change the [Extensions.default.Config] section title to
[Extensions.east.Config].
Optional. Change ProxyReplicas=2 to ProxyReplicas=1. This is only
necessary if there is a single node labeled as a proxy for each service
cluster.
Configure your west service cluster by duplicating the entire
[Extensions.east] block and changing all instances of east to
west.
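Applied to the default configuration, the edits above might yield an east
block along the lines of the following partial sketch. The key names come
from the configuration options described earlier in this section; the image
tags are placeholders, and any keys that Interlock generates by default but
that are not mentioned in the steps above should be left as generated:

[Extensions.east]
  Image = "mirantis/ucp-interlock-extension:<mke-version>"
  ServiceName = "ucp-interlock-extension-east"
  ConfigServiceName = "ucp-interlock-config-east"
  ProxyImage = "mirantis/ucp-interlock-proxy:<mke-version>"
  ProxyServiceName = "ucp-interlock-proxy-east"
  ServiceCluster = "east"
  Networks = ["eastnet"]
  ProxyReplicas = 1
  ProxyConstraints = ["node.labels.region==east"]   # keep the existing default constraints as well
  PublishMode = "host"

  [Extensions.east.Labels]
    "ext_region" = "east"

  [Extensions.east.ProxyLabels]
    "proxy_region" = "east"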
Create a new docker config object from the config.toml file:
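A hedged example of creating the config object; the object name is a
hypothetical placeholder, and the ucp-interlock service must subsequently be
updated to reference the new config:

docker config create interlock-service-clusters-config config.toml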
The following instructions presume that certain prerequisites have been met:
You have an operational MKE cluster with at least two worker nodes
(mke-node-0 and mke-node-1), to use as dedicated proxy
servers for two independent Interlock service clusters.
You have enabled Interlock with 80 as an HTTP port and 8443 as
an HTTPS port.
With your service clusters configured, you can now deploy services, routing
to them with your new proxy services using the service_cluster label.
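For example, an east service might be deployed as follows; the demoeast name,
demo.A hostname, and eastnet network come from this example, while the image
and port are illustrative placeholders:

# Create the overlay network if it does not already exist
docker network create -d overlay eastnet

docker service create \
  --name demoeast \
  --network eastnet \
  --label com.docker.lb.hosts=demo.A \
  --label com.docker.lb.port=8000 \
  --label com.docker.lb.service_cluster=east \
  <your-whoami-image>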
Ping your whoami service on the mke-node-0 proxy server:
curl -H "Host: demo.A" http://<mke-node-0 public IP>
The response contains the container ID of the whoami container
declared by the demoeast service.
The same curl command on mke-node-1 fails because that Interlock
proxy only routes traffic to services with the service_cluster=west
label, which are connected to the westnet Docker network that you listed
in the configuration for that service cluster.
Ping your whoami service on the mke-node-1 proxy server:
curl -H "Host: demo.B" http://<mke-node-1 public IP>
The service routed by Host:demo.B is only reachable through the
Interlock proxy mapped to port 80 on mke-node-1.
In removing a service cluster, Interlock removes all of the services that are
used internally to manage the service cluster, while leaving all of the user
services intact. For continued function, however, you may need to update,
modify, or remove the user services that remain. For instance:
Any remaining user service that depends on functionality provided by the
removed service cluster will need to be provisioned and managed by different
means.
All load balancing that is managed by the service cluster will no longer be
available following its removal, and thus must be reconfigured.
Following the removal of the service cluster, all ports that were previously
managed by the service cluster will once again be available. Also, any manually
created networks will remain in place.
Remove the subsection from [Extensions] that corresponds with the
service cluster that you want to remove, but leave the [Extensions]
section header itself in place. For example, remove the entire
[Extensions.east] subsection from the config.toml file
generated in Configure service clusters.
Create a new docker config object from the old_config.toml file:
This topic describes how to publish a service with a proxy that is configured
for persistent sessions using either cookies or IP hashing. Persistent sessions
are also known as sticky sessions.
Interlock detects when the service is available and publishes it.
After tasks are running and the proxy service is updated, the application is
configured to use persistent sessions and is available at
http://demo.local:
The curl command stores Set-Cookie from the application and
sends it with subsequent requests, which are pinned to the same instance. If
you make multiple requests, the same x-upstream-addr is present in each.
Using client IP hashing to configure persistent sessions is not as flexible or
consistent as using cookies but it enables workarounds for applications that
cannot use the other method. To use IP hashing, you must reconfigure Interlock
proxy to use host mode networking, because the default ingress networking
mode uses SNAT, which obscures client IP addresses.
Create an overlay network to isolate and secure service traffic:
Interlock detects when the service is available and publishes it.
After tasks are running and the proxy service is updated, the application is
configured to use persistent sessions and is available at
http://demo.local:
IP hashing for extensions creates a new upstream address when scaling
replicas because the proxy uses the new set of replicas to determine
where to pin the requests. When the upstreams are determined, a new
“sticky” backend is selected as the dedicated upstream.
This topic describes how to deploy a Swarm service wherein the proxy
manages the TLS connection. Using proxy-managed TLS entails that the traffic
between the proxy and the Swarm service is not secure, so you should only use
this option if you trust that no one can monitor traffic inside the services
that run in your datacenter.
To deploy a Swarm service with proxy-managed TLS:
Obtain a private key and certificate for the TLS connection. The Common Name
(CN) in the certificate must match the name where your service will be
available. Generate a self-signed certificate for app.example.org:
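For example, a self-signed certificate and key can be generated with openssl
along the following lines; the output file names are illustrative:

openssl req \
  -new \
  -newkey rsa:4096 \
  -days 365 \
  -nodes \
  -x509 \
  -subj "/CN=app.example.org" \
  -keyout app.example.org.key \
  -out app.example.org.cert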
The demo service has labels specifying that the proxy service routes
app.example.org traffic to this service. All traffic between the service
and proxy occurs using the demo-network network. The service has labels
that specify the Docker secrets used on the proxy service for terminating
the TLS connection.
The private key and certificate are stored as Docker secrets, and thus you
can readily scale the number of replicas used for running the proxy service,
with MKE distributing the secrets to the replicas.
Test that everything works correctly by updating your /etc/hosts file to
map app.example.org to the IP address of an MKE node.
Optional. In a production deployment, create a DNS entry so that users
can access the service using the domain name of your choice. After
creating the DNS entry, access your service at
https://<hostname>:<https-port>.
hostname is the name you specified with the com.docker.lb.hosts
label.
https-port is the port you configured in the MKE settings.
Because this example uses self-signed certificates, client tools such as
browsers display a warning that the connection is insecure.
Optional. Test that everything works using the CLI:
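A sketch of such a test, assuming a curl build with SNI support; substitute
the MKE node IP address and the HTTPS port configured in your deployment:

curl --insecure \
  --resolve app.example.org:<https-port>:<mke-node-ip> \
  https://app.example.org:<https-port>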
The proxy uses SNI to determine where to route traffic, and thus you must
verify that you are using a version of curl that includes the SNI
header with insecure requests. Otherwise, curl displays the
following error:
Server aborted the SSL handshake
Note
There is no way to update expired certificates using the proxy-managed TLS
method. You must create a new secret and then update the corresponding
service.
This topic describes how to deploy a Swarm service wherein the service
manages the TLS connection by encrypting traffic from users to your Swarm
service.
Deploy your Swarm service using the following example docker-compose.yml
file:
This updates the service to start using the secrets with the private key and
certificate and it labels the service with com.docker.lb.ssl_passthrough:true, thus configuring the proxy service such that TLS traffic for
app.example.org is passed to the service.
Since the connection is fully encrypted from end-to-end, the proxy service
cannot add metadata such as version information or the request ID to the
response headers.
Mutual Transport Layer Security (mTLS) is a process of mutual authentication in
which both parties verify the identity of the other party, using a signed
certificate.
You must have the following items to deploy services with mTLS:
One or more CA certificates for signing the server and client certificates
and keys.
A signed certificate and key for the server
A signed certificate and key for the client
To deploy a backend service with proxy-managed mTLS enabled:
Create a secret for the CA certificate that the client uses to authenticate
the server.
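For example, assuming the CA certificate is stored locally as ca.cert (an
illustrative file name):

docker secret create ca.cert ca.cert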
Interlock detects when the service is available and publishes it.
Note
You must have an entry for demo.local in your /etc/hosts file or
use a routable domain.
Once tasks are running and the proxy service is updated, the application
will be available at http://demo.local. Navigate to this URL in two
different browser windows and notice that the text you enter in one window
displays automatically in the other.
Running Kubernetes on Windows Server nodes is only supported on MKE 3.3.0
and later. If you want to run Kubernetes on Windows Server nodes on a
cluster that is currently running an earlier version of MKE than 3.3.0, you
must perform a fresh install of MKE 3.3.0 or later.
The following procedure deploys a complete web application on IIS servers as
Kubernetes Services. The example workload includes an MSSQL database and a
load balancer. The procedure includes the following tasks:
The NGINX server is operational, but it is not accessible from outside
of the cluster. Create a YAML file to add a NodePort service, which exposes
the server on a specified port.
In the left-side navigation menu, navigate to Kubernetes and
click Create.
In the Namespace drop-down, select default.
Paste the following configuration details in the Object YAML
editor:
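The following is a sketch of a NodePort service matching the description
below; the service name, label selector, and ports reflect this example, and
the nodePort value must fall within the node port range configured for your
cluster:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32768
  selector:
    app: nginx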
The service connects internal port 80 of the cluster to the external
port 32768.
Click Create, and the Services page opens.
Select the nginx service and in the Overview tab,
scroll to the Ports section.
To review the default NGINX page, navigate to <node-ip>:<nodeport>
in your browser.
Note
To display the NGINX page, you may need to add a rule in your cloud
provider firewall settings to allow inbound traffic on the port specified
in the YAML file.
The YAML definition connects the service to the NGINX server using
the app label nginx and a corresponding label selector.
MKE supports updating an existing deployment by applying an updated YAML file.
In this example, you will scale the server up to four replicas and update NGINX
to a later version.
In the left-side navigation panel, navigate to
Kubernetes > Controllers and select
nginx-deployment.
To edit the deployment, click the gear icon in the upper right corner.
Update the number of replicas from 2 to 4.
Update the value of image from nginx:1.7.9 to
nginx:1.8.
Click Save to update the deployment with the new configuration
settings.
To review the newly-created replicas, in the left-side navigation panel,
navigate to Kubernetes > Pods.
The content of the updated YAML file is as follows:
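A sketch of what the updated file might contain, assuming the deployment was
originally created with the app: nginx label and a container listening on
port 80:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 4
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.8
          ports:
            - containerPort: 80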
Mirantis currently supports the use of OPA Gatekeeper for purposes of policy
enforcement.
Open Policy Agent (OPA) is an open source policy engine that facilitates
policy-based control for cloud native environments. OPA introduces a high-level
declarative language called Rego that decouples policy decisions from
enforcement.
The OPA Constraint Framework introduces two primary resources: constraint
templates and constraints.
Constraint templates
OPA policy definitions, written in Rego
Constraints
The application of a constraint template to a given set of objects
Gatekeeper uses the Kubernetes API to integrate OPA into Kubernetes. Policies
are defined in the form of Kubernetes CustomResourceDefinitions (CRDs) and are
enforced with custom admission controller webhooks. These CRDs define
constraint templates and constraints on the API server. Any time a request to
create, delete, or update a resource is sent to the Kubernetes cluster API
server, Gatekeeper validates that resource against the predefined policies.
Gatekeeper also audits preexisting resource constraint violations against newly
defined policies.
Using OPA Gatekeeper, you can enforce a wide range of policies against your
Kubernetes cluster. Policy examples include:
Container images can only be pulled from a set of whitelisted repositories.
New resources must be appropriately labeled.
Deployments must specify a minimum number of replicas.
Note
By design, when the OPA Gatekeeper is disabled using the configuration file,
the policies are not cleaned up. Thus, when the OPA Gatekeeper is
re-enabled, the cluster can immediately adopt the existing policies.
The retention of the policies poses no risk, as they are merely data on the
API server and have no value outside of an OPA Gatekeeper deployment.
The following topics offer installation instructions and an example use case.
Set the cluster_config.policy_enforcement.gatekeeper.enabled
configuration parameter to "true". For more information on Gatekeeper
configuration options, refer to
cluster_config.policy_enforcement.gatekeeper.
Optional. Exclude resources that are contained in a specified set of
namespaces by assigning a comma-separated list of namespaces to the
cluster_config.policy_enforcement.gatekeeper.excluded_namespaces
configuration parameter.
Caution
Avoid adding namespaces to the excluded_namespaces list that do not
yet exist in the cluster.
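A hedged sketch of how these settings might appear in the MKE configuration
file; the table layout and the namespace values are illustrative assumptions:

[cluster_config.policy_enforcement.gatekeeper]
  enabled = "true"
  excluded_namespaces = "monitoring,sandbox"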
To guide you in the creation of OPA Gatekeeper policies, as an example this
topic illustrates how to generate a policy for restricting escalation to root
privileges.
Note
Gatekeeper provides a library of commonly
used policies, including replacements for familiar PodSecurityPolicies.
Important
For users who are new to Gatekeeper, Mirantis recommends performing a dry
run on potential policies prior to production deployment. Such an approach,
by only auditing violations, will prevent potential cluster disruption. To
perform a dry run, set spec.enforcementAction to dryrun in the
constraint.yaml detailed herein.
Create a YAML file called template.yaml and place the following code in
that file:
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8spspallowprivilegeescalationcontainer
  annotations:
    description: >-
      Controls restricting escalation to root privileges. Corresponds to the
      `allowPrivilegeEscalation` field in a PodSecurityPolicy. For more
      information, see
      https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation
spec:
  crd:
    spec:
      names:
        kind: K8sPSPAllowPrivilegeEscalationContainer
      validation:
        openAPIV3Schema:
          type: object
          description: >-
            Controls restricting escalation to root privileges. Corresponds to the
            `allowPrivilegeEscalation` field in a PodSecurityPolicy. For more
            information, see
            https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation
          properties:
            exemptImages:
              description: >-
                Any container that uses an image that matches an entry in this list will be excluded
                from enforcement. Prefix-matching can be signified with `*`. For example: `my-image-*`.
                It is recommended that users use the fully-qualified Docker image name (e.g. start with a domain name)
                in order to avoid unexpectedly exempting images from an untrusted repository.
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spspallowprivilegeescalationcontainer

        import data.lib.exempt_container.is_exempt

        violation[{"msg": msg, "details": {}}] {
            c := input_containers[_]
            not is_exempt(c)
            input_allow_privilege_escalation(c)
            msg := sprintf("Privilege escalation container is not allowed: %v", [c.name])
        }

        input_allow_privilege_escalation(c) {
            not has_field(c, "securityContext")
        }
        input_allow_privilege_escalation(c) {
            not c.securityContext.allowPrivilegeEscalation == false
        }
        input_containers[c] {
            c := input.review.object.spec.containers[_]
        }
        input_containers[c] {
            c := input.review.object.spec.initContainers[_]
        }
        input_containers[c] {
            c := input.review.object.spec.ephemeralContainers[_]
        }
        # has_field returns whether an object has a field
        has_field(object, field) = true {
            object[field]
        }
      libs:
        - |
          package lib.exempt_container

          is_exempt(container) {
              exempt_images := object.get(object.get(input, "parameters", {}), "exemptImages", [])
              img := container.image
              exemption := exempt_images[_]
              _matches_exemption(img, exemption)
          }

          _matches_exemption(img, exemption) {
              not endswith(exemption, "*")
              exemption == img
          }

          _matches_exemption(img, exemption) {
              endswith(exemption, "*")
              prefix := trim_suffix(exemption, "*")
              startswith(img, prefix)
          }
MKE supports using a selective grant to allow a set of user and service
accounts to use privileged attributes on Kubernetes Pods. This enables
administrators to create scenarios that would ordinarily require administrators
or cluster-admins to execute. Such selective grants can be used to temporarily
bypass restrictions on non-administrator accounts, as the changes can be
reverted at any time.
The privileged attributes associated with user and service accounts are
specified separately. It is only possible to specify one list of privileged
attributes for user accounts and one list for service accounts.
The user accounts specified for access must be non-administrator users and the
service accounts specified for access must not be bound to the
cluster-admin role.
The following privileged attributes can be assigned using a selective grant:
Attribute
Description
hostIPC
Allows the Pod containers to share the host IPC namespace
hostNetwork
Allows the Pod to use the network namespace and network resources of the
host node
hostPID
Allows the Pod containers to share the host process ID namespace
hostBindMounts
Allows the Pod containers to use directories and volumes mounted on the
container host
privileged
Allows one or more Pod containers to run privileged, escalate
privileges, or both
kernelCapabilities
Allows you to specify the addition of kernel capabilities on one or more
of the kernel capabilities
The following Pod manifest demonstrates the use of several of the privileged
attributes in a Pod:
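The manifest below is an illustrative sketch rather than a tested workload;
the Pod name, image, and mounted host path are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: privileged-attributes-demo
spec:
  hostNetwork: true      # requires the hostNetwork privileged attribute
  hostPID: true          # requires the hostPID privileged attribute
  hostIPC: true          # requires the hostIPC privileged attribute
  containers:
    - name: demo
      image: busybox
      command: ["sleep", "3600"]
      securityContext:
        privileged: true             # requires the privileged attribute
        capabilities:
          add: ["NET_ADMIN"]         # requires the kernelCapabilities attribute
      volumeMounts:
        - name: host-dir
          mountPath: /host
  volumes:
    - name: host-dir
      hostPath:                      # requires the hostBindMounts attribute
        path: /var/log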
In the [cluster_config] section on the MKE configuration file, specify
the required privileged attributes for user accounts using the
priv_attributes_allowed_for_user_accounts parameter.
Specify the associated user accounts with the
priv_attributes_user_accounts parameter.
Specify the required privileged attributes for service accounts using the
priv_attributes_allowed_for_service_accounts parameter.
Specify the associated service accounts with the
priv_attributes_service_accounts parameter.
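A hedged sketch of how these parameters might be combined in the
configuration file; the attribute lists, the account names, and the
namespace:name form used to reference the service account are illustrative
assumptions:

[cluster_config]
  priv_attributes_allowed_for_user_accounts = ["hostNetwork", "privileged"]
  priv_attributes_user_accounts = ["devuser"]
  priv_attributes_allowed_for_service_accounts = ["hostBindMounts"]
  priv_attributes_service_accounts = ["monitoring:prometheus"]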
Kubernetes uses service accounts to enable workload access control.
A service account is an identity for processes that run in a Pod. When
a process is authenticated through a service account, it can contact the API
server and access cluster resources. The default service account is
default.
You provide a service account with access to cluster resources by creating a
role binding, just as you do for users and teams.
This example illustrates how to create a service account and role binding used
with an NGINX server.
To create a Kubernetes namespace:
It is necessary to create a namespace for use with your service account, as
unlike user accounts, service accounts are scoped to a particular namespace.
Log in to the MKE web UI.
In the left-side navigation panel, navigate to
Kubernetes > Namespaces and click Create.
Leave the Namespace drop-down blank.
Paste the following in the Object YAML editor:
apiVersion: v1
kind: Namespace
metadata:
  name: nginx
Click Create.
Navigate to the nginx namespace.
Click the vertical ellipsis in the upper-right corner and click
Set Context.
To create a service account:
In the left-side navigation panel, navigate to
Kubernetes > Service Accounts and click Create.
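In the Object YAML editor, a service account definition along the following
lines can be used; only the name and namespace matter for this example:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-service-account
  namespace: nginx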
There are now two service accounts associated with the nginx
namespace: default and nginx-service-account.
To create a role binding:
To give the service account access to cluster resources, create a role binding
with view permissions.
From the left-side navigation panel, navigate to
Access Control > Grants.
Note
If Hide Swarm Navigation is selected on the
<username> > Admin Settings > Tuning page, Grants
will display as Role Bindings under the
Access Control menu item.
In the Grants pane, select the Kubernetes tab and
click Create Role Binding.
In the Subject pane, under
SELECT SUBJECT TYPE, select Service Account.
In the Namespace drop-down, select nginx.
In the Service Account drop-down, select
nginx-service-account and then click Next.
In the Resource Set pane, select the nginx
namespace.
In the Role pane, under ROLE TYPE, select
Cluster Role and then select view.
Click Create.
The NGINX service account can now access all cluster resources in the nginx
namespace.
Calico affords MKE secure networking functionality for
container-to-container communication within Kubernetes. MKE manages the
Calico lifecycle, packaging it at both the time of installation and upgrade,
and fully supports its use with MKE.
MKE also supports the use of alternative, unmanaged CNI plugins available on
Docker Hub. Mirantis can provide limited instruction on basic configuration,
but for detailed guidance on third-party CNI components, you must refer to the
external product documentation or support.
Consider the following limitations before implementing an unmanaged CNI plugin:
MKE only supports implementation of an unmanaged CNI plugin at install time.
MKE does not manage the version or configuration of alternative CNI plugins.
MKE does not upgrade or reconfigure alternative CNI plugins. To switch from
the managed CNI to an unmanaged CNI plugin, or vice versa, you must uninstall
and then reinstall MKE.
MKE components that require Kubernetes networking will
remain in the ContainerCreating state in Kubernetes until a CNI is
installed. Once the installation is complete, you can access MKE from a web
browser. Note that the manager node will be unhealthy as the kubelet
will report NetworkPluginNotReady. Additionally, the metrics in the
MKE dashboard will also be unavailable, as this runs in a Kubernetes
pod.
Install the unmanaged CNI plugin. Follow the CNI plugin documentation for
specific installation instructions. The unmanaged CNI plugin install steps
typically include:
Download the relevant upstream CNI binaries.
Place the CNI binaries in /opt/cni/bin.
Download the relevant CNI plugin Kubernetes Manifest YAML file.
Run kubectl apply -f <your-custom-cni-plugin>.yaml.
Caution
You must install the unmanaged CNI immediately after installing MKE and
before joining any manager or worker nodes to the cluster.
Note
While troubleshooting a custom CNI plugin, you may want to access
logs within the kubelet. Connect to an MKE manager node and run
docker logs ucp-kubelet
When MKE is installed with --unmanaged-cni, the ucp-kube-proxy-win
container on Windows nodes will not fully start, but will instead log the
following suggestion in a loop:
If using a VXLAN-based CNI, define the following variables:
CNINetworkName must match the name of the Windows Kubernetes HNS
network, which you can find either in the installation documentation for
the third party CNI or by using hnsdiag list networks.
CNISourceVip must use the value of the source VIP for this node, which
should be available in the installation documentation for the third party
CNI. Because the source VIP will be different for each node and can change
across host reboots, Mirantis recommends setting this variable using a
utility script.
The following is an example of how to define these variables using
PowerShell:
MKE provides data-plane level IPSec network encryption to securely encrypt
application traffic in a Kubernetes cluster. This secures application traffic
within a cluster when running in untrusted infrastructure or environments. It
is an optional feature of MKE that is enabled by deploying the SecureOverlay
components on Kubernetes when using the default Calico driver for networking
with the default IPIP tunneling configuration.
Kubernetes network encryption is enabled by two components in MKE:
SecureOverlay Agent
SecureOverlay Master
The SecureOverlay Agent is deployed as a per-node service that manages the
encryption state of the data plane. The Agent controls the IPSec encryption on
Calico IPIP tunnel traffic between different nodes in the Kubernetes cluster.
The Master is deployed on an MKE manager node and acts as the key management
process that configures and periodically rotates the encryption keys.
Kubernetes network encryption uses AES Galois Counter Mode (AES-GCM)
with 128-bit keys by default.
You must deploy the SecureOverlay Agent and Master on MKE to enable encryption,
as it is not enabled by default. You can enable or disable encryption at any
time during the cluster lifecycle. However, be aware that enabling or disabling
encryption can cause temporary traffic outages between Pods, lasting up to a
few minutes. When enabled, Kubernetes Pod traffic between hosts is encrypted at
the IPIP tunnel interface in the MKE host.
Kubernetes network encryption is supported on the following platforms:
Maximum transmission units (MTUs) are the largest packet length that a
container will allow. Before deploying the SecureOverlay components, verify
that Calico is configured so that the IPIP tunnel MTU leaves sufficient room
for the encryption overhead. Encryption adds 26 bytes of overhead, but every
IPSec packet size must be a multiple of 4 bytes. IPIP tunnels require 20 bytes
of encapsulation overhead. The IPIP tunnel interface MTU must be no more than
EXTMTU - 46 - ((EXTMTU - 46) modulo 4), where EXTMTU is the minimum MTU
of the external interfaces. An IPIP MTU of 1452 should generally be safe for
most deployments.
In the MKE configuration file, update the ipip_mtu parameter with the new
MTU:
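For example, using the IPIP MTU suggested above:

[cluster_config]
  ipip_mtu = 1452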
Once the cluster node MTUs are properly configured, deploy the SecureOverlay
components to MKE using either the MKE configuration file or the SecureOverlay
YAML file.
To configure SecureOverlay using the MKE configuration file:
Set the value of secure_overlay in the cluster_config table of the MKE
configuration file to true.
To configure SecureOverlay using the SecureOverlay YAML file:
Run the following procedure at the time of cluster installation, prior to
starting any workloads.
Copy the contents of the SecureOverlay YAML file into a YAML file called
ucp-secureoverlay.yaml.
SecureOverlay YAML
# Cluster role for key management jobs
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: ucp-secureoverlay-mgr
rules:
  - apiGroups: [""]
    resources:
      - secrets
    verbs:
      - get
      - update
---
# Cluster role binding for key management jobs
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ucp-secureoverlay-mgr
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ucp-secureoverlay-mgr
subjects:
  - kind: ServiceAccount
    name: ucp-secureoverlay-mgr
    namespace: kube-system
---
# Service account for key management jobs
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ucp-secureoverlay-mgr
  namespace: kube-system
---
# Cluster role for secure overlay per-node agent
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: ucp-secureoverlay-agent
rules:
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
---
# Cluster role binding for secure overlay per-node agent
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ucp-secureoverlay-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ucp-secureoverlay-agent
subjects:
  - kind: ServiceAccount
    name: ucp-secureoverlay-agent
    namespace: kube-system
---
# Service account secure overlay per-node agent
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ucp-secureoverlay-agent
  namespace: kube-system
---
# K8s secret of current key configuration
apiVersion: v1
kind: Secret
metadata:
  name: ucp-secureoverlay
  namespace: kube-system
type: Opaque
data:
  keys: ""
---
# DaemonSet for secure overlay per-node agent
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ucp-secureoverlay-agent
  namespace: kube-system
  labels:
    k8s-app: ucp-secureoverlay-agent
spec:
  selector:
    matchLabels:
      k8s-app: ucp-secureoverlay-agent
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: ucp-secureoverlay-agent
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      priorityClassName: system-node-critical
      terminationGracePeriodSeconds: 10
      serviceAccountName: ucp-secureoverlay-agent
      containers:
        - name: ucp-secureoverlay-agent
          image: docker/ucp-secureoverlay-agent:3.1.0
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: ucp-secureoverlay
              mountPath: /etc/secureoverlay/
              readOnly: true
      volumes:
        - name: ucp-secureoverlay
          secret:
            secretName: ucp-secureoverlay
---
# Deployment for manager of the whole cluster (primarily to rotate keys)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ucp-secureoverlay-mgr
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ucp-secureoverlay-mgr
  replicas: 1
  template:
    metadata:
      name: ucp-secureoverlay-mgr
      namespace: kube-system
      labels:
        app: ucp-secureoverlay-mgr
    spec:
      serviceAccountName: ucp-secureoverlay-mgr
      restartPolicy: Always
      containers:
        - name: ucp-secureoverlay-mgr
          image: docker/ucp-secureoverlay-mgr:3.1.0
You can provide persistent storage for MKE workloads by using NFS storage.
When mounted into the running container, NFS shares provide state to the
application, managing data external to the container lifecycle.
Note
The following subjects are out of the scope of this topic:
Provisioning an NFS server
Exporting an NFS share
Using external Kubernetes plugins to dynamically provision NFS shares
There are two different ways to mount existing NFS shares within Kubernetes
Pods:
Define NFS shares within the Pod definitions. NFS shares are defined
manually by each tenant when creating a workload.
Define NFS shares as a cluster object through PersistentVolumes, with the
cluster object lifecycle handled separately from the workload. This is
common for operators who want to define a range of NFS shares for tenants to
request and consume.
While defining workloads in Kubernetes manifest files, users can reference the
NFS shares that they want to mount within the Pod specification for each Pod.
This can be a standalone Pod or it can be wrapped in a higher-level object
like a Deployment, DaemonSet, or StatefulSet.
The following example includes a running MKE cluster and a downloaded
client bundle with permission to schedule Pods in a namespace.
Create nfs-in-a-pod.yaml with the following content:
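The following is a sketch of such a Pod; the NFS server name, export path,
and mount point are illustrative placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: nfs-in-a-pod
spec:
  containers:
    - name: app
      image: alpine
      command: ["sleep", "3600"]
      volumeMounts:
        - name: nfs-volume
          mountPath: /var/nfs            # mount point inside the container
  volumes:
    - name: nfs-volume
      nfs:
        server: nfs.example.com          # hypothetical NFS server
        path: /share1                    # hypothetical exported path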
Verify everything was mounted correctly by accessing a shell prompt within
the container and searching for your mount:
Access a shell prompt within the container:
kubectl exec -it nfs-in-a-pod sh
Verify that everything is correctly mounted by searching for your mount:
mount | grep nfs.example.com
Note
MKE and Kubernetes are unaware of the NFS share because it is defined as
part of the Pod specification. As such, when you delete the Pod, the NFS
share detaches from the cluster, though the data remains in the NFS share.
This method uses the Kubernetes PersistentVolume (PV) and PersistentVolumeClaim
(PVC) objects to manage NFS share lifecycle and access.
You can define multiple shares for a tenant to use within the cluster. The PV
is a cluster-wide object, so it can be pre-provisioned. A PVC is a claim by a
tenant for using a PV within the tenant namespace.
To create PV objects at the cluster level, you will need a
ClusterRoleBinding grant.
Note
The “NFS share lifecycle” refers to granting and removing the end user
ability to consume NFS storage, rather than the lifecycle of the NFS
server.
To define the PersistentVolume at the cluster level:
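A sketch of such a PersistentVolume, using the 5Gi size, access mode, and
reclaim policy discussed below; the volume name, NFS server, and export path
are illustrative placeholders:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mynfs
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.com
    path: /share1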
The 5Gi storage size is used to match the volume to the tenant
claim.
The valid accessModes values for an NFS PV are:
ReadOnlyMany: the volume can be mounted as read-only by many nodes.
ReadWriteOnce: the volume can be mounted as read-write by a single
node.
ReadWriteMany: the volume can be mounted as read-write by many
nodes.
The access mode in the PV definition is used to match a PV to a Claim.
When a PV is defined and created inside of Kubernetes, a volume is not
mounted. Refer to Access Modes
for more information, including any changes to the valid accessModes.
The valid persistentVolumeReclaimPolicy values are:
Retain
Recycle
Delete
MKE uses the reclaim policy to define what the cluster does after a PV is
released from a claim. Refer to Reclaiming
in the official Kubernetes documentation for more information, including
any changes to the valid persistentVolumeReclaimPolicy values.
A tenant can now “claim” a PV for use within their workloads by using a
Kubernetes PVC. A PVC exists within a namespace and it attempts to match
available PVs to the tenant request.
Create myapp-claim.yaml with the following content:
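A sketch of such a claim, matching the 5Gi size and access mode of the
PersistentVolume above; the claim name follows the file name:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi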
To deploy this PVC, the tenant must have a RoleBinding that permits the
creation of PVCs. If there is a PV that meets the tenant criteria,
Kubernetes binds the PV to the claim. This does not, however, mount the
share.
The final task is to deploy a workload to consume the PVC. The PVC is defined
within the Pod specification, which can be a standalone Pod or wrapped in a
higher-level object such as a Deployment, DaemonSet, or StatefulSet.
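For example, a standalone Pod referencing the claim created above might look
as follows; the Pod name, image, and mount path are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
    - name: app
      image: alpine
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: myapp-claim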
You can provide persistent storage for MKE workloads on Microsoft Azure by
using Azure Disk Storage. You can either pre-provision Azure Disk Storage to be
consumed by Kubernetes Pods, or you can use the Azure Kubernetes integration to
dynamically provision Azure Disks as needed.
This guide assumes that you have already provisioned an MKE environment on
Microsoft Azure and that you have provisioned a cluster after meeting all of
the prerequisites listed in Install MKE on Azure.
You can use existing Azure Disks or manually provision new ones
to provide persistent storage for Kubernetes Pods. You can manually provision
Azure Disks in the Azure Portal, using ARM Templates, or using the Azure CLI.
The following example uses the Azure CLI to manually
provision an Azure Disk.
Create an environment variable for myresourcegroup:
Make note of the Azure ID of the Azure Disk Object returned by the previous
step.
You can now create Kubernetes Objects that refer to this Azure Disk. The
following example uses a Kubernetes Pod, though the same Azure Disk
syntax can be used for DaemonSets, Deployments, and StatefulSets. In the
example, the Azure diskName and diskURI refer to the manually created
Azure Disk:
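The following sketch shows the general shape of such a Pod; the Pod name
matches the troubleshooting example later in this section, while the image,
mount path, disk name, and disk URI are placeholders to be replaced with the
values returned when you created the disk:

apiVersion: v1
kind: Pod
metadata:
  name: mypod-azure-disk
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: azure-disk
          mountPath: /data
  volumes:
    - name: azure-disk
      azureDisk:
        kind: Managed
        diskName: <disk-name>
        diskURI: <azure-disk-id>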
Kubernetes can dynamically provision Azure Disks using the Azure Kubernetes
integration, configured at the time of your MKE installation. For
Kubernetes to determine which APIs to use when provisioning storage, you
must create Kubernetes StorageClass objects specific to each storage backend.
There are two different Azure Disk types that can be consumed by
Kubernetes: Azure Disk Standard Volumes and Azure Disk Premium Volumes.
Depending on your use case, you can deploy one or both of the Azure Disk
storage classes.
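A sketch of the two storage classes, using the in-tree Azure Disk
provisioner; the class names are illustrative:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Managed
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: premium
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed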
To create an Azure Disk with a PersistentVolumeClaim:
After you create a storage class, you can use Kubernetes Objects to
dynamically provision Azure Disks. This is done using Kubernetes
PersistentVolumesClaims.
The following example uses the standard storage class and creates a 5
GiB Azure Disk. Alter these values to fit your use case.
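For example, a claim against the standard storage class sketched above might
be written as follows:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-disk-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 5Gi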
Verify the creation of a new Azure Disk in the Azure Portal.
To attach the new Azure Disk to a Kubernetes Pod:
You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The
disk can be consumed by any Kubernetes object type, including a Deployment,
DaemonSet, or StatefulSet. However, the following example simply mounts the
PersistentVolume into a standalone Pod.
Azure limits the number of data disks that can be attached to each Virtual
Machine. Refer to Azure Virtual Machine Sizes
for this information. Kubernetes prevents Pods from deploying on Nodes that
have reached their maximum Azure Disk Capacity. In such cases, Pods remain
stuck in the ContainerCreating status, as demonstrated in the following
example:
Describe the Pod to display troubleshooting logs, which indicate the node
has reached its capacity:
kubectl describe pods mypod-azure-disk
Example output:
Warning  FailedAttachVolume  7s (x11 over 6m)  attachdetach-controller \
AttachVolume.Attach failed for volume "pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b": \
Attach volume "kubernetes-dynamic-pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" to instance \
"/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachines/worker-03" \
failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: \
StatusCode=409 -- Original Error: failed request: autorest/azure: \
Service returned an error. Status=<nil> Code="OperationNotAllowed" \
Message="The maximum number of data disks allowed to be attached to a VM of this size is 4." \
Target="dataDisks"
You can provide persistent storage for MKE workloads on Microsoft Azure by
using Azure Files. You can either pre-provision Azure Files shares to be
consumed by Kubernetes Pods, or you can use the Azure Kubernetes integration to
dynamically provision Azure Files shares as needed.
This guide assumes that you have already provisioned an MKE environment on
Microsoft Azure and that you have provisioned a cluster after meeting all of
the prerequisites listed in Install MKE on Azure.
You can use existing Azure Files shares or manually provision new ones
to provide persistent storage for Kubernetes Pods. You can manually provision
Azure Files shares in the Azure Portal, using ARM Templates, or
using the Azure CLI. The following example uses the Azure CLI to
manually provision an Azure Files share.
To manually provision an Azure Files share:
Note
The Azure Kubernetes driver does not support Azure Storage accounts created
using Azure Premium Storage.
Create an Azure Storage account:
Create the following environment variables, replacing <region> with
the required region:
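A hedged sketch of these steps using the Azure CLI; the resource group,
storage account, and share names are illustrative placeholders:

REGION=<region>
RG=myresourcegroup
SA=mystorageaccount

az storage account create --name $SA --resource-group $RG --location $REGION --sku Standard_LRS
az storage share create --name myfileshare --account-name $SA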
After creating an Azure Files share, you must load the Azure Storage
account access key into MKE as a Kubernetes Secret. This provides access
to the file share when Kubernetes attempts to mount the share into a
Pod. You can find this Secret either in the Azure Portal or by using the Azure
CLI, as in the following example.
Create the following environment variables, if you have not done so already:
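The following sketch retrieves the first account key with the Azure CLI and
stores it as a Kubernetes Secret; the secret name and key names are those
conventionally expected by the Azure Files volume plugin, so verify them
against your environment:

RG=myresourcegroup
SA=mystorageaccount

STORAGE_KEY=$(az storage account keys list --resource-group $RG --account-name $SA --query "[0].value" --output tsv)

kubectl create secret generic azure-secret \
  --from-literal=azurestorageaccountname=$SA \
  --from-literal=azurestorageaccountkey=$STORAGE_KEY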
Kubernetes can dynamically provision Azure Files shares using the Azure
Kubernetes integration, configured at the time of your MKE installation. For
Kubernetes to determine which APIs to use when provisioning storage, you must
create Kubernetes StorageClass objects specific to each storage backend.
Note
The Azure Kubernetes plugin only supports using the Standard StorageClass.
File shares that use the Premium StorageClass will fail to mount.
To create an Azure Files share using a PersistentVolumeClaim:
After you create a storage class, you can use Kubernetes Objects to
dynamically provision Azure Files shares. This is done using Kubernetes
PersistentVolumesClaims.
Kubernetes uses an existing Azure Storage account, if one exists inside
of the Azure Resource Group. If an Azure Storage account does not exist,
Kubernetes creates one.
The following example uses the standard storage class and creates a 5 Gi
Azure File share. Alter these values to fit your use case.
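A sketch of the storage class and claim; the class name is illustrative, and
the claim requests the 5 Gi share described above:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
parameters:
  skuName: Standard_LRS
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-file-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile
  resources:
    requests:
      storage: 5Gi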
To attach the new Azure Files share to a Kubernetes Pod:
You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The
file share can be consumed by any Kubernetes object type, including a
Deployment, DaemonSet, or StatefulSet. However, the following example simply
mounts the PersistentVolume into a standalone Pod.
Attach the new Azure Files share to a Kubernetes Pod:
When creating a PersistentVolumeClaim, the volume can get stuck in a
Pending state if the persistent-volume-binder service account does not
have the relevant Kubernetes RBAC permissions.
The storage account creates a Kubernetes Secret to store the Azure Files
storage account key. If the persistent-volume-binder service account
does not have the correct permissions, a warning such as the following will
display:
Warning ProvisioningFailed 7s (x3 over 37s) persistentvolume-controller
Failed to provision volume with StorageClass "standard": Couldn't create secret
secrets is forbidden: User "system:serviceaccount:kube-system:persistent-volume-binder"
cannot create resource "secrets" in API group "" in the namespace "default": access denied
Grant the persistent-volume-binder service account the relevant
RBAC permissions by creating the following RBAC ClusterRole:
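A sketch of such a grant; the ClusterRole name is illustrative, and the rule
is scoped to the secret operations referenced in the warning above:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: azure-secret-creator
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "create"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: azure-secret-creator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: azure-secret-creator
subjects:
  - kind: ServiceAccount
    name: persistent-volume-binder
    namespace: kube-system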
Internet Small Computer System Interface (iSCSI) is an IP-based standard that
provides block-level access to storage devices. iSCSI receives requests from
clients and fulfills them on remote SCSI devices. iSCSI support in
MKE enables Kubernetes workloads to consume persistent storage from iSCSI
targets.
Note
MKE does not support using iSCSI with Windows clusters.
Note
Challenge-Handshake Authentication Protocol (CHAP) secrets are supported for
both iSCSI discovery and session management.
The iSCSI initiator is any client that consumes storage and sends iSCSI
commands. In an MKE cluster, the iSCSI initiator must be installed and
running on any node where Pods can be scheduled. Configuration, target
discovery, logging in, and logging out of a target are performed primarily by
two software components: iscsid (service) and iscsiadm (CLI tool).
These two components are typically packaged as part of open-iscsi on Debian
systems and iscsi-initiator-utils on RHEL, CentOS, and Fedora systems.
iscsid is the iSCSI initiator daemon and implements the control path
of the iSCSI protocol. It communicates with iscsiadm and kernel
modules.
iscsiadm is a CLI tool that allows discovery, login to iSCSI targets,
session management, and access and management of the open-iscsi
database.
The iSCSI target is any server that shares storage and receives iSCSI
commands from an initiator.
Note
iSCSI kernel modules implement the data path. The most common modules used
across Linux distributions are scsi_transport_iscsi.ko, libiscsi.ko,
and iscsi_tcp.ko. These modules need to be loaded on the host for
proper functioning of the iSCSI initiator.
Complete hardware and software configuration of the iSCSI storage provider.
There is no significant demand for RAM and disk when running external
provisioners in MKE clusters. For setup information specific to a storage
vendor, refer to the vendor documentation.
Configure kubectl on your clients.
Make sure that the iSCSI server is accessible to MKE worker nodes.
An iSCSI target can run on dedicated, stand-alone hardware, or can be
configured in a hyper-converged manner to run alongside container
workloads on MKE nodes. To provide access to the storage device, configure each
target with one or more logical unit numbers (LUNs).
iSCSI targets are specific to the storage vendor. Refer to the vendor
documentation for setup instructions, including applicable RAM and disk space
requirements, and expose them to the MKE cluster.
To expose iSCSI targets to the MKE cluster:
If necessary for access control, configure the target with client iSCSI
qualified names (IQNs) and CHAP secrets for authentication.
Make sure that each iSCSI LUN is accessible by all nodes in the cluster.
Configure the iSCSI service to expose storage as an iSCSI LUN to all nodes
in the cluster. You can do this by adding the IQNs of all MKE nodes to the
target ACL (access control list).
Every Linux distribution packages the iSCSI initiator software in a
particular way. Follow the instructions specific to the storage
provider, using the following steps as a guideline.
Prepare all MKE nodes by installing OS-specific iSCSI packages and
loading the necessary iSCSI kernel modules. In the following example,
scsi_transport_iscsi.ko and libiscsi.ko are pre-loaded by the
Linux distribution. The iscsi_tcp kernel module must be loaded with a
separate command.
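The example commands are not reproduced here. A minimal sketch of the node preparation, assuming the package names mentioned above:

# Debian and Ubuntu
sudo apt-get install -y open-iscsi
# RHEL, CentOS, and Fedora
sudo yum install -y iscsi-initiator-utils
# Load the iscsi_tcp kernel module and start the initiator daemon
sudo modprobe iscsi_tcp
sudo systemctl enable --now iscsid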
Set up MKE nodes as iSCSI initiators. Configure initiator names
for each node, using the format
InitiatorName=iqn.<YYYY-MM.reverse.domain.name:OptionalIdentifier>:
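For example, a hedged sketch that writes a hypothetical IQN on one worker node; adjust the reverse domain name and identifier for your environment:

echo "InitiatorName=iqn.2019-07.com.example.mke:worker-node1" | sudo tee /etc/iscsi/initiatorname.iscsi
sudo systemctl restart iscsid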
The Kubernetes in-tree iSCSI plugin only supports static provisioning, for
which you must:
Verify that the desired iSCSI LUNs are pre-provisioned in the iSCSI targets.
Create iSCSI PV objects, which correspond to the
pre-provisioned LUNs with the appropriate iSCSI configuration. As
PersistentVolumeClaims (PVCs) are created to consume storage, the iSCSI
PVs bind to the PVCs and satisfy the request for persistent storage.
To configure in-tree iSCSI volumes:
Create a YAML file for the PersistentVolume object based on the
following example:
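The example manifest is not reproduced here. The following is a minimal sketch that uses the placeholder values referenced in the replacement notes below; the object name, lun, and fsType are assumptions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi-pv
spec:
  capacity:
    storage: 12Gi
  accessModes:
  - ReadWriteOnce
  iscsi:
    targetPortal: 192.0.2.100:3260
    iqn: iqn.2017-10.local.example.server:disk1
    lun: 0
    fsType: ext4
    readOnly: false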
Make the following changes using information appropriate for your
environment:
Replace 12Gi with the size of the storage available.
Replace 192.0.2.100:3260 with the IP address and port number of the
iSCSI target in your environment. Refer to the storage provider
documentation for port information.
Replace iqn.2017-10.local.example.server:disk1 with a unique name for
the identifier. More than one iqn can be specified, but it must use
the format iqn.YYYY-MM.reverse.domain.name:OptionalIdentifier.
iqn.2017-10.local.example.server:disk1 is the IQN of the iSCSI
initiator, which in this case is the MKE worker node. Each MKE worker
must have a unique IQN.
An external provisioner is a piece of software running out of process
from Kubernetes that is responsible for creating and deleting PVs. External
provisioners monitor the Kubernetes API server for PV claims and create PVs
accordingly.
When using an external provisioner, you must perform the following
additional steps:
Configure external provisioning based on your storage provider. Refer
to your storage provider documentation for deployment information.
Define storage classes. Refer to your storage provider dynamic
provisioning documentation for configuration information.
Define a PVC and a Pod. When you define a PVC to use the storage class, a PV
is created and bound.
Start a Pod using the PVC that you defined.
Note
In some cases, on-premises storage providers use external provisioners to
connect PV provisioning to the backend storage.
The following issues occur frequently in iSCSI integrations:
The host might not have iSCSI kernel modules loaded. To avoid this,
always prepare your MKE worker nodes by installing the iSCSI packages
and the iSCSI kernel modules prior to installing MKE. If worker
nodes are not prepared correctly prior to an MKE installation:
Prepare the nodes.
Restart the ucp-kubelet container for changes to take effect.
Some hosts have issues with depmod. On some Linux distributions, the
kernel modules cannot be loaded until the kernel sources are
installed and depmod is run. If you experience problems with
loading kernel modules, verify that you are running depmod after
performing the kernel module installation.
Configure Pods to use the PersistentVolumeClaim when binding to
the PersistentVolume.
Create a YAML file with the following ReplicationController object. The
ReplicationController is used to set up two replica Pods running web
servers that use the PersistentVolumeClaim to mount the
PersistentVolume onto a mountpath containing shared resources.
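The ReplicationController manifest is not reproduced here. The following is a minimal sketch of two NGINX replicas that mount a PersistentVolumeClaim; the object names, image, claim name, and mount path are placeholders:

apiVersion: v1
kind: ReplicationController
metadata:
  name: web
spec:
  replicas: 2
  selector:
    app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:alpine
        ports:
        - containerPort: 80
        volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
      volumes:
      - name: shared-data
        persistentVolumeClaim:
          claimName: iscsi-pvc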
The Container Storage Interface (CSI) is a specification for container
orchestrators to manage block- and file-based volumes for storing data. Storage
vendors can each create a single CSI driver that works with multiple container
orchestrators. The Kubernetes community maintains sidecar containers that a
containerized CSI driver can use to interface with Kubernetes controllers in
charge of the following:
Managing persistent volumes
Attaching volumes to nodes, if applicable
Mounting volumes to Pods
Taking snapshots
These sidecar containers include a driver registrar, external attacher,
external provisioner, and external snapshotter.
Mirantis supports version 1.0 and later of the CSI specification, and thus MKE
can manage storage backends that ship with an associated CSI driver.
Note
Enterprise storage vendors provide CSI drivers, whereas Mirantis does not.
Kubernetes does not enforce a specific procedure for how storage providers
(SPs) should bundle and distribute CSI drivers.
The simplest way to deploy CSI drivers is for storage vendors to package them
in containers. In the context of Kubernetes clusters, containerized CSI drivers
typically deploy as StatefulSets for managing the cluster-wide logic and
DaemonSets for managing node-specific logic.
Note the following considerations:
You can deploy multiple CSI drivers for different storage backends in
the same cluster.
To avoid leaking credentials to user processes, Kubernetes recommends running
CSI Controllers on master nodes and the CSI node plugin on worker nodes.
MKE allows running privileged Pods, which is required to run CSI drivers.
The Docker daemon on the hosts must be configured with shared mount
propagation for CSI. This allows the sharing of volumes mounted by
one container into other containers in the same Pod or to other Pods
on the same node. By default, MKE enables bidirectional mount propagation in
the Docker daemon.
Pods that contain CSI plugins must have the appropriate permissions to
access and manipulate Kubernetes objects.
Using YAML files that the storage vendor provides, you can configure the
cluster roles and bindings for service accounts associated with CSI driver
Pods. MKE administrators must apply those YAML files to properly configure RBAC
for the service accounts associated with CSI Pods.
The dynamic provisioning of persistent storage depends on the capabilities of
the CSI driver and of the underlying storage backend. Review the CSI driver
provider documentation for the available parameters. Refer to
CSI HostPath Driver
for a generic CSI plugin example.
You can access the following CSI deployment information in the MKE web UI:
Persistent storage objects
In the MKE web UI left-side navigation panel, navigate to
Kubernetes > Storage for information on persistent storage objects
such as StorageClass, PersistentVolumeClaim, and PersistentVolume.
Volumes
In the MKE web UI left-side navigation panel, navigate to
Kubernetes > Pods, select a Pod, and scroll to Volumes
to view the Pod volume information.
MKE provides graphics processing unit (GPU) support for Kubernetes workloads
that run on Linux worker nodes. This topic describes how to configure your
system to use and deploy NVIDIA GPUs.
GPU support requires that you install GPU drivers, which you can do either
prior to or after installing MKE. The GPU driver installation uses an
NVIDIA runfile on your Linux host.
Note
This procedure describes how to manually install the GPU drivers. However,
Mirantis recommends that you use a pre-existing automation system to
automate the installation and patching of the drivers, along with the kernel
and other host software.
Enable the NVIDIA GPU device plugin by setting nvidia_device_plugin to
true in the MKE configuration file.
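For example, the relevant fragment of the MKE configuration file might look like the following; the placement under [cluster_config] is an assumption to verify against the MKE configuration file reference:

[cluster_config]
  nvidia_device_plugin = true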
If you attempt to add an additional replica to the previous example Deployment,
it will result in a FailedScheduling error with the
Insufficient nvidia.com/gpu message.
NGINX Ingress Controller for Kubernetes manages traffic that originates outside
your cluster (ingress traffic) using the Kubernetes Ingress rules.
You can use either the host name, path, or both the host name and path to route
incoming requests to the appropriate service.
Only administrators can enable and disable NGINX Ingress Controller. Both
administrators and regular users with the appropriate roles and permissions can
create Ingress resources.
Use the MKE web UI to enable and configure the NGINX Ingress Controller.
Log in to the MKE web UI as an administrator.
Using the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
In the Kubernetes tab, toggle the
HTTP Ingress Controller for Kubernetes slider to the right.
Under Configure proxy, specify the NGINX Ingress Controller
service node ports through which external traffic can enter the cluster.
Verify that the specified node ports are open.
Note
On production applications, it is typical to expose services using the
load balancer that your cloud provider offers.
Optional. Create a layer 7 load balancer in front of multiple nodes by
toggling the External IP slider to the right and adding a list
of external IP addresses to the NGINX Ingress Controller service.
Specify how to scale load balancing by setting the number of replicas.
Specify placement rules and load balancer configurations.
Specify any additional NGINX configuration options you require. Refer to the
NGINX documentation for the complete list of configuration options.
Click Save.
Note
The NGINX Ingress Controller implements all Kubernetes Ingress resources
with the IngressClassName of nginx-default, regardless of which
namespace they are created in.
Note
The Ingress Controller implements any new Kubernetes Ingress resource
that is created without IngressClassName.
A Kubernetes Ingress specifies a set of rules that route requests that match a
particular <domain>/{path} to a given application. Ingresses are scoped to
a single namespace and thus can route requests only to the applications inside
that namespace.
Log in to the MKE web UI.
Navigate to Kubernetes > Ingresses and click Create.
In the Create Ingress Object page, enter an ingress name and the
following rule details:
Host (optional)
Path
Path type
Service name
Port number
Port name
Generate the configuration file by clicking Generate YAML.
Canary deployments release applications incrementally to a subset of users,
which allows for the gradual deployment of new application versions without any
downtime.
NGINX Ingress Controller supports traffic-splitting policies based on header,
cookie, and weight. Whereas header- and cookie-based policies serve to provide
a new service version to a subset of users, weight-based policies serve to
divert a percentage of traffic to a new service version.
NGINX Ingress Controller uses the following annotations to enable canary
deployments:
Canary rules are evaluated in the following order:
canary-by-header
canary-by-cookie
canary-weight
Canary deployments require that you create two ingresses: one for regular
traffic and one for alternative traffic. Be aware that you can apply only
one canary ingress.
You enable a particular traffic-splitting policy by setting the associated
canary annotation to true in the Kubernetes Ingress resource, as in the
following example:
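The example manifest is not reproduced here. The following hedged sketch enables a weight-based canary policy; the Ingress name, host, service name, and weight value are placeholders:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webapp-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "20"
spec:
  ingressClassName: nginx-default
  rules:
  - host: webapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: webapp-canary
            port:
              number: 80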
Sticky sessions enable users who participate in split testing to consistently
see a particular feature. Adding sticky sessions to the initial request forces
NGINX Ingress Controller to route follow-up requests to the same Pod.
Enable the sticky session in the Kubernetes Ingress resource:
nginx.ingress.kubernetes.io/affinity: "cookie"
Specify the name of the required cookie (default: INGRESSCOOKIE).
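For example, the annotations section of the Ingress might look like the following sketch; the session-cookie-name annotation is the upstream NGINX Ingress Controller annotation for naming the cookie:

metadata:
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "INGRESSCOOKIE"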
By default, NGINX Ingress Controller generates default TLS certificates for TLS
termination. You can, though, generate and configure your own TLS
certificates for TLS termination purposes.
Create an Ingress for the sample application, inserting the Kubernetes
Secret you created into the tls section along with the host for which the
TLS connection will terminate:
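The example manifest is not reproduced here. The following hedged sketch terminates TLS for the host nginx.example.com; the Ingress name, Secret name, and backend Service are placeholders:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-tls
spec:
  ingressClassName: nginx-default
  tls:
  - hosts:
    - nginx.example.com
    secretName: nginx-tls-secret
  rules:
  - host: nginx.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: sample-service
            port:
              number: 80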
TLS passthrough is the action of passing data through a load balancer to a
server without decrypting it. Usually, the decryption or TLS termination
happens at the load balancer and data is passed along to a web server as plain
HTTP. TLS passthrough, however, keeps the data encrypted as it travels through
the load balancer, with the web server performing the decryption upon receipt.
With TLS passthrough enabled in NGINX Ingress Controller, the request will be
forwarded to the backend service without being decrypted.
Enable TLS passthrough using either the MKE web UI or the MKE configuration
file.
Note
You must have MKE admin access to enable TLS passthrough.
To enable TLS passthrough with the MKE web UI, navigate to
<username> > Admin Settings > Ingress, scroll down to
Advanced Settings and toggle the
Enable TLS-Passthrough control on.
To enable TLS passthrough using the MKE configuration file, set the
ingress_extra_args.enable_ssl_passthrough file parameter under the
cluster_config.ingress_controller option to true.
Test the TLS passthrough by connecting to the application using HTTPS.
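The test command is not reproduced here. The following hedged sketch inspects the certificate presented for nginx.example.com; <ingress-address> and <port> are placeholders for the NGINX Ingress Controller entry point you configured:

openssl s_client -connect <ingress-address>:<port> -servername nginx.example.com </dev/null 2>/dev/null | openssl x509 -noout -subject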
If the output shows that the TLS connection is negotiated with the
certificate provided for the host nginx.example.com, the TLS connection has
been passed through to the deployed NGINX server.
Clean up Kubernetes resources that are no longer needed:
Kubernetes Ingress only supports services over HTTP and HTTPS. Using NGINX
Ingress Controller, though, you can circumvent this limitation to enable
situations in which it may be necessary to expose TCP and UDP services.
The following example procedure exposes a TCP service on port 9000 and a
UDP service on port 5005:
Deploy a sample TCP service listening on port 9000, to echo back any
text it receives with the prefix hello.
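The sample manifests are not reproduced here. The following hedged sketch deploys a simple TCP echo server on port 9000 together with a Service; the alpine/socat image is an assumption, and this sketch echoes input back without the hello prefix mentioned above:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tcp-echo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tcp-echo
  template:
    metadata:
      labels:
        app: tcp-echo
    spec:
      containers:
      - name: tcp-echo
        image: alpine/socat
        args: ["TCP-LISTEN:9000,fork,reuseaddr", "EXEC:/bin/cat"]
        ports:
        - containerPort: 9000
---
apiVersion: v1
kind: Service
metadata:
  name: tcp-echo
spec:
  selector:
    app: tcp-echo
  ports:
  - name: tcp
    port: 9000
    targetPort: 9000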
MetalLB is a load balancer implementation for bare metal Kubernetes clusters.
It monitors for the services with the type LoadBalancer and assigns them an
IP address from IP address pools that are configured in the MKE system.
When a service of type LoadBalancer is created in a bare-metal cluster, the
external IP of the service remains in <pending> state. This is because the
Kubernetes implementation for network load balancers is only supported on known
IaaS platforms. To fill this gap you can use MetalLB, which assigns the
service an IP address from a custom address pool resource consisting of a
list of IP addresses. Administrators can add multiple pools to the cluster,
thus controlling which IP addresses MetalLB can assign to load-balancer
services.
After assigning an IP address to a service, MetalLB must make the network aware
so that the external entities of the cluster can learn how to reach the IP. To
achieve this, MetalLB uses standard protocols, depending on whether ARP or
BGP mode is in use.
Note
When you install and configure MetalLB in MKE, support is restricted to
Layer 2 (ARP) mode.
Once MetalLB is installed you can create services of type LoadBalancer,
with MetalLB able to assign them an IP address from the IP address pools
configured in the system.
In the example that follows, we create an NGINX Deployment and a
LoadBalancer service:
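The manifests are not reproduced here. The following is a minimal sketch; the optional metallb.universe.tf/address-pool annotation (described later in this section) and the pool name default-pool are assumptions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  annotations:
    metallb.universe.tf/address-pool: default-pool
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80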
Make sure to provide correct IP addresses in CIDR format.
MetalLB pool name values must adhere to the RFC 1123 subdomain format: a
lowercase RFC 1123 subdomain must consist of lowercase alphanumeric
characters, - or ., and must start and end with an alphanumeric character
(for example, example.com). The regex used for validation is
'[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'.
When multiple address pools are configured, MKE advertises all of the pools
by default. To request assignment from a specific pool, users can add
metallb.universe.tf/address-pool annotation to the service, with the
name of the address pool as the annotation value. In the event that no such
annotation is added, MetalLB will assign an IP from one of the configured
pools.
You can configure both public and private IPs, based on your environment.
MKE allows you to define unlimited address pools and is type-agnostic.
Upload the modified MKE configuration file and allow at least 5 minutes for
MKE to propagate the configuration changes throughout the cluster.
In Kubernetes, by default, a Pod is only connected to a single network
interface, which is the default network. Using Multus CNI, however, you can
create a multi-home Pod that has multiple network interfaces.
The following example procedure attaches two network interfaces to a Pod,
net1 and net2.
Determine the primary network interface for the node. You will need this
information to create the NetworkAttachmentDefinitions file.
Note
The name of the primary interface can vary with the
underlying network adapter.
Run the following command on the nodes to locate the default network
interface. The Iface column in the line with destination
default indicates which interface to use.
route
Note
eth0 is the primary network interface in most Linux distributions.
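The NetworkAttachmentDefinition example is not reproduced here. The following hedged sketch defines one macvlan attachment named net1 on the eth0 primary interface; the subnet and IP range are placeholders, and a second definition (net2) follows the same pattern:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: net1
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth0",
      "mode": "bridge",
      "ipam": {
        "type": "host-local",
        "subnet": "10.10.0.0/16",
        "rangeStart": "10.10.1.20",
        "rangeEnd": "10.10.1.50"
      }
    }

A Pod then requests the attachments by adding the
k8s.v1.cni.cncf.io/networks: net1,net2 annotation to its metadata.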
You can monitor the health of your MKE cluster using the MKE web UI, the CLI,
and the _ping endpoint. This topic describes how to monitor your cluster
health, vulnerability counts, and disk usage.
For those running MSR in addition to MKE, MKE displays image vulnerability
scanning count data obtained from MSR for containers, Swarm services, Pods, and
images. This feature requires that you run MSR 2.6.x or later and enable MKE
single sign-on.
The MKE web UI only displays the disk usage metrics, including space
availability, for the /var/lib/docker part of the filesystem. Monitoring
the total space available on each filesystem of an MKE worker or manager node
requires that you deploy a third-party operating system-monitoring solution.
From the left-side navigation panel, navigate to the Dashboard
page.
Cluster health-related warnings that require your immediate attention
display on the cluster dashboard. A greater number of such warnings are
likely to present for MKE administrators than for regular users.
Navigate to Shared Resources > Nodes to inspect the health of
the nodes that MKE manages. To read the node health status, hover over the
colored indicator.
Click a particular node to learn more about its health.
Click on the vertical ellipsis in the top right corner and select
Tasks.
From the left-side navigation panel, click Agent Logs to examine
log entries.
Automate the MKE cluster monitoring process by using the
https://<mke-manager-url>/_ping endpoint to evaluate the health of a single
manager node. The MKE manager evaluates whether its internal components are
functioning properly, and returns one of the following HTTP codes:
200 - all components are healthy
500 - one or more components are not healthy
Using an administrator client certificate as a TLS client certificate for the
_ping endpoint returns a detailed error message if any component is
unhealthy.
Do not access the _ping endpoint with a load balancer, as this method does
not allow you to determine which manager node is not healthy. Instead, connect
directly to the URL of a manager node. Use GET to ping the endpoint instead
of HEAD, as HEAD returns a 404 error code.
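For example, the following curl invocation returns the HTTP status code of a single manager node; <manager-node-address> is a placeholder:

curl --insecure --silent --output /dev/null --write-out '%{http_code}\n' https://<manager-node-address>/_ping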
Troubleshooting is a necessary part of cluster maintenance. This section
provides you with the tools you need to diagnose and resolve the problems you
are likely to encounter in the course of operating your cluster.
Nodes enter a variety of states in the course of their lifecycle, including
transitional states such as when a node joins a cluster and when a node is
promoted or demoted. MKE reports the steps of the transition process as they
occur in both the ucp-controller logs and in the MKE web UI.
To view transitional node states in the MKE web UI:
Log in to the MKE web UI.
In the left-side navigation panel, navigate to
Shared Resources > Nodes. The transitional node state displays
in the DETAILS column for each node.
Optional. Click the required node. The transitional node state displays in
the Overview tab under Cluster Message.
The following table includes all the node states as they are reported by MKE,
along with their description and expected duration:
Message
Description
Expected duration
Completing node registration
The node is undergoing the registration process and does not yet appear
in the KV node inventory. This is expected to occur when a node first
joins the MKE swarm.
5 - 30 seconds
heartbeat failure
The node has not contacted any swarm managers in the last 10 seconds.
Verify the swarm state using docker info on the node.
inactive indicates that the node has been removed from the
swarm with docker swarm leave.
pending indicates dockerd has been attempting to contact a manager
since dockerd started on the node. Confirm that the network security
policy allows TCP port 2377 from the node to the managers.
error indicates an error prevented Swarm from starting on the
node. Verify the docker daemon logs on the node.
Until resolved
Node is being reconfigured
The ucp-reconcile container is converging the current state of the
node to the desired state. Depending on which state the node is
currently in, this process can involve issuing certificates, pulling
missing images, or starting containers.
1 - 60 seconds
Reconfiguration pending
The node is expected to be a manager but the ucp-reconcile
container has not yet been started.
1 - 10 seconds
The ucp-agent task is <state>
The ucp-agent task on the node is not yet in a running state.
This message is expected when the configuration has been updated or
when a node first joins the MKE cluster. This step may take
longer than expected if the MKE images need to be pulled
from Docker Hub on the affected node.
1 - 10 seconds
Unable to determine node state
The ucp-reconcile container on the target node has just begun
running and its state is not yet evident.
1 - 10 seconds
Unhealthy MKE Controller: node is unreachable
Other manager nodes in the cluster have not received a heartbeat message
from the affected node within a predetermined timeout period. This
usually indicates that there is either a temporary or permanent
interruption in the network link to that manager node. Ensure that the
underlying networking infrastructure is operational, and contact support
if the symptom persists.
Until resolved
Unhealthy MKE Controller: unable to reach controller
The controller that the node is currently communicating with is not
reachable within a predetermined timeout. Refresh the node listing to
determine whether the symptom persists. The symptom appearing
intermittently can indicate latency spikes between manager nodes, which
can lead to temporary loss in the availability of MKE. Ensure the
underlying networking infrastructure is operational and contact support
if the symptom persists.
Until resolved
Unhealthy MKE Controller: Docker Swarm Cluster: Local node <ip> has
status Pending
The MCR Engine ID is not unique in the swarm. When a node first
joins the cluster, it is added to the node inventory and discovered as
Pending by Swarm. MCR is considered validated if a
ucp-swarm-manager container can connect to MCR through TLS and its
Engine ID is unique in the swarm. If you see this issue repeatedly, make
sure that MCR does not have duplicate IDs. Use
docker info to view the Engine ID. To refresh the ID, remove
the /etc/docker/key.json file and restart the daemon.
You can troubleshoot your MKE cluster by using the MKE web UI, the CLI, and the
support bundle to review the logs of the individual MKE components. You must
have administrator privileges to view information about MKE system containers.
Using the Docker CLI requires that you authenticate using client
certificates. Client certificate bundles generated for users without
administrator privileges do not permit viewing MKE system container logs.
Review the logs of MKE system containers. Use the -a flag to display
system containers, as they are not displayed by default.
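For example, one way to list the MKE system containers, assuming you are connected with an administrator client bundle:

docker ps -a | grep ucp-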
Optional. Review the log of a particular MKE container by using the
docker logs <mke container ID> command. For example, the
following command produces the log for the ucp-controller container
listed in the previous step:
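For example, assuming the container is named ucp-controller (you can also pass the container ID reported in the previous step):

docker logs ucp-controller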
With the logs contained in a support bundle you can troubleshoot problems that
existed before you changed your MKE configuration. Do not alter your MKE
configuration until after you have performed the following steps.
Log in to the MKE web UI.
In the left-side navigation panel, navigate to
<username> > Admin Settings > Log & Audit Logs
Select DEBUG and click Save.
Increasing the MKE log level to DEBUG produces more descriptive
logs, making it easier to understand the status of the MKE cluster.
Note
Changing the MKE log level restarts all MKE system components and
introduces a small amount of downtime to MKE. Your applications will not
be affected by this downtime.
Generate a support bundle (support dump).
Each of the following container types reports a different variety of problems
in its logs:
Review the ucp-reconcile container logs for problems that occur after a
node was added or removed.
Note
It is normal for the ucp-reconcile container to be stopped. This
container starts only when the ucp-agent detects that a node needs to
transition to a different state. The ucp-reconcile container is
responsible for creating and removing containers, issuing certificates,
and pulling missing images.
Review the ucp-controller container logs for problems that occur in the
normal state of the system.
Review the ucp-auth-api and ucp-auth-store container logs for
problems that occur when you are able to visit the MKE web UI but unable to
log in.
MKE regularly monitors its internal components, attempting to resolve issues as
it discovers them.
In most cases where a single MKE component remains in a persistently failed
state, removing and rejoining the unhealthy node restores the cluster to
a healthy state.
MKE persists configuration data on an etcd key-value store and RethinkDB
database that are replicated on all MKE manager nodes. These data stores are
for internal use only and should not be used by other applications.
Troubleshoot the etcd key-value store with the HTTP API
This example uses curl to make requests to the key-value store REST API
and jq to process the responses.
Use the REST API to access the cluster configurations. The $DOCKER_HOST
and $DOCKER_CERT_PATH environment variables are set when using the
client bundle.
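The example request is not reproduced here. The following is a hedged sketch adapted from the upstream UCP documentation; the etcd client port (12379) and the /v2/keys path may differ in your MKE version:

# Derive the manager address from DOCKER_HOST and point at the etcd client port
export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379"

curl -s \
  --cert ${DOCKER_CERT_PATH}/cert.pem \
  --key ${DOCKER_CERT_PATH}/key.pem \
  --cacert ${DOCKER_CERT_PATH}/ca.pem \
  ${KV_URL}/v2/keys | jq "."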
Troubleshoot the etcd key-value store with the CLI
The MKE etcd key-value store runs in containers named ucp-kv. To check the
health of etcd clusters, execute commands inside these containers using
docker exec with etcdctl.
Log in to a manager node using SSH.
Troubleshoot an etcd key-value store:
docker exec -it ucp-kv sh -c 'etcdctl --cluster=true endpoint health -w table 2>/dev/null'
If the command fails, an error code is the only output that displays.
Troubleshoot your cluster configuration using the RethinkDB database
User and organization data for MKE is stored in a RethinkDB database, which is
replicated across all manager nodes in the MKE cluster.
The database replication and failover is typically handled automatically by the
MKE configuration management processes. However, you can use the CLI to review
the status of the database and manually reconfigure database replication.
Log in to a manager node using SSH.
Produce a detailed status of all servers and database tables in the
RethinkDB cluster:
NODE_ADDRESS is the IP address of this Docker Swarm manager node.
NUM_MANAGERS is the current number of manager nodes in the cluster.
VERSION is the most recent version of the mirantis/ucp-auth image.
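The command itself is not reproduced here. The following hedged sketch shows the kind of invocation the variables above describe; the db-status and reconfigure-db subcommands and the --db-addr flag follow the upstream UCP documentation and should be verified against your MKE version:

# NODE_ADDRESS, NUM_MANAGERS, and VERSION as described above
NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
NUM_MANAGERS=$(docker node ls --filter role=manager -q | wc -l)
VERSION=$(docker image ls --format '{{.Tag}}' mirantis/ucp-auth | head -n 1)

# Detailed status of all servers and database tables in the RethinkDB cluster
docker run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} \
  --db-addr=${NODE_ADDRESS}:12383 db-status

# Reconfigure replication across the current set of managers
# (matches the example output that follows)
docker run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} \
  --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db --num-replicas ${NUM_MANAGERS}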
Example output:
time="2017-07-14T20:46:09Z"level=debugmsg="Connecting to db ..."time="2017-07-14T20:46:09Z"level=debugmsg="connecting to DB Addrs: [192.168.1.25:12383]"time="2017-07-14T20:46:09Z"level=debugmsg="Reconfiguring number of replicas to 1"time="2017-07-14T20:46:09Z"level=debugmsg="(00/16) Reconfiguring Table Replication..."time="2017-07-14T20:46:09Z"level=debugmsg="(01/16) Reconfigured Replication of Table \"grant_objects\""
...
Note
If the quorum in any of the RethinkDB tables is lost, run the
reconfigure-db command with the --emergency-repair flag.
If one of the nodes goes offline during MKE cluster CA rotation, it can prevent
other nodes from finishing the rotation. In this event, to unblock other nodes,
remove the offline node from the cluster by running the docker node
rm --force <node_id> command from any manager node. Thereafter, once the
rotation is done, the node can rejoin the cluster.
If the CA rotation was only partially successful, having left some nodes in an
unhealthy state, you can attempt to remove and rejoin the problematic nodes.
This happens because the NodeLocalDNS DaemonSet creates a dummy interface
during network setup, and the dummy kernel module is not loaded in RHEL or
CentOS by default. To fix the issue, load the dummy kernel module by
running the following command on every node in the cluster:
sudo modprobe dummy
NodeLocalDNS containers are unable to add iptables rules
Although the NodeLocalDNS Pods switch to running state after the dummy
kernel module is loaded, the Pods still fail to add iptables rules.
kubectl logs -f -n kube-system -l k8s-app=node-local-dns
Notice: The NOTRACK target is converted into CT target in rule listing and saving.
Fatal: can't open lock file /run/xtables.lock: Permission denied
[ERROR] Error checking/adding iptables rule {raw OUTPUT [-p tcp -d 10.96.0.10 --dport 8080 -j NOTRACK -m comment --comment NodeLocal DNS Cache: skip conntrack]}, error - error checking rule: exit status 4: Ignoring deprecated --wait-interval option.
Warning: Extension CT revision 0 not supported, missing kernel module?
Notice: The NOTRACK target is converted into CT target in rule listing and saving.
Fatal: can't open lock file /run/xtables.lock: Permission denied
You can fix this problem in two different ways:
Use audit2allow to generate SELinux policy rules for the denied
operations:
module localdnsthird 1.0;

require {
    type kernel_t;
    type spc_t;
    type rpm_script_t;
    type firewalld_t;
    type container_t;
    type iptables_var_run_t;
    class process transition;
    class capability { sys_admin sys_resource };
    class system module_request;
    class file { lock open read };
}

#============= container_t ==============
allow container_t iptables_var_run_t:file lock;

#!!!! This avc is allowed in the current policy
allow container_t iptables_var_run_t:file { open read };

#!!!! This avc is allowed in the current policy
allow container_t kernel_t:system module_request;

#============= firewalld_t ==============

#!!!! This avc is allowed in the current policy
allow firewalld_t self:capability { sys_admin sys_resource };

#============= spc_t ==============

#!!!! This avc is allowed in the current policy
allow spc_t rpm_script_t:process transition;
Virtualization functionality is available for MKE through KubeVirt, a
Kubernetes extension with which you can natively run Virtual Machine (VM)
workloads alongside container workloads in Kubernetes clusters.
To deploy KubeVirt, the KVM kernel module must be present on all Kubernetes
nodes on which virtual machines will run, and nested virtualization must be
enabled in all virtual environments.
Run the following local platform validation on your Kubernetes nodes to
determine whether they can be used for KubeVirt:
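The validation command is not reproduced here. The following checks confirm that the KVM kernel modules are loaded and that /dev/kvm is present; virt-host-validate is available only if the libvirt client tools are installed:

# Verify that the KVM kernel modules are loaded and that /dev/kvm exists
lsmod | grep -i kvm
ls -l /dev/kvm

# Optional fuller check, if the libvirt client tools are installed
virt-host-validate qemu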
Although it is not required to run KubeVirt, the virtctl CLI provides an
interface that can significantly enhance the convenience of your virtual
machine interactions.
Run the following command, inserting the correct values for your
architecture and platform. For <ARCH> the valid values are
linux or darwin, and for <PLATFORM> the
valid values are amd64 or arm64.
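The download command is not reproduced here. The following hedged sketch pulls virtctl from the KubeVirt GitHub releases; <kubevirt-version> is a placeholder, and <ARCH> and <PLATFORM> use the meanings defined in this step:

export VERSION=<kubevirt-version>
curl -L -o virtctl \
  https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-<ARCH>-<PLATFORM>
chmod +x virtctl
sudo mv virtctl /usr/local/bin/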
PATH is a system environment variable that contains a list of
directories, within each of which the system is able to search for a
binary. To reveal the list, issue the following command:
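For example:

echo $PATH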
Alternatively, you can manually create the cirros-vm.yaml file, using the
following content:
---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-cirros
  name: vm-cirros
spec:
  running: false
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-cirros
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
        resources:
          requests:
            memory: 128Mi
      terminationGracePeriodSeconds: 0
      volumes:
      - containerDisk:
          image: mirantis.azurecr.io/kubevirt/cirros-container-disk-demo:1.3.1-20240911005512
        name: containerdisk
      - cloudInitNoCloud:
          userData: |
            #!/bin/sh
            echo 'printed from cloud-init userdata'
        name: cloudinitdisk
Swarms are resilient to failures and can recover from temporary node failures,
such as machine reboots and restart crashes, and other transient errors.
However, if a swarm loses quorum, it cannot automatically recover. In such
cases, tasks on existing worker nodes continue to run, but it is not possible
to perform administrative tasks, such as scaling or updating services and
joining or removing nodes from the swarm. The best way to recover after losing
quorum is to bring the missing manager nodes back online. If that is not
possible, follow the instructions below.
In a swarm of N managers, a majority (quorum) of manager nodes must always be
available. For example, in a swarm with 5 managers, a minimum of 3 managers
must be operational and in communication with each other.
In other words, the swarm can tolerate up to (N-1)/2 permanent
failures, and beyond that, requests involving swarm management cannot be
processed. Such permanent failures include data corruption and hardware
failure.
If you lose a quorum of managers, you cannot administer the swarm. If
you have lost the quorum and you attempt to perform any management
operation on the swarm, MKE issues the following error:
If you cannot recover from losing quorum by bringing the failed nodes back
online, you must run the docker swarm init command with the
--force-new-cluster flag from a manager node. Using this flag removes all
managers except the manager from which the command was run.
Run --force-new-cluster from the manager node you want to recover:
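For example (the --advertise-addr flag may also be needed if the node has multiple addresses):

docker swarm init --force-new-cluster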
Promote nodes to become managers until you have the required number of
manager nodes.
The Mirantis Container Runtime where you run the command becomes the manager
node of a single-node swarm, which is capable of managing and running services.
The manager has all the previous information about services and tasks, worker
nodes continue to be part of the swarm, and services continue running. You need
to add or re-add manager nodes to achieve your previous task distribution and
ensure that you have enough managers to maintain high availability and prevent
losing the quorum.
You do not usually need to force your swarm to rebalance its tasks.
However, when you add a new node to a swarm or a node reconnects to the swarm
after a period of unavailability, the swarm does not automatically give
a workload to the idle node. This is a design decision; if the swarm
periodically shifts tasks to different nodes for the sake of balance,
the clients using those tasks would be disrupted. The goal is to avoid
disrupting running services for the sake of balance across the swarm.
When new tasks start, or when a node with running tasks becomes
unavailable, those tasks are given to less busy nodes.
To force the swarm to rebalance its tasks:
Use the docker service update command with the --force or -f
flag to force the service to redistribute its tasks across the available
worker nodes. This causes the service tasks to restart. Client applications
may be disrupted. If configured, your service will use a rolling update.
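For example, to force redistribution of the tasks of a single service:

docker service update --force <service-name>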
Substitute <mke-version> with the MKE version of your
backup.
Confirm that you want to uninstall MKE.
Example output:
INFO[0000] Detected UCP instance tgokpm55qcx4s2dsu1ssdga92
INFO[0000] We're about to uninstall UCP from this Swarm cluster
Do you want to proceed with the uninstall? (y/n):
Restore MKE from the existing backup as described in Restore MKE.
If the swarm exists, restore MKE on a manager node. Otherwise, restore MKE
on any node, and the swarm will be created automatically during the restore
procedure.
For Kubernetes, MKE backs up the declarative state of Kubernetes objects in
etcd.
For Swarm, it is not possible to take the state and export it to a
declarative format, as the objects that are embedded within the Swarm raft
logs are not easily transferable to other nodes or clusters.
To recreate swarm-related workloads, you must refer to the original scripts
used for deployment. Alternatively, you can recreate the workloads manually,
using the output of docker inspect commands as a reference.
MKE manager nodes store the swarm state and manager logs in the
/var/lib/docker/swarm/ directory. Swarm raft logs contain crucial
information for recreating Swarm-specific resources, including services,
secrets, configurations, and node cryptographic identity. This data includes
the keys used to encrypt the raft logs. You must have these keys to restore the
swarm.
Because logs contain node IP address information and are not transferable to
other nodes, you must perform a manual backup on each manager node. If you do
not back up the raft logs, you cannot verify workloads or Swarm resource
provisioning after restoring the cluster.
Note
You can avoid performing a Swarm backup by storing stacks, services
definitions, secrets, and networks definitions in a source code
management or config management tool.
Keys used to encrypt communication between Swarm nodes and to encrypt
and decrypt raft logs
Membership
Yes
List of the nodes in the cluster
Services
Yes
Stacks and services stored in Swarm mode
Overlay networks
Yes
Overlay networks created on the cluster
Configs
Yes
Configs created in the cluster
Secrets
Yes
Secrets saved in the cluster
Swarm unlock key
No
Secret key needed to unlock a manager after its Docker daemon restarts
To back up Swarm:
Note
All commands that follow must be prefixed with sudo or executed from
a superuser shell by first running sudo sh.
If auto-lock is enabled, retrieve your Swarm unlock key. Refer to
Rotate the unlock key
in the Docker documentation for more information.
Optional. Mirantis recommends that you run at least three manager nodes, in
order to achieve high availability, as you must stop the engine of the
manager node before performing the backup. A majority of managers must be
online for a cluster to be operational. If you have fewer than three managers,
the cluster will be unavailable during the backup.
Note
While a manager is shut down, your swarm is more likely to lose quorum if
further nodes are lost. A loss of quorum renders the swarm
unavailable until quorum is recovered. Quorum is only recovered when more
than 50% of the nodes become available. If you regularly take down
managers when performing backups, consider running a 5-manager swarm, as
this will enable you to lose an additional manager while the backup is
running, without disrupting services.
Select a manager node other than the leader to avoid a new election
inside the cluster:
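For example, you can list the manager nodes and identify the leader from the MANAGER STATUS column:

docker node ls --filter role=manager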
All manager nodes store the same data, thus it is only necessary to back up a
single one.
Backing up MKE does not require that you pause the reconciler and delete MKE
containers, nor does it affect manager node activities and user resources, such
as services, containers, and stacks.
MKE does not support using a backup that runs an earlier version of MKE to
restore a cluster that runs a later version of MKE.
MKE does not support performing two backups at the same time. If a backup
is attempted while another backup is in progress, or if two backups
are scheduled at the same time, a message will display indicating
that the second backup failed because another backup is in progress.
MKE may not be able to back up a cluster that has crashed. Mirantis
recommends that you perform regular backups to avoid encountering this
scenario.
The following backup contents are stored in a .tar file. Backups contain
MKE configuration metadata for recreating configurations such as LDAP, SAML,
and RBAC.
Data
Backed up
Description
Configurations
Yes
MKE configurations, including Mirantis Container Runtime license, Swarm,
and client CAs.
Access control
Yes
Swarm resource permissions for teams, including collections, grants,
and roles.
Certificates and keys
Yes
Certificates, public and private keys used for authentication and mutual
TLS communication.
Metrics data
Yes
Monitoring data gathered by MKE.
Organizations
Yes
Users, teams, and organizations.
Volumes
Yes
All MKE-named volumes including all MKE component certificates and data.
Overlay networks
No
Swarm mode overlay network definitions, including port information.
Configs, secrets
No
MKE configurations and secrets. Create a Swarm backup to back up these
data.
Services
No
MKE stacks and services are stored in Swarm mode or SCM/config
management.
ucp-metrics-data
No
Metrics server data.
ucp-node-certs
No
Certs used to lock down MKE system components.
Routing mesh settings
No
Interlock layer 7 ingress configuration information. A manual backup and
restore process is possible and should be performed.
Note
Because Kubernetes stores the state of resources on etcd, a backup of
etcd is sufficient for stateless backups.
MKE backups include all Kubernetes declarative objects, including secrets, and
are stored in the ucp-kv etcd database.
Note
You cannot back up Kubernetes volumes and node labels. When you restore MKE,
Kubernetes objects and containers are recreated and IP addresses are
resolved.
Store the backup locally on the node at /tmp/mybackup.tar.
To create an MKE backup:
Run the
mirantis/ucp:3.7.16 backup
command on a single MKE manager node, including the --file and
--include-logs options. This creates a .tar archive with the
contents of all volumes used by MKE and streams it to stdout.
Replace 3.7.16 with the version you are currently
running.
If you are running MKE with Security-Enhanced Linux (SELinux) enabled,
which is typical for RHEL hosts, include --security-opt label=disable in
the docker command, replacing 3.7.16 with the
version you are currently running:
To determine whether SELinux is enabled in MCR, view the
host /etc/docker/daemon.json file, and search for the string
"selinux-enabled":"true".
You can access backup progress and error reporting in the stderr streams of
the running backup container during the backup process. MKE updates progress
after each backup step, for example, after volumes are backed up. The
progress tracking is not preserved after the backup has completed.
A valid backup file contains at least 27 files, including
./ucp-controller-server-certs/key.pem. Verify that the backup is a valid
.tar file by listing its contents, as in the following example:
gpg --decrypt /tmp/mybackup.tar | tar --list
A log file is also created, in the same directory as the backup file. The
passphrase for the backup and log files is the same. Review the contents of
the log file by using the following command:
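For example, assuming the log file sits next to the backup at /tmp/mybackup.log (the exact file name may differ):

gpg --decrypt /tmp/mybackup.log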
$AUTHTOKEN is your authentication bearer token if using auth
token identification.
$UCP_HOSTNAME is your MKE hostname.
Example output:
200 OK
To list all backups using the MKE API:
You can view all existing backups with the GET:/api/ucp/backups
endpoint. This request does not expect a payload and returns a list of
backups, each as a JSON object following the schema detailed in
Backup schema.
The request returns one of the following HTTP status codes, and if
successful, a list of existing backups:
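For example, a hedged curl request against the endpoint, using the $AUTHTOKEN and $UCP_HOSTNAME variables described above:

curl --insecure --silent \
  --header "Authorization: Bearer $AUTHTOKEN" \
  https://$UCP_HOSTNAME/api/ucp/backups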
You can retrieve details for a specific backup using the
GET:/api/ucp/backup/{backup_id} endpoint, where {backup_id} is
the ID of an existing backup. This request returns the backup, if it
exists, as a JSON object following the schema detailed in Backup schema.
The request returns one of the following HTTP status codes, and if
successful, the backup for the specified ID:
To avoid directly managing backup files, you can specify a file name and
host directory on a secure and configured storage backend, such as NFS
or another networked file system. The file system location is the backup
folder on the manager node file system. This location must be writable
by the nobody user, which is specified by changing the directory
ownership to nobody. This operation requires administrator
permissions to the manager node, and must only be run once for a given
file system location.
To change the file system directory ownership to nobody:
sudo chown nobody:nogroup /path/to/folder
Caution
Specify a different name for each backup file. Otherwise, the
existing backup file with the same name is overwritten.
Also specify a location that is mounted on a fault-tolerant file system,
such as NFS, rather than the node local disk. Otherwise, it is important
to regularly move backups from the manager node local disk to ensure
adequate space for ongoing backups.
Prior to restoring Swarm, verify that you meet the following prerequisites:
The node you select for the restore must use the same IP address as the node
from which you made the backup, as the command to force the new cluster does
not reset the IP address in the swarm data.
The node you select for the restore must run the same version of Mirantis
Container Runtime (MCR) as the node from which you made the backup.
You must have access to the list of manager node IP addresses located in
state.json inside the zip file.
If auto-lock was enabled on the backed-up swarm, you must have access to
the unlock key.
To perform the Swarm restore:
Caution
You must perform the Swarm restore on only the one manager node in your
cluster and the manager node must be the same manager from which you made
the backup.
Shut down MCR on the manager node that you have selected for your
restore:
systemctl stop docker
On the new swarm, remove the contents of the /var/lib/docker/swarm
directory. Create this directory if it does not exist.
Restore the /var/lib/docker/swarm directory with the contents of
the backup:
tar -xvf <PATH_TO_TARBALL> -C /
Set <PATH_TO_TARBALL> to the location path where you saved the
tarball during backup. If you are following the procedure in
backup-swarm, the tarball will be in a /tmp/ folder with a unique
name based on the engine version and timestamp:
swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz
Note
The new node uses the same encryption key for on-disk
storage as the old one. It is not possible to change the
on-disk storage encryption keys. For a swarm that has
auto-lock enabled, the unlock key is the same as on the old
swarm and is required to restore the swarm.
Unlock the swarm, if necessary:
docker swarm unlock
Start Docker on the new node:
systemctl start docker
Verify that the state of the swarm is as expected, including
application-specific tests or checking the output of
docker service ls to verify that all expected services are
present.
If you use auto-lock, rotate the unlock key:
docker swarm unlock-key --rotate
Add the required manager and worker nodes to the new swarm.
Reinstate your previous backup process on the new swarm.
MKE supports the following three different approaches to performing a restore:
Run the restore on the machines from which the backup originated or on new
machines. You can use the same swarm from which the backup originated or a
new swarm.
Run the restore on a manager node of an existing swarm that does not have MKE
installed. In this case, the MKE restore uses the existing swarm and runs
in place of an MKE install.
Run the restore on an instance of MCR that is not included in a swarm. The
restore performs docker swarm init just as the install operation
would do. This creates a new swarm and restores MKE thereon.
Note
During the MKE restore operation, Kubernetes declarative objects and
containers are recreated and IP addresses are resolved.
Consider the following requirements prior to restoring MKE:
To restore an existing MKE installation from a backup, you must uninstall MKE
from the swarm by using the uninstall-ucp command.
Restore operations must run using the same major and minor MKE version
and mirantis/ucp image version as the backed-up cluster.
If you restore MKE using a different swarm than the one where the backed-up
MKE was deployed, MKE will use new TLS certificates. In this case, you must
download new client bundles, as the existing ones will no longer be
operational.
At the start of the restore operation, the script identifies the MKE version
defined in the backup and performs one of the following actions:
The MKE restore fails if it runs using an image that does not match
the MKE version from the backup. To override this behavior, for example in
a testing scenario, use the --force flag.
MKE provides instructions on how to run the restore process for the
MKE version in use.
Note
If SELinux is enabled, you must temporarily disable it prior to running the
restore command. You can then reenable SELinux once the command
has completed.
Volumes are placed onto the host where you run the MKE restore command.
Restore MKE from an existing backup file. The following example illustrates
how to restore MKE from an existing backup file located in
/tmp/backup.tar:
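The restore invocation is not reproduced here. The following is a hedged sketch consistent with the notes that follow; confirm the exact flags against the MKE CLI restore reference for your version:

APISERVER_LB=172.16.243.2
docker container run \
  --rm \
  --interactive \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 restore \
  --san $APISERVER_LB < /tmp/backup.tar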
Replace mirantis/ucp:3.7.16
with the MKE version in your backup file.
For the --san flag, assign the cluster API server IP address
without the port number to the APISERVER_LB variable. For example, for
https://172.16.243.2:443 use 172.16.243.2. For more information on
the --san flag, refer to MKE CLI restore options.
If the backup file is encrypted with a passphrase, include the
--passphrase flag in the restore command:
Alternatively, you can invoke the restore command in interactive mode by
mounting the backup file to the container rather than streaming it through
stdin:
Regenerate certs. The current certs volume containing cluster-specific
information, such as SANs, is invalid on new clusters with different IPs.
For volumes that are not backed up, such as ucp-node-certs, the restore
regenerates certs. For certs that are backed up,
ucp-controller-server-certs, the restore does not perform a
regeneration and you must correct those certs when the restore
completes.
Mirantis’s Launchpad CLI Tool (Launchpad) is a command-line deployment and
lifecycle-management tool that runs on virtually any Linux, Mac, or Windows
machine. It simplifies and automates MKE, MSR, and MCR installation and
deployments on public clouds, private clouds, virtualization platforms, and
bare metal.
In addition, Launchpad provides full cluster lifecycle management. Using
Launchpad, multi-manager, high availability clusters (defined as having
sufficient node capacity to move active workloads around while updating) can be
upgraded with no downtime.
Note
Launchpad is distributed as a binary executable. The main integration point
with cluster management is the launchpad apply command and the
input launchpad.yaml configuration for the cluster. As the configuration
is in YAML format, you can integrate other tooling with Launchpad.
Mirantis Launchpad is a static binary that works on the following operating
systems:
Linux (x64)
MacOS (x64)
Windows (x64)
Important
The setup must meet MKE system requirements, in addition to the requirements for running Launchpad.
The following operating systems support MKE:
MKEx (Rocky&OSTree)
CentOS 7
Oracle Linux 7
Oracle Linux 8
Oracle Linux 9
Red Hat Enterprise Linux 7
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 9
Rocky Linux 8
Rocky Linux 9
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 15
Ubuntu 18.04
Ubuntu 20.04
Ubuntu 22.04
Windows Server 2022, 2019
Be aware that Launchpad does not support all OS platform patch levels.
Refer to the Compatibility Matrix for your version of MCR for full OS
platform support information.
Launchpad remote management requires a high privilege level on your hosts,
both to prepare the system for installation and to perform the installation
itself. This level of access is necessary for package management, and also
to allow remote users to execute MCR docker commands.
Note
For security reasons, Launchpad should not be executed with root/admin user
authentication on any machine.
Launchpad connects through the use of a cryptographic network protocol (SSH on
Linux systems, SSH or WinRM on Windows systems), and as such these must be set
up on all host instances.
Note
Only SSH key-based authentication with passwordless sudo is currently
supported. On Windows, the user must have administrator privileges.
OpenSSH is the open-source version of the Secure Shell (SSH) tools used by
administrators of Linux and other non-Windows operating systems for
cross-platform management of remote systems. It is included in Windows Server
2019.
To enable SSH on Windows, you can run the following PowerShell snippets,
modified for your specific configuration, on each Windows host.
Launchpad is a command-line deployment and lifecycle-management tool that
enables users on any Linux, Mac, or Windows machine to easily install, deploy,
modify, and update MKE, MSR, and MCR.
To fully evaluate and use MKE, MSR, and MCR, Mirantis recommends installing
Launchpad on a real machine (Linux, Mac, or Windows) or a virtual machine (VM)
that is capable of running:
A graphic desktop and browser, for accessing or installing:
The MKE web UI
Lens, an open source, stand-alone GUI application from Mirantis (available
for Linux, Mac, and Windows) for multi-cluster management and operations
Metrics, observability, visualization, and other tools
kubectl (the Kubernetes command-line client)
curl, Postman and/or client libraries, for accessing the Kubernetes REST API
Docker and related tools for using the Docker Swarm CLI, and for
containerizing workloads and accessing local and remote registries.
The machine can reside in different contexts from the hosts and connect with
those hosts in several different ways, depending on the infrastructure and
services in use. It must be able to communicate with the hosts via their IP
addresses on several ports. Depending on the infrastructure and security
requirements, this can be relatively simple to achieve for evaluation clusters
(refer to Networking Considerations for
more information).
A cluster is comprised of at least one manager node and one or more worker
nodes. At the start, Mirantis recommends deploying a small evaluation cluster,
with one manager and at least one worker node. Such a setup will allow you to
become familiar with Launchpad, with the procedures for provisioning
nodes, and with the features of MKE, MSR, and MCR. In addition, if the
deployment is on a public cloud, the setup will minimize costs.
Ultimately, Launchpad can deploy manager and worker nodes in
any combination, creating many different cluster configurations, such as:
Small evaluation clusters, with one manager and one or more worker nodes.
Diverse clusters, with Linux and Windows workers.
High-availability clusters, with three or more manager nodes.
Clusters that Launchpad can auto-update, non-disruptively, with multiple
managers (allowing one-by-one update of MKE without loss of
cluster cohesion) and sufficient worker nodes of each type to allow workloads
to be drained to new homes as each node is updated.
The hosts must be able to communicate with one another (and potentially, with
users in the outside world) by way of their IP addresses, using many ports.
Depending on infrastructure and security requirements, this can be
relatively simple to achieve for evaluation clusters (refer to Networking
Considerations).
Launchpad has built-in telemetry for tracking tool use. The
telemetry data is used to improve the product and overall user experience.
No sensitive data about the clusters is included in the telemetry payload.
Rename the downloaded binary to launchpad, move it to a
directory in the PATH variable, and give it permission to run (execute
permission).
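For example, on a Linux x64 machine (the downloaded file name depends on your platform and release, so it may differ):

mv launchpad-linux-x64 launchpad
chmod +x launchpad
sudo mv launchpad /usr/local/bin/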
Tip
If macOS is in use it may be necessary to give Launchpad
permissions in the Security & Privacy section in System Preferences.
Verify the installation by checking the installed tool version with the
launchpad version command.
$ launchpad version
# console output:
version: 1.0.0
Complete the registration. Please be aware that the registration information
will be used to assign evaluation licenses and to provide help with Launchpad
use.
$ launchpad register
name: Anthony Stark
company: Stark Industries
email: astark@example.com
I agree to Mirantis Launchpad Software Evaluation License Agreement https://github.com/Mirantis/launchpad/blob/master/LICENSE [Y/n]: Yes
INFO[0022] Registration completed!
Adjust the text to meet your infrastructure requirements. The model should
work to deploy hosts on most public clouds.
If you’re deploying on VirtualBox or some other desktop virtualization
solution and are using bridged networking, it will be necessary
to make a few minor adjustments to the launchpad.yaml:
Deliberately set a --pod-cidr to ensure that pod IP addresses
don’t overlap with node IP addresses (the latter are
in the 192.168.x.x private IP network range on such a setup).
Supply appropriate labels for the target nodes’ private IP network
cards using the privateInterface parameter (this typically defaults to
enp0s3 on Ubuntu 18.04; other Linux distributions use similar
nomenclature).
In addition, it may be necessary to set the username for logging in to the
host.
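The following is a minimal launchpad.yaml sketch for a one-manager, one-worker cluster; the addresses, user name, key path, and version shown are placeholders and assumptions that you must adjust for your own infrastructure:
apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
metadata:
  name: my-mke-cluster
spec:
  mke:
    adminUsername: admin
    adminPassword: supersecret
    version: 3.7.16
  hosts:
    - role: manager
      ssh:
        address: 10.0.0.10
        user: ubuntu
        keyPath: ~/.ssh/id_rsa
    - role: worker
      ssh:
        address: 10.0.0.11
        user: ubuntu
        keyPath: ~/.ssh/id_rsa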
You can start the cluster once the cluster configuration file is fully set up.
In the same directory where you created the launchpad.yaml file, run:
$ launchpad apply
The launchpad tool uses a cryptographic network protocol (SSH on Linux
systems, SSH or WinRM on Windows systems) to connect to the infrastructure
specified in the launchpad.yaml and configures everything that is required
on the hosts. Within a few minutes, the cluster should be up and running.
By default, the administrator username is admin. If the password is not
supplied through the installFlags option in launchpad.yaml (for example,
--admin-password=supersecret), the generated admin password will display in
the install flow.
INFO[0083] 127.0.0.1: time="2020-05-26T05:25:12Z" level=info msg="Generated random admin password: wJm-TzIzQrRNx7d1fWMdcscu_1pN5Xs0"
Important
The addition or removal of nodes in subsequent Launchpad runs will fail if
the password is not provided in the launchpad.yaml file.
Users will likely install Launchpad on a laptop or a VM with the intent of
deploying MKE, MSR, or MCR onto VMs running on a public or private cloud
that supports security groups for IP access control. Such an approach makes
it fairly simple to configure networking in a way that provides adequate
security and convenient access to the cluster for evaluation and
experimentation.
The simplest way to configure the networking for a small, temporary cluster for
evaluation:
Create a new virtual subnet (or VPC and subnet) for hosts.
Create a new security group called de_hosts (or another name of your
choice) that permits inbound IPv4 traffic on all ports, either from the
security group de_hosts, or from the new virtual subnet only.
Create another new security group (for example, admit_me) that permits
inbound IPv4 traffic from your deployer machine’s public IP address only
(you can use a website such as whatismyip.com to determine your public IP).
When launching hosts, attach them to the newly-created subnet and
apply both new security groups.
(Optional) Once you know the IPv4 addresses (public, or VPN-accessible
private) of your nodes, unless you are using local DNS it makes sense to
assign names to your hosts (for example, manager, worker1,
worker2… and so on). Then, insert IP addresses and names in your
hostfile, thus letting you (and Launchpad) refer to hosts by hostname
instead of IP address.
Once the hosts are booted, SSH into them from your deployer machine
with your private key. For example:
ssh -i /my/private/keyfile username@mynode
After that, determine whether they can access the internet. One
method for doing this is by pinging a Google nameserver:
$ ping 8.8.8.8
Now, proceed with installing Launchpad and configuring an MKE,
MSR, or MCR deployment. Once completed, use your deployer machine to
access the MKE web UI, run kubectl (after authenticating
to your cluster) and other utilities (for example, Postman, curl, and so on).
A more secure way to manage networking is to connect your deployer machine to
your VPC/subnet using a VPN, and to then modify the de_hosts security group
to accept traffic on all ports from this source.
If you intend to deploy a cluster for longer-term evaluation, it makes sense to
secure it more deliberately. In this case, a certain range of ports will need
to be opened on hosts. Refer to the MKE documentation for details.
Launchpad can deploy certificate bundles obtained from a certificate provider
to authenticate your cluster. These can be used in combination with DNS to
allow you to reach your cluster securely on a fully-qualified domain name
(FQDN). Refer to the MKE documentation for details.
Launchpad allows users to upgrade their clusters with the launchpad
apply reconciliation command. The tool discovers the current state of the
cluster and its components, and upgrades what is needed.
Run launchpad apply. Launchpad will upgrade MCR on all
hosts in the following sequence:
Upgrades the container runtime on each manager node one by one; thus, if
there is more than one manager node, all other manager nodes remain
available while the first node is being updated.
Once the first manager node is updated and is running again, the second
is updated, and so on, until all of the manager nodes are running the new
version of MCR.
10% of worker nodes are updated at a time, until all of the worker nodes
are running the new version of MCR.
Upgrade MKE, MSR, and MCR (separately or collectively)
Upgrading to newer versions of MKE, MSR, and MCR is as easy as changing the
version tags in the launchpad.yaml and running the launchpad
apply command.
Note
Launchpad upgrades MKE on all nodes.
Open the launchpad.yaml file.
Update the version tags to the new version of the component(s).
Save launchpad.yaml.
Run the launchpad apply command.
Launchpad connects to the nodes to get the current version of each
component, after which it upgrades each node as described in
Upgrading Mirantis Container Runtime. This may
take several minutes.
Note
MKE and MSR upgrade paths require consecutive minor versions (for example,
to upgrade from MKE 3.1.0 to MKE 3.3.0 it is necessary to upgrade from MKE
3.1.0 to MKE 3.2.0 first, and then upgrade from MKE 3.2.0 to MKE 3.3.0).
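For example, a version bump for all three components might look like the following fragment of launchpad.yaml; the version numbers shown are taken from examples elsewhere in this document and serve only as placeholders:
spec:
  mke:
    version: 3.7.16
  msr:
    version: 2.8.5
  mcr:
    version: 20.10.0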
Swarm manager nodes use the Raft Consensus Algorithm to manage the swarm state.
As such, it is advisable to have an understanding of some general Raft concepts
in order to manage a swarm.
There is no limit on the number of manager nodes that can be deployed. The
decision on how many manager nodes to implement comes down to a trade-off
between performance and fault-tolerance. Adding manager nodes to a swarm
makes the swarm more fault-tolerant, however additional manager nodes reduce
write performance as more nodes must acknowledge proposals to update the
swarm state (which means more network round-trip traffic).
Raft requires a majority of managers, also referred to as the quorum, to
agree on proposed updates to the swarm, such as node additions or removals.
Membership operations are subject to the same constraints as state
replication.
In addition, manager nodes host the control plane etcd cluster, and thus
making changes to the cluster requires a working etcd cluster with the
majority of peers present and working.
It is highly advisable to run an odd number of peers in quorum-based systems.
MKE only works when a majority can be formed, so once more than one node has
been added it is not possible to (automatically) go back to having only one
node.
Adding manager nodes is as simple as adding them to the launchpad.yaml
file. Re-running launchpad apply will configure MKE on the new node
and also makes necessary changes in the swarm and etcd cluster.
To add worker nodes, simply include them in the launchpad.yaml
file. Re-running launchpad apply will configure everything on the
new node and join it to the cluster.
MSR nodes are identical to worker nodes in that they participate in the MKE
swarm. However, they should not be used as traditional worker nodes that run
both MSR and cluster workloads.
Note
By default, MKE will prevent scheduling of containers on MSR nodes.
MSR forms its own cluster and quorum in addition to the swarm formed by MKE.
There is no limit on the number of MSR nodes that can be configured; however,
the best practice is to limit the number to five. As with manager nodes, the
decision on how many nodes to implement should be made with an understanding of
the trade-off between performance and fault tolerance (a larger number of
nodes can incur severe performance penalties).
The quorum formed by MSR utilizes RethinkDB which, as with swarm, uses the Raft
Consensus Algorithm.
To add MSR nodes, simply include them in the launchpad.yaml file
with a host role of msr. When adding an MSR node, specify both the
adminUsername and adminPassword in the spec.mke section of
the launchpad.yaml file so that MSR knows which admin credentials to use.
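A hedged sketch of the relevant launchpad.yaml fragments follows; the address, credentials, and key path are placeholders:
spec:
  mke:
    adminUsername: admin
    adminPassword: supersecret
  hosts:
    - role: msr
      ssh:
        address: 10.0.0.12
        user: ubuntu
        keyPath: ~/.ssh/id_rsa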
Path to a cluster config file, including the filename (default:
launchpad.yaml, to read from standard input use: -).
--force
Required when running non-interactively (default: false)
exec
Execute a command or run a remote terminal on a host.
Use Launchpad to run commands or an interactive terminal on the hosts in
the configuration.
Supported options:
--config
Path to a cluster config file, including the filename (default:
launchpad.yaml, to read from standard input use: -).
--target value
Target host (example: address[:port])
--interactive
Run interactive (default: false)
--first
Use the first target found in configuration (default: false)
--role value
Use the first target that has this role in configuration
-[command]
The command to run. When blank, will run the default shell.
describe
Presents basic information that correlates to the command target.
When the launchpad describe hosts command is run, the
information delivered includes the IP address, the internal IP, the host
name, the set role, the operating system, and the MCR version of each
host. When the launchpad describe MKE or launchpad
describe MSR is run, the command returns the product version number for
the product targeted, as well as the URL of the administration user
interface.
Supported options:
--config
Path to a cluster config file, including the filename (default:
launchpad.yaml, to read from standard input use: -).
Mirantis Launchpad cluster configuration is presented in YAML format.
The default file name is launchpad.yaml, though you can use a different name
if necessary. You can edit the file using any common text editor.
In reading the configuration file, Launchpad will replace any strings that
begin with a dollar sign with values from the local host’s environment
variables. For example:
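For example, a value such as the following would be replaced with the contents of the MKE_ADMIN_PASSWORD environment variable on the machine running Launchpad (the variable name here is illustrative only):
spec:
  mke:
    adminPassword: $MKE_ADMIN_PASSWORD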
Comprehensive information follows for each of the top-level Launchpad
configuration file (launchpad.yaml) keys: apiVersion, kind, metadata,
spec, and cluster.
The latest API version is launchpad.mirantis.com/mke/v1.4, though earlier
configuration file versions are also likely to work without changes (without
any features added by more recent versions).
Private network address for the configured network interface (default:
eth0)
role
Role of the machine in the cluster. Possible values are:
manager
worker
msr
environment
Key-value pairs in YAML mapping syntax. Values are updated to host
environment (optional)
mcrConfig
Mirantis Container Runtime configuration in YAML mapping syntax, will be
converted to daemon.json (optional)
hooks
Hooks configuration for running commands before or after stages
(optional)
imageDir
Path to a directory containing .tar/.tar.gz files produced by dockersave. The images from that directory will be uploaded and
dockerload is used to load them.
sudodocker
Flag indicating whether Docker should be run with sudo.
When set to true on Linux hosts, Docker commands will be run with
sudo, and the user will not be added to the machine docker
group.
Optional. Custom upgrade flags for MKE upgrade. Obtain a list of
supported installation options for a specific MKE version by running the
installer container with docker run -t -i --rm
mirantis/ucp:3.7.16 upgrade
--help.
Optional. The initial full cluster configuration file in embedded “heredoc” syntax. Heredoc
syntax allows you to define a multiline string while maintaining the original
formatting and indentation.
cloud
Optional. Cloud provider configuration.
Note
The cloud option is valid only for MKE versions prior to 3.7.x.
provider: Provider name (currently Azure and OpenStack (MKE
3.3.3+) are supported)
configFile: Path to cloud provider configuration file on local
machine
configData: Inlined cloud provider configuration
swarmInstallFlags
Optional. Custom flags for Swarm initialization
swarmUpdateCommands
Optional. Custom commands to run after the Swarm initialization
caCertPath
certPath
keyPath
each followed by
<path to file>
or
caCertData
certData
keyData
each followed by
<PEM-encoded string>
Required components for configuring the MKE UI to use custom SSL
certificates on its Ingress. You must specify all components:
CA Certificate
SSL Certificate
Private Key
Launchpad accepts either inline PEM-encoded data or a file path,
depending on the provided argument.
Note
If MKE already uses custom certificates, Launchpad can rotate
the certificates during upgrade.
Important
Unless a password is provided, the MKE installer automatically generates an
administrator password. This password will display in clear text in the
output and persist in the logs. Subsequent runs will fail if this
automatically generated password is not configured in the launchpad.yaml
file.
Version of MSR to install or upgrade to (default: 2.8.5)
imageRepo
The image repository to use for MSR installation (default: docker.io/mirantis)
installFlags
Optional. Custom installation flags for MSR installation. Obtain a list
of supported installation options for a specific MSR version by running
the installer container with docker run -t -i --rm
mirantis/dtr:3.1.5 install --help.
Note
Launchpad inherits the MKE flags that MSR needs to perform an
installation, and to join or remove nodes. Thus, there is no need to
include the following install flags in the installFlags section of
msr:
--ucp-username (inherited from MKE’s --admin-username flag
or spec.mke.adminUsername)
--ucp-password (inherited from MKE’s --admin-password flag
or spec.mke.adminPassword)
--ucp-url (inherited from MKE’s --san flag or intelligently
selected based on other configuration variables)
upgradeFlags
Optional. Custom upgrade flags for MSR upgrade. Obtain a list of
supported installation options for a specific MSR version by running the
installer container with docker run -t -i --rm
mirantis/dtr:3.1.5 upgrade --help.
replicaIDs
Set to sequential to generate sequential replica IDs for cluster
members, e.g., 000000000001, 000000000002, etc. (default: random)
Customers take a risk in opting to use and manage their own
install scripts for MCR instead of the install script that Mirantis
hosts at get.mirantis.com. Mirantis manages this script as necessary to
support MCR installations on demand, and can change it as needed to resolve
issues and to support new features. As such, customers who opt to use their
own script will need to monitor the Mirantis script to ensure
compatibility.
Options
Description
version
Version of MCR to install or upgrade to. (default 20.10.0)
channel
Installation channel to use. One of test or prod (optional).
repoURL
Repository URL to use for MCR installation. (optional)
installURLLinux
Location from which to download the initial installer script for Linux
hosts (local paths can also be used).
(default: https://get.mirantis.com/)
installURLWindows
Location from which to download the initial installer script for Windows
hosts (local paths can be used). (default:
https://get.mirantis.com/install.ps1)
Note
In most scenarios, it is not necessary to specify repoURL and
installURLLinux/installURLWindows, which usually are only used when
installing from a non-standard location (for example, a disconnected
datacenter).
prune
Removes certain system paths that are created by MCR during
uninstallation (for example, /var/lib/docker).
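A minimal sketch of the mcr section, using the defaults described above (the values shown are illustrative only):
spec:
  mcr:
    version: "20.10.0"
    channel: prod
    installURLLinux: https://get.mirantis.com/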
MKEx is an integrated container orchestration platform that is powered by an
immutable Rocky Linux operating system, offering next-level security and
reliability.
Note
An immutable Linux operating system is designed to be unchangeable following
installation, with system files that are read-only, and limited only to
those packages that are required to run the applications. Such an OS
is more resistant to tampering and malware, and is well protected from
accidental or malicious modification. Also, as updates or changes can only
be made to an immutable OS by creating a new instance, such an OS is easier
to maintain and troubleshoot.
Mirantis, in conjunction with our partner CIQ, worked to preassemble
ostree-based Rocky Linux with Mirantis Container Runtime (MCR) and Mirantis
Kubernetes Engine (MKE), to provide users with an immutable, atomic
upgrade/rollback, versioning stack that offers a high degree of predictability
and resiliency.
rpm-ostree is a hybrid image/package system for managing and
deploying Linux-based operating systems. It combines the concepts of Git and
traditional package management to provide a version-controlled approach to
system updates and rollbacks. As with Git, rpm-ostree treats the
operating system as an immutable tree of files, which enables you to atomically
update or roll back the entire system.
rpm-ostree and Ostree system term glossary
Ostree
A Git-like content-addressed object store that manages operating system
images or deployments and provides versioning, branching, and atomic
upgrades.
rpm-ostree
The primary command-line tool used in the Ostree system. It enables system
administrators to manage deployments, perform upgrades, rollbacks, and
package installations using RPM-based packages.
Deployment
A specific versioned state of the operating system captured by Ostree.
Deployments are atomic, immutable, and can be booted into.
Atomic upgrade
The process of transitioning from one deployment to another, providing a
complete and consistent update to the system in a single transaction.
Rollback
The ability to revert to a previous deployment, restoring the system to a
known working state.
Commit
A unique identifier that represents a specific version of a deployment in
Ostree. Each commit consists of a set of objects that represent the file
system and metadata.
Repository
A collection of commits and objects that store the operating system images
or deployments. It serves as a central location for storing and
distributing the deployments.
Remote
A reference to a remote repository from which deployments can be fetched.
Remotes provide the location and access information for the repository
server.
Ref
A named reference to a specific commit in a repository. It allows for
easier access to a particular version of the deployment.
Initramfs
A small initial RAM file system that is loaded by the boot loader and used
to bootstrap the operating system during a system startup.
Overlay filesystem
A mechanism that enables changes to be made to a read-only file system by
creating a writable layer on top of it.
Atomic host
A variant of a Linux distribution that uses Ostree and
rpm-ostree for managing the operating system deployments. It
provides an immutable and transactional operating system experience.
Bootloader
Software responsible for loading the operating system during system
startup. In an OSTree-based system, the bootloader is often configured to
boot into specific deployments.
OSTree-based package manager
A package manager that interacts with the OSTree system, allowing for
the installation and management of packages within the deployments.
For example, DNF and PackageKit.
kargs
Kernel arguments passed to the Linux kernel during boot.
In rpm-ostree, kargs can be used to customize the boot process
or enable specific features.
Package layering
The ability to install RPM packages on top of an existing deployment
without modifying the base deployment. This allows for customizations and
additional software installations without affecting the base system.
Delta
A compressed binary diff between two versions of a deployment. Deltas are
used to optimize the download and storage of updates, reducing bandwidth
and storage requirements.
System upgrade
The process of updating the entire operating system to a new version,
typically achieved by transitioning to a new deployment.
Metadata
Information about a deployment or commit, such as version numbers, labels,
descriptions, or dependencies.
You can install MKEx either from a bootable ISO image or by way of Kickstart.
To install MKEx from a bootable ISO image:
Note
The bootable ISO image is quite similar to that of a Rocky Linux
installation.
Boot up the ISO. The welcome screen will display.
Select the language for the MKEx installation and click
Continue. A warning message will display, stating that MKEx
is pre-released software that is intended for development and
testing purposes only.
Click I want to proceed. The INSTALLATION SUMMARY
screen will display, offering entry points to the various aspects of the
MKEx installation.
Installation phase: Aspects
LOCALIZATION: Keyboard, Language Support, Time & Date
SYSTEM: Installation Destination, KDUMP, Network & Host Name, Security Policy
USER SETTINGS: Root Password, User Creation
Click Installation Destination to call the
INSTALLATION DESTINATION screen. Review the setup. The default
installation destination should suffice for testing purposes.
Once you are certain the setting is correct, click DONE to
return to the INSTALLATION SUMMARY screen.
Next, click User Creation to call the CREATE USER
screen.
Configure a user for your MKEx installation, making sure to tick the
Make this user administrator checkbox, and click
DONE to return to the INSTALLATION SUMMARY
screen.
Next, click Network & Host Name to call the
NETWORK & HOST NAME screen.
Set the toggle to ON to enable the network connection,
update the Host Name if necessary, and click
DONE to return to the INSTALLATION SUMMARY
screen.
Click Begin Installation. The INSTALLATION PROGRESS
screen will display.
Note
The output may differ from any previous experience you have had with
installing Rocky Linux. This is due to the use of an immutable operating
system base rather than a traditional RPM-based OS.
Once installation is complete, click Reboot System to boot the
new image. Be aware that the initial boot will require time to load the MKE
images by way of the network. Once the initial boot is complete, a login
prompt will display.
Log in to the console.
Note
SSH is disabled by default; however, you can enable it for easier access.
Verify the presence of the MKE image:
Note
Presence verification requires the use of the sudo command
due to the locking down of the /var/run/docker.sock socket. The
immutable operating system does not allow the use of the
usermod command.
Users with external network access:
sudo docker image ls
Users on an isolated system, without external network access:
ls -l /usr/share/mke
If the MKE image is present, skip ahead to the following step. If the
MKE image is not present, run the following command to load it:
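A sketch of the load command, assuming the image archive resides in /usr/share/mke; the archive file name is a placeholder:
sudo docker load -i /usr/share/mke/<mke-image-archive>.tar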
Log in to MKE, upload your license, and set up your worker nodes.
Optional. Install additional software to your MKEx operating system to
benefit from additional features. Be aware that unlike other RPM-based
systems, the immutable MKEx-based Rocky image uses rpm-ostree to manage software.
To install MKEx using Kickstart:
Obtain a copy of the ostree-repo.tar.gz file and host it on a standard HTTP
server.
Copy the following Kickstart to a file and host it on the same HTTP server.
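A minimal sketch of such a Kickstart follows; the osname, remote name, URL, and ref values are assumptions and placeholders that must match your hosted ostree-repo:
# Minimal Kickstart stanza for an rpm-ostree based installation (illustrative)
ostreesetup --osname=mkex --remote=mkex --url=http://<http-server>/ostree-repo --ref=mkex/8/x86_64/mcr/20.10-devel --nogpg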
During machine bootup, inject a cmdline parameter to
instruct Anaconda to use the hosted kickstart. This can be done by editing
the cmdline in the grub menu. When the grub screen displays, press the Tab
key and append inst.ks=<url-of-hosted-kickstart>.
The machine should boot into the Anaconda installer and automatically
install as per the Kickstart instructions.
Note
The Kickstart provided here is not complete, as it only contains what is
required for rpm-ostree. Be aware that it
may be necessary to add commands for networking, partitioning, adding
users, setting the root password, and so forth.
MKEx is an integrated stack that combines MKE container orchestration and MCR
container engines in a productized configuration, delivered on a minimal
version of RHEL-compatible, ostree-based Rocky Linux.
You can deploy MKEx configurations on either bare metal or virtual machines,
from an ISO image that is assembled and validated by Mirantis. The image is
available online, as well as in file form for air-gapped installation.
Mirantis, in conjunction with our partner CIQ, built an ostree-based Rocky
Linux operating system with Mirantis Container Runtime (MCR) and Mirantis
Kubernetes Engine (MKE), to provide users with an immutable, atomic
upgrade/rollback, versioning stack that offers a high degree of predictability
and resiliency.
The sshd service is disabled by default. System administrators can enable it
to access the nodes, though, and can disable it prior to installing the OS.
With sshd disabled, users are unable to access the nodes, and thus must use
Mirantis-provided debug Pods to troubleshoot MKE clusters.
Mirantis configures rotating logs (100M) by default in
/etc/docker/daemon.json, and system administrators can change the value as
necessary.
To keep the footprint small and secure, only the required Linux packages are
installed. System administrators can add custom packages or set specific kernel
parameters through Ansible, or any other IaC software. Note, though, that the
ansible-pull command is installed by default, to enable the use of
Ansible outside of sshd.
Note
To ensure that the image is consistent, users
should contact Mirantis support and request the inclusion of specific
packages in the ISO image.
rpm-ostree provides a version-controlled approach to
system updates and rollbacks. As with Git, rpm-ostree treats the
operating system as an immutable tree of files, which enables you to atomically
update or roll back the entire system.
The basic core capabilities of rpm-ostree bring such concepts as
version control and atomic updating to the management and maintenance of the
operating system.
The rpm-ostree status command provides the current system state
as well as deployed operating system images. It displays the currently active
deployment, its commit ID, and any pending upgrades.
Note
As with git status, you can use the rpm-ostree status command to
better understand the status of your system and to learn of any pending
changes.
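For example:
rpm-ostree status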
You can revert the system to a previous known state with the
rpm-ostree rollback command, specifying a specific deployment. This
command undoes any system changes made following the specified deployment,
effectively rolling back the entire system to a previous commit.
Note
The rpm-ostree rollback operation is similar to using
git checkout in a Git repository to revert to a previous commit.
To revert to the previously booted tree:
Note
For example purposes, the target deployment shown is the original deployment
without the additional tmux package installed.
Rollback the current deployment to the target deployment.
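A minimal sketch follows; rpm-ostree rollback reverts to the previous deployment, and a reboot is typically required for the change to take effect:
rpm-ostree rollback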
You can use the rpm-ostree install command to install additional
packages on top of the base operating system image. The command adds the
packages you specify to the current deployment, thus enabling you to extend
your system functionality.
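For example, to layer the tmux package referenced elsewhere in this section (a reboot into the new deployment is typically required afterward):
rpm-ostree install tmux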
Use the rpm-ostree cleanup command to remove old or unused
deployments for the purpose of freeing up disk space. Following invocation, a
configurable number of recent deployments is retained and the rest are deleted.
Note
The rpm-ostree cleanup operation is similar to using
git gc in a Git repository, in that it serves to optimize disk
usage and keep the system clean.
To clear temporary files and leave deployments unchanged:
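One possible invocation, assuming the -b (base) flag of your rpm-ostree version clears temporary files without touching deployments:
rpm-ostree cleanup -b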
Use the rpm-ostree kargs command to manage kernel arguments (kargs)
for the system. With this command, you can modify the kernel command-line
parameters for the next reboot, as well as customize the kernel parameters for
specific deployments.
To modify kernel arguments:
View the kernel arguments for the currently booted deployment:
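For example:
rpm-ostree kargs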
To delete the custom kernel argument that was appended to the booted
deployment and force the system to automatically reboot after the command
completes:
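A sketch of the deletion, assuming a hypothetical custom argument named mykarg=1 was appended earlier and that your rpm-ostree version supports the --reboot flag:
rpm-ostree kargs --delete=mykarg=1 --reboot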
Use the rpm-ostree rebase command to update the base
operating system image of the system.
Note
The rpm-ostree rebase operation is similar to using
git rebase in a Git repository, in that it pulls in a newer
version of the base operating system image and updates the system
accordingly.
To update the base operating system image:
Note
For example purposes, mkex:mkex/8/x86_64/mcr/20.10-devel is the different
base operating system image maintained in the repository.
Rebase the current deployment on a different base operating system image
maintained in this repository:
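For example, using the image referenced in the note above:
rpm-ostree rebase mkex:mkex/8/x86_64/mcr/20.10-devel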
Use the rpm-ostree ex command to execute experimental commands in
the context of a specific deployment. Invocation of this command executes a
command that uses the files and environment of the specified deployment.
Example experimental command: To view the rpm-ostree history of the system:
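A sketch, assuming the experimental history subcommand is available in your rpm-ostree version:
rpm-ostree ex history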
Use the rpm-ostree deploy command to trigger the deployment of a
specific operating system image on the system.
Note
Similar to checking out a specific commit in Git, the
rpm-ostree deploy command enables you to switch to a different
version of the operating system by specifying the desired deployment through
its commit ID.
To deploy the image with the commit ID:
Note
For example purposes,
0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91 serves
as the commit ID.
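For example:
rpm-ostree deploy 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91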
Use the rpm-ostree uninstall command to remove installed packages
from the system. This command removes the specified packages from the current
deployment, similar to uninstalling packages from a Git repository.
Uninstall the tmux package previously added to the current deployment:
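For example:
rpm-ostree uninstall tmux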
Mirantis recommends that package layering be used only for debugging
specific deployments, and not to manage system state at scale.
With package layering, you can create an overlay deployment with the
rpm-ostree ex command and install packages into that overlay.
Subsequently, this allows you to install additional packages on top of the base
operating system image without modifying the codebase itself, thus permitting
you to test new customizations and experimental changes against the booted
deployment without risk.
Note
For example purposes, the procedure detailed herein will install the
telnet package.
One of the key advantages of rpm-ostree is its ability to perform
rollbacks to previous system states. Thus, whenever you encounter issues or
unexpected behavior, you can invoke the rpm-ostree rollback command
to revert the system to a known, stable deployment.
rpm-ostree rollback
To effectively manage rollbacks, Mirantis recommends that you maintain multiple
deployments, each with a different version or configuration. Using this
approach, you can switch between deployments and perform rollbacks as
necessary.
OSTree is designed as a client-server system, where the server hosts
the repositories containing the operating system images or deployments,
and the clients interact with the server to fetch and manage these
deployments.
The server side of OSTree is responsible for maintaining and serving
the operating system images or deployments. Typically, the OSTree
server is a repository hosting server.
The server-side tools used for maintaining the repositories and serving
the deployments may vary depending on implementation. The list of commonly
used tools include:
OSTree
The core tool that manages the repository and handles the versioning and
branching of the operating system deployments.
Repository hosting software
Can be represented by the software such as ostree-repo or a dedicated
repository management system such as Pulp or Artifactory.
Web server
A web server such as Apache or NGINX, which handles the HTTP(S)
communication, to access the repository server through it. Typically,
the server locates on a centralized infrastructure or network accessible
to the clients over the network. It can be hosted on-premises or
in the cloud, depending on the deployment requirements.
The client side of OSTree is responsible for interacting with the server
to fetch and manage the operating system deployments on individual systems.
System administrators and end users use the client tools and utilities
to perform package upgrades, rollbacks, installations, and various other
operations on the deployments.
The primary client-side tool is typically the rpm-ostree
command-line tool, which provides a set of commands for managing
the deployments.
Alongside OSTree, depending on specific distribution or system requirements,
additional tools or package managers can be used. For example, you can use
DNF to manage traditional packages.
When working with rpm-ostree, it is crucial to be aware of common
issues that can occur and to know how to troubleshoot them effectively.
Common problems during the tool usage include conflicts during upgrades, disk
space limitations, and failed deployments. To troubleshoot these issues,
consider basic techniques that include reviewing logs with
journalctl, monitoring disk space, inspecting deployment health
using rpm-ostree status, examining error messages for specific
operations and so on.
The journalctl command enables you to view the logs specifically
related to the rpm-ostreed service. It provides valuable information
about system events, errors, and warnings related to rpm-ostree.
When examining the logs, search for any error messages, warnings, or
indications of failed operations. These can provide insights into the root
cause of the issue and further troubleshooting steps.
To view and follow system logs for
rpm-ostreed.service:
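For example:
journalctl -u rpm-ostreed.service -f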
When performing operations such as upgrades, rollbacks, and installations with
rpm-ostree, pay attention to any error messages that the system
displays.
Error messages often provide specific details about the issue encountered,
such as package conflicts, missing dependencies, or connectivity problems
with repositories.
To examine error messages, run the desired rpm-ostree command
and carefully read the output. Look for any error indications or specific
error codes mentioned. These can help narrow down the issue and guide
the troubleshooting process.
Example error messages during an upgrade:
Error: Can't upgrade to commit abcdefg: This package conflicts with package-xyz-1.0.0-1.x86_64.
The above error message indicates a package conflict preventing the upgrade.
You can perform further investigation by checking the package versions,
dependencies, and resolving the conflict accordingly.
The rpm-ostree tool provides a cancel command that you can
use to cancel an active transaction. This can come in handy in situations where
you, for example, accidentally start an upgrade that rebases a large deployment
and want to cancel the operation:
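rpm-ostree cancel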
To ensure a smooth and reliable experience with rpm-ostree, always
keep the repositories up to date. This involves regular metadata updates
and repository information refreshing.
To update the metadata for a repository:
rpm-ostree refresh-md
This fetches the latest information about available packages, dependencies,
and updates. Similar to updating remote branches in Git, refreshing
the metadata ensures that you have the latest information from
the repositories.
Mirantis Kubernetes Engine (MKE) subscriptions provide access to prioritized
support for designated contacts from your company, agency, team, or
organization. MKE service levels are based on your subscription level and the
cloud or cluster that you designate in your technical support case. Our support
offerings are described on the
Enterprise-Grade Cloud Native and Kubernetes Support page.
You may inquire about Mirantis support subscriptions by using the
contact us form.
The CloudCare Portal is the primary way
that Mirantis interacts with customers who are experiencing technical
issues. Access to the CloudCare Portal requires prior authorization by your
company, agency, team, or organization, and a brief email verification step.
After Mirantis sets up its backend systems at the start of the support
subscription, a designated administrator at your company, agency, team, or
organization, can designate additional contacts. If you have not already
received and verified an invitation to our CloudCare Portal, contact your local
designated administrator, who can add you to the list of designated contacts.
Most companies, agencies, teams, and organizations have multiple designated
administrators for the CloudCare Portal, and these are often the persons most
closely involved with the software. If you do not know who your
local designated administrator is, or you are having problems accessing the
CloudCare Portal, you may also send an email to Mirantis support at
support@mirantis.com.
Once you have verified your contact details and changed your password, you and
all of your colleagues will have access to all of the cases and resources
purchased. Mirantis recommends that you retain your Welcome to Mirantis
email, because it contains information on how to access the CloudCare Portal,
guidance on submitting new cases, managing your resources, and other related
issues.
We encourage all customers with technical problems to use the
knowledge base, which you can access on the Knowledge tab
of the CloudCare Portal. We also encourage you to review the
MKE product documentation and release notes prior to
filing a technical case, as the problem may already be fixed in a later
release or a workaround solution provided to a problem experienced by other
customers.
One of the features of the CloudCare Portal is the ability to associate
cases with a specific MKE cluster. These are referred to in the Portal as
“Clouds”. Mirantis pre-populates your customer account with one or more
Clouds based on your subscription(s). You may also create and manage
your Clouds to better match how you use your subscription.
Mirantis also recommends and encourages customers to file new cases based on a
specific Cloud in your account. This is because most Clouds also have
associated support entitlements, licenses, contacts, and cluster
configurations. These greatly enhance the ability of Mirantis to support you in
a timely manner.
You can locate the existing Clouds associated with your account by using the
Clouds tab at the top of the portal home page. Navigate to the
appropriate Cloud and click on the Cloud name. Once you have verified that the
Cloud represents the correct MKE cluster and support entitlement, you
can create a new case via the New Case button near the top of the
Cloud page.
One of the key items required for technical support of most MKE cases is the
support bundle. This is a compressed archive in ZIP format of configuration
data and log files from the cluster. There are several ways to gather a support
bundle, each described in the paragraphs below. After you obtain a support
bundle, you can upload the bundle to your new technical support case by
following the instructions in the Mirantis knowledge base,
using the Detail view of your case.
Obtain a full-cluster support bundle using the MKE web UI
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> and click Support Bundle.
It may take several minutes for the download to complete.
Note
The default name for the generated support bundle file is
docker-support-<cluster-id>-YYYYmmdd-hh_mm_ss.zip. Mirantis suggests
that you not alter the file name before submittal to the customer portal.
However, if necessary, you can add a custom string between
docker-support and <cluster-id>, as in:
docker-support-MyProductionCluster-<cluster-id>-YYYYmmdd-hh_mm_ss.zip.
Submit the support bundle to Mirantis Customer Support by clicking
Share support bundle on the success prompt that displays
when the support bundle finishes downloading.
Fill in the Jira feedback dialog, and click Submit.
Obtain a full-cluster support bundle using the MKE API
Create an environment variable with the user security token:
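A sketch of one way to do this, assuming the MKE /auth/login endpoint and the jq utility are available; replace the address and credentials with your own:
export AUTHTOKEN=$(curl -sk -d '{"username":"<username>","password":"<password>"}' https://<mke-host>/auth/login | jq -r .auth_token)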
Add the --submit option to the support command to submit
the support bundle to Mirantis Customer Support. The support
bundle will be sent, along with the following information:
Cluster ID
MKE version
MCR version
OS/architecture
Cluster size
For more information on the support command, refer to
support.
Obtain a single-node support bundle using the CLI
If SELinux is enabled, include the following additional flag:
--security-opt label=disable.
Note
The CLI-derived support bundle only contains logs for the node on which
you are running the command. If your MKE cluster is highly available,
collect support bundles from all manager nodes.
Add the --submit option to the support command to submit
the support bundle to Mirantis Customer Support. The support
bundle will be sent, along with the following information:
Cluster ID
MKE version
MCR version
OS/architecture
Cluster size
For more information on the support command, refer to
support.
The Mirantis Kubernetes Engine (MKE) API is a REST API, available using HTTPS,
that enables programmatic access to Swarm and Kubernetes resources managed by
MKE. MKE exposes the full Mirantis Container Runtime API, so you can extend
your existing code with MKE features. The API is secured with role-based access
control (RBAC), and thus only authorized users can make changes and deploy
applications to your cluster.
The MKE API is accessible through the same IP addresses and domain names that
you use to access the MKE web UI. And as the API is the same one used by the
MKE web UI, you can use it to programmatically do everything you can do from
the MKE web UI.
The system manages Swarm resources through collections and Kubernetes resources
through namespaces. For detailed information on these resource sets, refer to
the RBAC core elements table in the Role-based access control documentation.
endpoint
Description
/roles
Allows you to enumerate and create custom permissions for accessing
collections.
/accounts
Enables the management of users, teams, and organizations.
Additional information is available for each command by using the --help
flag.
Note
To obtain the appropriate image, it may be necessary to use
docker/ucp:3.x.y rather than mirantis/ucp:3.x.y, as older versions
are associated with the docker organization. Review the images in the
mirantis
and docker
organizations on Docker Hub to determine the correct organization.
The backup command creates a backup of an MKE manager node.
Specifically, the command creates a TAR file with the contents of the volumes
used by the given MKE manager node and then prints it. You can then use the
restore command to restore the data from an existing backup.
To create backups of a multi-node cluster, you only need to back up a
single manager node. The restore operation will reconstitute a new MKE
installation from the backup of any previous manager node.
Note
The backup contains private keys and other sensitive information. Use
the --passphrase flag to encrypt the backup with PGP-compatible
encryption or --no-passphrase to opt out of encrypting the backup.
Mirantis does not recommend the latter option.
Specifies the name of the file wherein the backup contents are
written. This option requires that you bind-mount the file path to the
container that is performing the backup. The file path must be relative
to the container file tree. For example:
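A hedged sketch of such a command, assuming the backup is written to /tmp on the host, that the mirantis/ucp:3.7.16 image referenced elsewhere in this document is in use, and that the passphrase shown is a placeholder:
docker container run --rm -i --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/backup \
  mirantis/ucp:3.7.16 backup --file /backup/mke-backup.tar --passphrase "secret"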
Installing MKE on a manager node with SELinux enabled at the daemon and the
operating system levels requires that you include
--security-opt label=disable with your backup command. This flag
disables SELinux policies on the MKE container. The MKE container mounts and
configures the Docker socket as part of the MKE container. Therefore, the MKE
backup process fails with the following error if you neglect to include this
flag:
You must have access to a recent backup of your MKE instance to run the
ca command.
With the ca command, you can make changes to the certificate material of
the MKE root CA servers. Specifically, you
can set the server material to rotate automatically or you can replace it with
your own certificate and private key.
You can use the ca command with a provided Root CA certificate and
key by bind-mounting these credentials to the CLI container at /ca/cert.pem
and /ca/key.pem, respectively:
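A hedged sketch, assuming the certificate and key reside in the current directory and the mirantis/ucp:3.7.16 image referenced elsewhere in this document is in use:
docker container run --rm -i --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$(pwd)/cert.pem":/ca/cert.pem \
  -v "$(pwd)/key.pem":/ca/key.pem \
  mirantis/ucp:3.7.16 ca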
The MKE Cluster Root CA certificate must have swarm-ca as its common
name.
The MKE Client Root CA certificate must have UCPClientRootCA as its
common name.
The certificate must be a self-signed root certificate, and intermediate
certificates are not allowed.
The certificate and key must be in PEM format without a passphrase.
The MKE etcd Root CA certificate must have MKEetcdRootCA as its
common name.
Finally, to apply the certificates, you must reboot the manager nodes
one at a time, making sure to reboot the leader node last.
Note
If there are unhealthy nodes in the cluster, CA rotation cannot
complete. If the rotation is hanging, you can run the following command to
determine whether any nodes are down or are otherwise unable to rotate TLS
certificates:
The dump-certs command prints the public certificates used by the
MKE web server. Specifically, the command produces public certificates for the
MKE web server running on the specified node. By default, it prints the
contents of the ca.pem and cert.pem files.
Integrating MKE and MSR requires that you use this command with the
--cluster and --ca flags to configure MSR.
Produces JSON-formatted output for easier parsing.
--ca
Prints only the contents of the ca.pem file.
--cluster
Prints the internal MKE swarm root CA and certificate instead of the
public server certificate.
--etcd
Prints the etcd server certificate. By default, the option prints the
contents of both the ca.pem and cert.pem files. You can, though,
print only the contents of ca.pem by using the option in
conjunction with the --ca option.
The id command prints the ID of the MKE components that run on your
MKE cluster. This ID matches the ID in the output of the docker info
command, when issued while using a client bundle.
The install command installs MKE on the specified node.
Specifically, the command initializes a new swarm, promotes the specified node
into a manager node, and installs MKE.
The following customizations are possible when installing MKE:
Customize the MKE web server certificates:
Create a volume named ucp-controller-server-certs.
Copy the ca.pem, cert.pem, and key.pem files to the root
directory of the volume.
Run the install command with the --external-server-cert
flag.
Customize the license used by MKE using one of the following options:
Bind mount the file at /config/docker_subscription.lic in the tool. For
example:
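A hedged sketch of an install invocation with the license bind-mounted; the paths, image tag, and flags shown are illustrative only:
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /path/to/docker_subscription.lic:/config/docker_subscription.lic \
  mirantis/ucp:3.7.16 install --host-address <node-ip> --interactive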
Produces JSON-formatted output for easier parsing.
--interactive,-i
Runs in interactive mode, prompting for configuration values.
--admin-password<value>
Sets the MKE administrator password, $UCP_ADMIN_PASSWORD.
--admin-username<value>
Sets the MKE administrator user name, $UCP_ADMIN_USER.
--azure-ip-count<value>
Configures the number of IP addresses to be provisioned for each Azure
Virtual Machine.
Default: 128.
--binpack
Sets the Docker Swarm scheduler to binpack mode, for backward
compatibility.
--cloud-provider<value>
Sets the cluster cloud provider.
Valid values: azure, gce.
--cni-installer-url<value>
Sets a URL that points to a Kubernetes YAML file that is used as an
installer for the cluster CNI plugin. If specified, the default CNI
plugin is not installed. If the URL uses the HTTPS scheme, no
certificate verification is performed.
--controller-port<value>
Sets the port for the web UI and the API
Default: 443.
--data-path-addr<value>
Sets the address or interface to use for data path traffic,
$UCP_DATA_PATH_ADDR.
Format: IP address or network interface name
--disable-tracking
Disables anonymous tracking and analytics.
--disable-usage
Disables anonymous usage reporting.
--dns-opt<value>
Sets the DNS options for the MKE containers, $DNS_OPT.
--dns-search<value>
Sets custom DNS search domains for the MKE containers, $DNS_SEARCH.
--dns<value>
Sets custom DNS servers for the MKE containers, $DNS.
--enable-profiling
Enables performance profiling.
--existing-config
Sets to use the latest existing MKE configuration during the
installation. The installation will fail if a configuration is not
found.
--external-server-cert
Customizes the certificates used by the MKE web server.
--external-service-lb<value>
Sets the IP address of the load balancer where you can expect to reach
published services.
--force-insecure-tcp
Forces the installation to continue despite unauthenticated Mirantis
Container Runtime ports.
--force-minimums
Forces the installation to occur even if the system does not meet the
minimum requirements.
--host-address<value>
Sets the network address that advertises to other nodes,
$UCP_HOST_ADDRESS.
Format: IP address or network interface name
--iscsiadm-path<value>
Sets the path to the host iscsiadm binary. This option is
applicable only when --storage-iscsi is specified.
--kube-apiserver-port<value>
Sets the port for the Kubernetes API server.
Default: 6443.
--kv-snapshot-count<value>
Sets the number of changes between key-value store snapshots,
$KV_SNAPSHOT_COUNT.
Default: 20000.
--kv-timeout<value>
Sets the timeout in milliseconds for the key-value store,
$KV_TIMEOUT.
Default: 5000.
--license<value>
Adds a license, $UCP_LICENSE.
Format: “$(cat license.lic)”
--nodeport-range<value>
Sets the allowed port range for Kubernetes services of NodePort type.
Default: 32768-35535.
--pod-cidr<values>
Sets Kubernetes cluster IP pool for the Pods to be allocated from.
Default: 192.168.0.0/16.
--preserve-certs
Prevents the generation of new certificates if they already exist.
--pull<value>
Pulls MKE images.
Valid values: always, missing, and never
Default: missing.
--random
Sets the Docker Swarm scheduler to random mode, for backward
compatibility.
--registry-password<value>
Sets the password to use when pulling images, $REGISTRY_PASSWORD.
--registry-username<value>
Sets the user name to use when pulling images, $REGISTRY_USERNAME.
--san<value>
Adds subject alternative names to certificates, $UCP_HOSTNAMES.
For example: --san www2.acme.com
--service-cluster-ip-range<value>
Sets the Kubernetes cluster IP Range for services.
Default: 10.96.0.0/16.
--skip-cloud-provider-check
Disables checks which rely on detecting which cloud provider, if any,
the cluster is currently running on.
--storage-expt-enabled
Enables experimental features in Kubernetes storage.
--storage-iscsi
Enables ISCSI-based PersistentVolumes in Kubernetes.
--swarm-experimental
Enables Docker Swarm experimental features, for backward
compatibility.
--swarm-grpc-port<value>
Sets the port for communication between nodes.
Default: 2377.
--swarm-port<value>
Sets the port for the Docker Swarm manager, for backward compatibility.
Default: 2376.
--unlock-key<value>
Sets the unlock key for this swarm-mode cluster, if one exists,
$UNLOCK_KEY.
--unmanaged-cni
Indicates that the CNI provider is not managed by MKE. By default, MKE
installs and manages Calico as the CNI provider.
--kubelet-data-root
Configures the kubelet data root directory on Linux when performing new
MKE installations.
--containerd-root
Configures the containerd root directory on Linux when performing new
MKE installations. Any non-root directory containerd customizations
must be made along with the root directory customizations prior to
installation and with the --containerd-root flag omitted.
--ingress-controller
Configures the HTTP ingress controller for the management of traffic
that originates outside the cluster.
--calico-ebpf-enabled
Sets whether Calico eBPF mode is enabled.
When specifying --calico-ebpf-enabled, do not use
--kube-default-drop-masq-bits or --kube-proxy-mode.
--kube-default-drop-masq-bits
Sets whether MKE uses Kubernetes default values for iptables drop and
masquerade bits.
--kube-proxy-mode
Sets the operational mode for kube-proxy.
Valid values: iptables, ipvs, disabled
Default: iptables.
--kube-protect-kernel-defaults
Protects kernel parameters from being overridden by kubelet.
Default: false.
Important
When enabled, kubelet can fail to start if the following kernel
parameters are not properly set on the nodes before you install MKE
or before adding a new node to an existing cluster:
--swarm-only
Configures MKE in Swarm-only mode, which supports only Docker Swarm
orchestration.
--windows-containerd-root<value>
Sets the root directory for containerd on Windows.
--secure-overlay
Enables IPSec network encryption using SecureOverlay in Kubernetes.
--calico-ip-auto-method<value>
Allows the user to set the method for autodetecting the IPv4 address for
the host. When specified, IP autodetection method is set for
calico-node.
--calico-vxlan
Sets the calico CNI dataplane to VXLAN.
Default: VXLAN.
--vxlan-vni<value>
Sets the vxlan-vni ID. Note that the dataplane must be set to VXLAN.
Valid values: 10000 - 20000.
Default: 10000.
--cni-mtu<value>
Sets the MTU for CNI interfaces. Calculate MTU size based on which
overlay is in use. For user-specific configuration, subtract 20 bytes
for IPIP or 50 bytes for VXLAN.
Default: 1480 for IPIP, 1450 for VXLAN.
--windows-kubelet-data-root<value>
Sets the data root directory for kubelet on Windows.
--default-node-orchestrator<value>
Sets the default node orchestrator for the cluster.
Valid values: swarm, kubernetes.
Default: swarm.
--iscsidb-path<value>
Sets the absolute path to host iscsi DB. Verify that --storage-iscsi
is specified. Note that Symlinks are not allowed.
--kube-proxy-disabled
Disables kube-proxy. This option is activated by
--calico-ebpf-enabled, and it cannot be used in combination with
--kube-proxy-mode.
--cluster-label<value>
Sets the cluster label that is employed for usage reporting.
--multus-cni
Enables Multus CNI plugin in the MKE cluster. This meta plugin provides
the ability to attach multiple network interfaces to Pods using other
CNI plugins.
Installing MKE on a manager node with SELinux enabled at the daemon and the
operating system levels requires that you include
--security-opt label=disable with your install command. This flag
disables SELinux policies on the installation container. The MKE
installation container mounts and configures the Docker socket as part
of the MKE installation container. Therefore, omitting this flag will result in
the failure of your MKE installation with the following error:
The restore command restores an MKE cluster from a backup.
Specifically, the command installs a new MKE cluster that is populated with the
state of a previous MKE manager node using a TAR file originally generated
using the backup command. All of the MKE settings, users, teams, and
permissions are restored from the backup file.
The restore operation does not alter or recover the following cluster
resources:
Containers
Networks
Volumes
Services
You can use the restore command on any manager node in an
existing cluster. If the current node does not belong in a
cluster, one is initialized using the value of the --host-address
flag. When restoring on an existing Swarm-mode cluster, there must be no
previous MKE components running on any node of the cluster. This cleanup
operation is performed using the uninstall-ucp command.
If the restoration is performed on a different cluster than the one from which
the backup file was created, the cluster root CA of the old MKE
installation is not restored. This restoration invalidates any
previously issued admin client bundles and, thus, all administrators
are required to download new client bundles after the operation
is complete. Any existing non-admin user client bundles remain fully
operational.
By default, the backup TAR file is read from stdin. You can also
bind-mount the backup file under /config/backup.tar and run the
restore command with the --interactive flag.
Note
You must run uninstall-ucp before attempting the restore operation on
an existing MKE cluster.
If your Swarm-mode cluster has lost quorum and the original set
of managers are not recoverable, you can attempt to recover a
single-manager cluster using the
docker swarm init --force-new-cluster command.
You can restore MKE from a backup that was taken on a different manager
node or a different cluster altogether.
Use the support command to create a support bundle for the
specified MKE nodes. This command creates a support bundle file for the
specified nodes, including the MKE cluster ID, and prints it to stdout.
The uninstall-ucp command uninstalls MKE from the specified swarm,
preserving the swarm so that your applications can continue running.
After MKE is uninstalled, you can use the docker swarm leave
and docker node rm commands to remove nodes from the swarm. You
cannot join nodes to the swarm until MKE is installed again.
Produces JSON-formatted output for easier parsing.
--interactive, -i
Runs in interactive mode and prompts for configuration values.
--id <value>
Sets the ID of the MKE instance to uninstall.
--no-purge-secret
Configures the command to leave the MKE-related Swarm secrets in place.
--pull <value>
Pulls MKE images.
Valid values: always, missing, and never.
--purge-config
Removes the MKE configuration file when uninstalling MKE.
--registry-password <value>
Sets the password to use when pulling images.
--registry-username <value>
Sets the user name to use when pulling images.
--unmanaged-cni
Specifies that MKE was installed in unmanaged CNI mode. When this
parameter is supplied to the uninstaller, no attempt is made to clean up
/etc/cni, thus causing any user-supplied CNI configuration files to
persist in their original state.
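As an illustrative sketch, with the image tag as a placeholder, an
interactive uninstall that also removes the MKE configuration file might look
as follows:
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 uninstall-ucp --interactive --purge-config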
The Center for Internet Security (CIS) provides the CIS Kubernetes Benchmarks
for each Kubernetes release. These benchmarks comprise a comprehensive set of
recommendations targeted at enhancing Kubernetes security configuration.
Designed to align with industry regulations, CIS Benchmarks ensure standards
that meet diverse compliance requirements, and their universal applicability
across Kubernetes distributions fortifies such environments while fostering a
robust security posture.
Note
The CIS Benchmark results detailed herein are verified against MKE 3.7.2.
Mirantis has based its handling of Kubernetes benchmarks on CIS Kubernetes
Benchmark v1.7.0.
Section 1 comprises security recommendations for the direct configuration
of Kubernetes control plane processes. It is broken out into four subsections:
1.1 Control Plane Node Configuration Files
Recommendation designation
Recommendation
Level
Result
1.1.1
Ensure that the API server pod specification file permissions are set
to 600 or more restrictive.
Level 1 - Master Node
Pass
1.1.2
Ensure that the API server pod specification file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.3
Ensure that the controller manager pod specification file permissions
are set to 600 or more restrictive.
Level 1 - Master Node
Pass
1.1.4
Ensure that the controller manager pod specification file ownership is
set to root:root.
Level 1 - Master Node
Pass
1.1.5
Ensure that the scheduler pod specification file permissions are set
to 600 or more restrictive.
Level 1 - Master Node
Pass
1.1.6
Ensure that the scheduler pod specification file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.7
Ensure that the etcd pod specification file permissions are set to
600 or more restrictive.
Level 1 - Master Node
Pass
1.1.8
Ensure that the etcd pod specification file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.9
Ensure that the Container Network Interface file permissions are set
to 600 or more restrictive.
Level 1 - Master Node
Pass
1.1.10
Ensure that the Container Network Interface file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.11
Ensure that the etcd data directory permissions are set to 700 or
more restrictive.
Level 1 - Master Node
Pass
1.1.12
Ensure that the etcd data directory ownership is set to etcd:etcd.
Level 1 - Master Node
Fail
MKE runs etcd in a container, and thus it does not create an etcd
user on the host. Access to the etcd data directory is instead
controlled through a docker volume.
1.1.13
Ensure that the admin.conf file permissions are set to 600 or
more restrictive.
Level 1 - Master Node
Pass
1.1.14
Ensure that the admin.conf file ownership is set to root:root.
Level 1 - Master Node
Pass
1.1.15
Ensure that the scheduler.conf file permissions are set to 600
or more restrictive.
Level 1 - Master Node
Pass
1.1.16
Ensure that the scheduler.conf file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.17
Ensure that the controller-manager.conf file permissions are set
to 600 or more restrictive.
Level 1 - Master Node
Pass
1.1.18
Ensure that the controller-manager.conf file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.19
Ensure that the Kubernetes PKI directory and file ownership is set to
root:root.
Level 1 - Master Node
Pass
1.1.20
Ensure that the Kubernetes PKI certificate file permissions are set to
600 or more restrictive.
Level 1 - Master Node
Pass
1.1.21
Ensure that the Kubernetes PKI key file permissions are set to
600.
Level 1 - Master Node
Pass
1.2 API Server
Recommendation designation
Recommendation
Level
Result
1.2.1
Ensure that the --anonymous-auth argument is set to false.
Level 1 - Master Node
Pass
1.2.2
Ensure that the --token-auth-file parameter is not set.
Level 1 - Master Node
Pass
1.2.3
Ensure that the --DenyServiceExternalIPs argument is set.
Level 1 - Master Node
Pass
1.2.4
Ensure that the --kubelet-client-certificate and
--kubelet-client-key arguments are set as appropriate.
Level 1 - Master Node
Pass
1.2.5
Ensure that the --kubelet-certificate-authority argument is set
as appropriate.
Level 1 - Master Node
Pass
1.2.6
Ensure that the --authorization-mode argument is not set to
AlwaysAllow.
Level 1 - Master Node
Pass
1.2.7
Ensure that the --authorization-mode argument includes Node.
Level 1 - Master Node
Pass
1.2.8
Ensure that the --authorization-mode argument includes RBAC.
Level 1 - Master Node
Pass
1.2.9
Ensure that the admission control plugin EventRateLimit is set.
Level 1 - Master Node
Fail
Optionally, MKE can configure the EventRateLimit admission
controller plugin.
1.2.10
Ensure that the admission control plugin AlwaysAdmit is not set.
Level 1 - Master Node
Pass
1.2.11
Ensure that the admission control plugin AlwaysPullImages is set.
Level 1 - Master Node
Fail
Optionally, MKE can configure the AlwaysPullImages admission
controller plugin.
1.2.12
Ensure that the admission control plugin SecurityContextDeny is
set if PodSecurityPolicy is not used.
Level 1 - Master Node
Pass
1.2.13
Ensure that the admission control plugin ServiceAccount is set.
Level 1 - Master Node
Pass
1.2.14
Ensure that the admission control plugin NamespaceLifecycle is
set.
Level 1 - Master Node
Pass
1.2.15
Ensure that the admission control plugin NodeRestriction is set.
Level 1 - Master Node
Pass
1.2.16
Ensure that the --secure-port option is not set to 0. Note:
This recommendation is obsolete and will be deleted per the consensus
process.
Level 1 - Master Node
Pass
1.2.17
Ensure that the --profiling option is set to false.
Level 1 - Master Node
Pass
1.2.18
Ensure that the --audit-log-path option is set.
Level 1 - Master Node
Pass
1.2.19
Ensure that the --audit-log-maxage argument is set to 30 or
as appropriate.
Level 1 - Master Node
Pass
1.2.20
Ensure that the --audit-log-maxbackup argument is set to 10
or as appropriate.
Level 1 - Master Node
Pass
1.2.21
Ensure that the --audit-log-maxsize argument is set to 100 or
as appropriate.
Level 1 - Master Node
Pass
1.2.22
Ensure that the --request-timeout argument is set as appropriate.
Level 1 - Master Node
Fail
Optionally, MKE can configure the Kubernetes API server
--request-timeout argument value.
1.2.23
Ensure that the --service-account-lookup argument is set to
true.
Level 1 - Master Node
Pass
1.2.24
Ensure that the --service-account-key-file argument is set as
appropriate.
Level 1 - Master Node
Pass
1.2.25
Ensure that the --etcd-certfile and --etcd-keyfile arguments
are set as appropriate.
Level 1 - Master Node
Pass
1.2.26
Ensure that the --tls-cert-file and --tls-private-key-file
arguments are set as appropriate.
Level 1 - Master Node
Pass
1.2.27
Ensure that the --client-ca-file argument is set as appropriate.
Level 1 - Master Node
Pass
1.2.28
Ensure that the --etcd-cafile argument is set as appropriate.
Level 1 - Master Node
Pass
1.2.29
Ensure that the --encryption-provider-config argument is set as
appropriate.
Level 1 - Master Node
Pass
1.2.30
Ensure that encryption providers are appropriately configured.
Level 1 - Master Node
Pass
1.2.31
Ensure that the API Server only makes use of Strong Cryptographic
Ciphers.
Level 1 - Master Node
Fail
Optionally, MKE can be configured to support a list of compliant TLS
ciphers.
1.3 Controller Manager
Recommendation designation
Recommendation
Level
Result
1.3.1
Ensure that the --terminated-pod-gc-threshold argument is set as
appropriate.
Level 1 - Master Node
Fail
Optionally, MKE can be configured to use a compliant
terminated-pod-gc-threshold value.
1.3.2
Ensure that the --profiling argument is set to false.
Level 1 - Master Node
Pass
1.3.3
Ensure that the --use-service-account-credentials argument is set
to true.
Level 1 - Master Node
Pass
1.3.4
Ensure that the --service-account-private-key-file argument is
set as appropriate.
Level 1 - Master Node
Pass
1.3.5
Ensure that the --root-ca-file argument is set as appropriate.
Level 1 - Master Node
Pass
1.3.6
Ensure that the RotateKubeletServerCertificate argument is set to
true.
Level 1 - Master Node
Pass
1.3.7
Ensure that the --bind-address argument is set to 127.0.0.1.
Level 1 - Master Node
Pass
1.4 Scheduler
Recommendation designation
Recommendation
Level
Result
1.4.1
Ensure that the --profiling argument is set to false.
Level 1 - Master Node
Pass
1.4.2
Ensure that the --bind-address argument is set to 127.0.0.1.
Section 4 details security recommendations for the components that run on
Kubernetes worker nodes.
Note
Note that the components for Kubernetes worker nodes may also run on
Kubernetes master nodes. Thus, the recommendations in Section 4 should be
applied to master nodes as well as worker nodes where the master nodes make
use of these components.
Section 4 is broken out into two subsections:
4.1 Worker Node Configuration Files
Recommendation designation
Recommendation
Level
Result
4.1.1
Ensure that the kubelet service file permissions are set to 600 or
more restrictive.
Level 1 - Worker Node
Pass
4.1.2
Ensure that the kubelet service file ownership is set to
root:root.
Level 1 - Worker Node
Pass
4.1.3
If proxy kubeconfig file exists, ensure permissions are set to
600 or more restrictive.
Level 1 - Worker Node
Pass
4.1.4
If proxy kubeconfig file exists, ensure ownership is set to
root:root.
Level 1 - Worker Node
Pass
4.1.5
Ensure that the --kubeconfig kubelet.conf file permissions are
set to 600 or more restrictive.
Level 1 - Worker Node
Pass
4.1.6
Ensure that the --kubeconfig kubelet.conf file ownership is set
to root:root.
Level 1 - Worker Node
Pass
4.1.7
Ensure that the certificate authorities file permissions are set to
600 or more restrictive.
Level 1 - Worker Node
Fail
MKE sets the CA cert file permission to 644. This fulfills the
control requirement of restricting write access to administrators,
thus preventing non-root containers from accessing the file. Further
restrictions to 600 are unnecessary and can potentially
complicate the configuration.
4.1.8
Ensure that the client certificate authorities file ownership is set
to root:root.
Level 1 - Worker Node
Pass
4.1.9
If the kubelet config.yaml configuration file is being used
validate permissions set to 600 or more restrictive.
Level 1 - Worker Node
Pass
4.1.10
If the kubelet config.yaml configuration file is being used
validate file ownership is set to root:root.
Level 1 - Worker Node
Pass
4.2 Kubelet
Recommendation designation
Recommendation
Level
Result
4.2.1
Ensure that the --anonymous-auth argument is set to false.
Level 1 - Worker Node
Pass
4.2.2
Ensure that the --authorization-mode argument is not set to
AlwaysAllow.
Level 1 - Worker Node
Pass
4.2.3
Ensure that the --client-ca-file argument is set as appropriate.
Level 1 - Worker Node
Pass
4.2.4
Verify that the --read-only-port argument is set to 0.
Level 1 - Worker Node
Pass
4.2.5
Ensure that the --streaming-connection-idle-timeout argument is
not set to 0.
Level 1 - Worker Node
Pass
4.2.6
Ensure that the --make-iptables-util-chains argument is set to
true.
Level 1 - Worker Node
Pass
4.2.7
Ensure that the --hostname-override argument is not set.
Level 1 - Worker Node
Pass
4.2.8
Ensure that the eventRecordQPS argument is set to a level which
ensures appropriate event capture.
Level 2 - Worker Node
Pass
4.2.9
Ensure that the --tls-cert-file and --tls-private-key-file
arguments are set as appropriate.
Level 1 - Worker Node
Pass
4.2.10
Ensure that the --rotate-certificates argument is not set to
false.
Level 1 - Worker Node
Fail
Not applicable, as MKE has a certificate authority that issues TLS
certificates for kubelet.
4.2.11
Verify that the RotateKubeletServerCertificate argument is set to
true.
Level 1 - Worker Node
Fail
Not applicable, as MKE has a certificate authority that issues TLS
certificates for kubelet.
4.2.12
Ensure that the Kubelet only makes use of Strong Cryptographic
Ciphers.
Level 1 - Worker Node
Fail
Optionally, MKE can be configured to support a list of compliant TLS ciphers.
Section 5 details recommendations for various Kubernetes policies which are
important to the security of the environment. Section 5 is broken out into six
subsections, with 5.6 not in use:
5.1 RBAC and Service Accounts
Recommendation designation
Recommendation
Level
Result
5.1.1
Ensure that the cluster-admin role is only used where required.
Level 1 - Master Node
Pass
5.1.2
Minimize access to secrets.
Level 1 - Master Node
Pass
5.1.3
Minimize wildcard use in Roles and ClusterRoles.
Level 1 - Worker Node
Pass
5.1.4
Minimize access to create Pods.
Level 1 - Master Node
Pass
5.1.5
Ensure that default service accounts are not actively used.
Level 1 - Master Node
Pass
MKE installations are compliant starting with MKE
3.7.1. For customers upgrading from previous MKE versions, Mirantis
offers a script that determines which service accounts are in violation
and that provides an option for patching such accounts.
5.1.6
Ensure that Service Account Tokens are only mounted where necessary.
Level 1 - Master Node
Fail
MKE system service accounts set automount to false at the
service account level and override the automount flag on the
system Pods that require it.
To have core MKE functionality, the following Pods must mount their
respective service account tokens:
calico-kube-controllers
calico-node
coredns
ucp-metrics
ucp-node-feature-discovery
5.1.7
Avoid use of system:masters group.
Level 1 - Master Node
Pass
5.1.8
Limit use of the Bind, Impersonate and Escalate permissions in the
Kubernetes cluster.
Level 1 - Master Node
Pass
5.1.9
Minimize access to create persistent volumes.
Level 1 - Master Node
Pass
5.1.10
Minimize access to the proxy sub-resource of nodes.
Level 1 - Master Node
Pass
5.1.11
Minimize access to the approval sub-resource of
certificatesigningrequests objects.
Level 1 - Master Node
Pass
5.1.12
Minimize access to webhook configuration objects.
Level 1 - Master Node
Pass
5.1.13
Minimize access to the service account token creation.
Level 1 - Master Node
Pass
5.2 Pod Security Standards
Recommendation designation
Recommendation
Level
Result
5.2.1
Ensure that the cluster has at least one active policy control
mechanism in place.
Level 1 - Master Node
Pass
5.2.2
Minimize the admission of privileged containers.
Level 1 - Master Node
Pass
5.2.3
Minimize the admission of containers wishing to share the host
process ID namespace.
Level 1 - Master Node
Pass
5.2.4
Minimize the admission of containers wishing to share the host IPC
namespace.
Level 1 - Master Node
Pass
5.2.5
Minimize the admission of containers wishing to share the host
network namespace.
Level 1 - Master Node
Pass
5.2.6
Minimize the admission of containers with
allowPrivilegeEscalation.
Level 1 - Master Node
Pass
5.2.7
Minimize the admission of root containers.
Level 2 - Master Node
Pass
5.2.8
Minimize the admission of containers with the NET_RAW capability.
Level 1 - Master Node
Pass
MKE control plane containers no longer use NET_RAW; however, policies
must be added to restrict the NET_RAW capability for user workloads.
5.2.9
Minimize the admission of containers with added capabilities.
Level 1 - Master Node
Pass
5.2.10
Minimize the admission of containers with capabilities assigned.
Level 2 - Master Node
Pass
5.2.11
Minimize the admission of Windows HostProcess Containers.
Level 1 - Master Node
Pass
5.2.12
Minimize the admission of HostPath volumes.
Level 1 - Master Node
Pass
5.2.13
Minimize the admission of containers which use HostPorts.
Level 1 - Master Node
Pass
5.3 Pod Network Policies and CNI
Recommendation designation
Recommendation
Level
Result
5.3.1
Ensure that the CNI in use supports Network Policies.
Level 1 - Master Node
Pass
5.3.2
Ensure that all Namespaces have Network Policies defined.
Level 2 - Master Node
Pass
5.4 Secrets Management
Recommendation designation
Recommendation
Level
Result
5.4.1
Prefer using secrets as files over secrets as environment variables.
Level 2 - Master Node
Pass
5.4.2
Consider external secret storage.
Level 2 - Master Node
Pass
5.5 Extensible Admission Control
Recommendation designation
Recommendation
Level
Result
5.5.1
Configure Image Provenance using ImagePolicyWebhook admission
controller.
Level 2 - Master Node
Pass
5.7 General Policies
Recommendation designation
Recommendation
Level
Result
5.7.1
Create administrative boundaries between resources using namespaces.
Level 1 - Master Node
Pass
5.7.2
Ensure that the seccomp profile is set to docker/default in
your Pod definitions.
Level 2 - Master Node
Pass
5.7.3
Apply Security Context to Your Pods and Containers.
Upgrading from one MKE minor version to another minor version can result
in the downgrading of MKE middleware components. For more information,
refer to the middleware versioning tables in the release notes of both the
source and target MKE versions.
In MKE 3.7.0 - 3.7.1, performance issues may occur with both cri-dockerd
and dockerd due to the manner in which cri-dockerd handles container and
ImageFSInfo statistics.
MKE 3.7.16 (current)
The MKE 3.7.16 patch release focuses exclusively on CVE mitigation.
MKE 3.7.15
Patch release for MKE 3.7 introducing the following key features:
Ability to enable cAdvisor through API call
New flag for collecting metrics during support bundle generation
Hypervisor Looker dashboard information added to telemetry
MKE 3.7.14
The MKE 3.7.14 patch release focuses exclusively on CVE mitigation.
MKE 3.7.13
The MKE 3.7.13 patch release focuses exclusively on CVE mitigation.
MKE 3.7.12
Patch release for MKE 3.7 introducing the following key features:
Addition of external cloud provider support for AWS
GracefulNodeShutdown settings now configurable
MKE 3.7.11
The MKE 3.7.11 patch release focuses exclusively on CVE mitigation.
MKE 3.7.10
Patch release for MKE 3.7 introducing the following key features:
Support for NodeLocalDNS 1.23.1
Support for Kubelet node configurations
node-exporter port now configurable
MKE 3.7.9
The MKE 3.7.9 patch release focuses exclusively on CVE mitigation.
MKE 3.7.8
Patch release for MKE 3.7 introducing the following key features:
Addition of Kubernetes log retention configuration parameters
Customizability of audit log policies
Support for scheduling of etcd cluster cleanup and defragmentation
Inclusion of Docker events in MKE support bundle
MKE 3.7.7
The MKE 3.7.7 patch release focuses exclusively on CVE mitigation.
MKE 3.7.6
Patch release for MKE 3.7 introducing the following key features:
Kubernetes for GMSA now supported
Addition of ucp-cadvisor container level metrics component
MKE 3.7.5
Patch release for MKE 3.7 introducing the following key features:
etcd alarms are exposed through Prometheus metrics
Augmented validation for etcd storage quota
Improved handling of larger sized etcd instances
All errors now returned from pre-upgrade checks
Minimum Docker storage requirement now part of pre-upgrade checks
MKE 3.7.4 (discontinued)
MKE 3.7.4 was discontinued shortly after release due to issues encountered
when upgrading to it from previous versions of the product.
MKE 3.7.3
The MKE 3.7.3 patch release focuses exclusively on CVE resolution.
MKE 3.7.2
Patch release for MKE 3.7 introducing the following key features:
Prometheus metrics scraped from Linux workers
Performance improvement to MKE image tagging API
MKE 3.7.1
Initial MKE 3.7.1 release introducing the following key features:
Support bundle metrics additions for new MKE 3.7 features
Added ability to filter organizations by name in MKE web UI
Increased Docker and Kubernetes CIS benchmark compliance
MetalLB supports MKE-specific loglevel
Improved Kubernetes role creation error handling in MKE web UI
Increased SAML proxy feedback detail
Upgrade verifies that cluster nodes have minimum required MCR
kube-proxy now binds only to localhost
Enablement of read-only rootfs for specific containers
Support for cgroup v2
Added MKE web UI capability to add OS constraints to swarm services
Added ability to set support bundle collection windows
Added ability to set line limit of log files in support bundles
Addition of search function to Grants > Swarm in MKE web UI
MKE 3.7.0
Initial MKE 3.7.0 release introducing the following key features:
ZeroOps: certificate management
ZeroOps: upgrade rollback
ZeroOps: metrics
Prometheus memory resources
etcd event cleanup
Ingress startup options: TLS, TCP/UDP, HTTP/HTTPS
Additional NGINX Ingress Controller options
Setting for NGINX Ingress Controller default ports
[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible¶
In air-gapped swarm-only environments, upgrades fail to start if all of the
MKE images are not preloaded on the selected manager node or if the node
cannot automatically pull the required MKE images.
Workaround:
Ensure either that the manager nodes have the complete set of MKE images
preloaded before performing an upgrade or that they can pull the images from a
remote repository.
[MKE-11535]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state¶
Due to the upstream dependency issue in gpu-feature-discovery software,
customers may encounter nvidia-gpu-feature-discovery in
CrashLoopBackOff state with the following errors:
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes¶
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution, however it attempts without success to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
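Alternatively, the same nodeSelector can be applied non-interactively with a
single patch command, shown here as a sketch:
kubectl -n kube-system patch daemonset node-local-dns --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/os":"linux"}}}}}'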
[MKE-11525] Kubelet node profiles fail to supersede global setting¶
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can rollback on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode¶
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
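As an illustrative sketch, a Swarm overlay network whose MTU matches the
1460-byte GCP VPC MTU can be created as follows; the network name is a
placeholder:
docker network create -d overlay \
  --opt com.docker.network.driver.mtu=1460 \
  gcp-aligned-overlay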
The MKE 3.7.16 patch release focuses exclusively on CVE mitigation. To this
end, the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
The various Is methods (IsPrivate, IsLoopback, etc) did not work as
expected for IPv4-mapped IPv6 addresses, returning false for addresses
which would return true in their traditional IPv4 forms.
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
A flaw was found in cri-o, where an arbitrary systemd property can be
injected via a Pod annotation. Any user who can create a pod with an
arbitrary annotation may perform an arbitrary action on the host system.
Issue summary: Calling the OpenSSL API function SSL_select_next_proto
with an empty supported client protocols buffer may cause a crash or
memory contents to be sent to the peer.
libcurl’s ASN1 parser has this utf8asn1str() function used for parsing
an ASN.1 UTF-8 string. It can detect an invalid field and return error.
Unfortunately, when doing so it also invokes free() on a 4 byte
local stack buffer. Most modern malloc implementations detect this error
and immediately abort. Some however accept the input pointer and add
that memory to its list of available chunks. This leads to the
overwriting of nearby stack memory. The content of the overwrite is
decided by the free() implementation; likely to be memory pointers and
a set of flags. The most likely outcome of exploiting this flaw is a
crash, although it cannot be ruled out that more serious results can be
had in special circumstances.
libcurl’s URL API function
[curl_url_get()](https://curl.se/libcurl/c/curl_url_get.html) offers
punycode conversions, to and from IDN. Asking to convert a name that is
exactly 256 bytes, libcurl ends up reading outside of a stack based
buffer when built to use the macidn IDN backend. The conversion
function then fills up the provided buffer exactly - but does not null
terminate the string. This flaw can lead to stack contents accidentally
getting returned as part of the converted string.
When an application tells libcurl it wants to allow HTTP/2 server push,
and the amount of received headers for the push surpasses the maximum
allowed limit (1000), libcurl aborts the server push. When aborting,
libcurl inadvertently does not free all the previously allocated headers
and instead leaks the memory. Further, this error condition fails
silently and is therefore not easily detected by an application.
libcurl did not check the server certificate of TLS connections done to
a host specified as an IP address, when built to use mbedTLS. libcurl
would wrongly avoid using the set hostname function when the specified
hostname was given as an IP address, therefore completely skipping the
certificate check. This affects all uses of TLS protocols (HTTPS, FTPS,
IMAPS, POP3S, SMTPS, etc).
Applications performing certificate name checks (e.g., TLS clients
checking server certificates) may attempt to read an invalid memory
address resulting in abnormal termination of the application process.
Impact summary: Abnormal termination of an application can cause a
denial of service. Applications performing certificate name checks
(e.g., TLS clients checking server certificates) may attempt to read an
invalid memory address when comparing the expected name with an
otherName subject alternative name of an X.509 certificate. This may
result in an exception that terminates the application program. Note
that basic certificate chain validation (signatures, dates, …) is not
affected, the denial of service can occur only when the application also
specifies an expected DNS name, Email address or IP address. TLS servers
rarely solicit client certificates, and even when they do, they
generally don’t perform a name check against a reference identifier
(expected identity), but rather extract the presented identity after
checking the certificate chain. So TLS servers are generally not
affected and the severity of the issue is Moderate. The FIPS modules in
3.3, 3.2, 3.1 and 3.0 are not affected by this issue.
Calling Decoder.Decode on a message which contains deeply nested
structures can cause a panic due to stack exhaustion. This is a
follow-up to CVE-2022-30635.
NGINX Open Source and NGINX Plus have a vulnerability in the
ngx_http_mp4_module, which might allow an attacker to over-read NGINX
worker memory resulting in its termination, using a specially crafted
mp4 file. The issue only affects NGINX if it is built with the
ngx_http_mp4_module and the mp4 directive is used in the configuration
file. Additionally, the attack is possible only if an attacker can
trigger the processing of a specially crafted mp4 file with the
ngx_http_mp4_module. Note: Software versions which have reached End of
Technical Support (EoTS) are not evaluated.
When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC
module and the network infrastructure supports a Maximum Transmission
Unit (MTU) of 4096 or greater without fragmentation, undisclosed QUIC
packets can cause NGINX worker processes to leak previously freed
memory.
When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC
module, undisclosed HTTP/3 encoder instructions can cause NGINX worker
processes to terminate or cause other potential impact.
When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC
module, undisclosed HTTP/3 requests can cause NGINX worker processes to
terminate or cause other potential impact. This attack requires that a
request be specifically timed during the connection draining process,
which the attacker has no visibility and limited influence over.
When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC
module, undisclosed requests can cause NGINX worker processes to
terminate. Note: The HTTP/3 QUIC module is not enabled by default and is
considered experimental. For more information, refer to Support for QUIC
and HTTP/3 https://nginx.org/en/docs/quic.html. Note: Software versions
which have reached End of Technical Support (EoTS) are not evaluated.
[MKE-11928] Ability to enable cAdvisor through API call¶
Addition of API endpoints through which admins can now enable and disable
cAdvisor, via POST requests to api/ucp/config/c-advisor/enable and
api/ucp/config/c-advisor/disable, respectively.
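As an illustrative sketch, assuming an admin session token obtained from the
MKE auth endpoint, cAdvisor can then be enabled with a call such as the
following; the host name, credentials, and use of jq are placeholders:
AUTHTOKEN=$(curl -sk -d '{"username":"<admin-user>","password":"<password>"}' \
  https://<mke-host>/auth/login | jq -r .auth_token)
curl -sk -X POST -H "Authorization: Bearer $AUTHTOKEN" \
  https://<mke-host>/api/ucp/config/c-advisor/enable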
[FIELD-7167] New flag for collecting metrics during support bundle generation¶
Added the metrics flag which, when used with the support CLI
command, triggers the collection of metrics during the generation of a
support bundle. The default value is false.
[FIELD-5197] Hypervisor Looker dashboard information added to telemetry¶
Added Hypervisor Looker dashboard information to the MKE segment telemetry.
[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible¶
In air-gapped swarm-only environments, upgrades fail to start if all of the
MKE images are not preloaded on the selected manager node or if the node
cannot automatically pull the required MKE images.
Workaround:
Ensure either that the manager nodes have the complete set of MKE images
preloaded before performing an upgrade or that they can pull the images from a
remote repository.
[MKE-11535]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state¶
Due to the upstream dependency issue in gpu-feature-discovery software,
customers may encounter nvidia-gpu-feature-discovery in
CrashLoopBackOff state with the following errors:
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes¶
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution, however it attempts without success to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
[MKE-11525] Kubelet node profiles fail to supersede global setting¶
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can rollback on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode¶
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
Calling Decoder.Decode on a message which contains deeply nested
structures can cause a panic due to stack exhaustion. This is a
follow-up to CVE-2022-30635.
[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible¶
In air-gapped swarm-only environments, upgrades fail to start if all of the
MKE images are not preloaded on the selected manager node or if the node
cannot automatically pull the required MKE images.
Workaround:
Ensure either that the manager nodes have the complete set of MKE images
preloaded before performing an upgrade or that they can pull the images from a
remote repository.
[MKE-11535]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state¶
Due to the upstream dependency issue in gpu-feature-discovery software,
customers may encounter nvidia-gpu-feature-discovery in
CrashLoopBackOff state with the following errors:
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes¶
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution, however it attempts without success to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
[MKE-11525] Kubelet node profiles fail to supersede global setting¶
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can rollback on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode¶
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
The MKE 3.7.14 patch release focuses exclusively on CVE mitigation. To this
end, the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
[MKE-11916] Kubernetes 1.27.16
[MKE-11833] etcd 3.5.15
The following table details the specific CVEs addressed, including which images
are affected per CVE.
libcurl did not check the server certificate of TLS connections done to
a host specified as an IP address, when built to use mbedTLS. libcurl
would wrongly avoid using the set hostname function when the specified
hostname was given as an IP address, therefore completely skipping the
certificate check. This affects all uses of TLS protocols (HTTPS, FTPS,
IMAPS, POP3S, SMTPS, etc).
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible¶
In air-gapped swarm-only environments, upgrades fail to start if all of the
MKE images are not preloaded on the selected manager node or if the node
cannot automatically pull the required MKE images.
Workaround:
Ensure either that the manager nodes have the complete set of MKE images
preloaded before performing an upgrade or that they can pull the images from a
remote repository.
[MKE-11535]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state¶
Due to the upstream dependency issue in gpu-feature-discovery software,
customers may encounter nvidia-gpu-feature-discovery in
CrashLoopBackOff state with the following errors:
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes¶
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution, however it attempts without success to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
[MKE-11525] Kubelet node profiles fail to supersede global setting¶
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can rollback on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode¶
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
The MKE 3.7.13 patch release focuses exclusively on CVE mitigation. To this
end, the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
A flaw has been discovered in GnuTLS where an application crash can be
induced when attempting to verify a specially crafted .pem bundle using
the “certtool –verify-chain” command.
Certain DNSSEC aspects of the DNS protocol (in RFC 4033, 4034, 4035,
6840, and related RFCs) allow remote attackers to cause a denial of
service (CPU consumption) via one or more DNSSEC responses, aka the
“KeyTrap” issue. One of the concerns is that, when there is a zone with
many DNSKEY and RRSIG records, the protocol specification implies that
an algorithm must evaluate all combinations of DNSKEY and RRSIG records.
The Closest Encloser Proof aspect of the DNS protocol (in RFC 5155 when
RFC 9276 guidance is skipped) allows remote attackers to cause a denial
of service (CPU consumption for SHA-1 computations) via DNSSEC responses
in a random subdomain attack, aka the “NSEC3” issue. The RFC 5155
specification implies that an algorithm must perform thousands of
iterations of a hash function in certain situations.
Issue summary: Calling the OpenSSL API function SSL_select_next_proto
with an empty supported client protocols buffer may cause a crash or
memory contents to be sent to the peer.
The iconv() function in the GNU C Library versions 2.39 and older may
overflow the output buffer passed to it by up to 4 bytes when converting
strings to the ISO-2022-CN-EXT character set, which may be used to crash
an application or overwrite a neighbouring variable.
nscd: Stack-based buffer overflow in netgroup cache If the Name Service
Cache Daemon’s (nscd) fixed size cache is exhausted by client requests
then a subsequent client request for netgroup data may result in a
stack-based buffer overflow. This flaw was introduced in glibc 2.15
when the cache was added to nscd. This vulnerability is only present in
the nscd binary.
[MKE-11534] Addition of external cloud provider support for AWS¶
MKE 3.7 now supports the use of external cloud providers for AWS. As a result,
MKE 3.6 users who are using --cloud-provider=aws can now migrate to
MKE 3.7, first by upgrading to MKE 3.6.17 and then to 3.7.12.
[FIELD-6967] GracefulNodeShutdown settings now configurable¶
With custom kubelet node profiles, you can now configure the following
kubelet GracefulNodeShutdown flags, which control the node shutdown grace
periods:
--shutdown-grace-period
--shutdown-grace-period-critical-pods
The GracefulNodeShutdown feature gate is enabled by default, with the
shutdown grace parameters set to 0s.
For more information, refer to configure-gracefulnodeshutdown-settings.
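For illustration only, the kubelet flag values that such a profile might
carry could resemble the following; the durations are placeholders, and the
exact profile syntax is described in configure-gracefulnodeshutdown-settings:
--shutdown-grace-period=60s
--shutdown-grace-period-critical-pods=20s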
Issues addressed in the MKE 3.7.12 release include:
[FIELD-7110] Fixed an issue wherein nvidia-gpu-feature-discovery pods
crashed with the
"--mig-strategy=mixed":executablefilenotfoundin$PATH:unknown error
whenever nvidia_device_plugin was enabled. If you observe related issues,
apply the workaround described in
[MKE-11535]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state.
[FIELD-7106] At startup, the ucp-kubelet container now obtains the
profile label directly from etcd, thus ensuring that the kubelet
profile settings are in place at the moment the container is created.
[FIELD-7059] Fixed an issue wherein the ucp-cluster-agent container leaks
connections to Windows nodes.
[FIELD-7053] Fixed an issue wherein, in rare cases, cri-dockerd failed to
pull large images.
[FIELD-7037] Fixed an issue exclusive to MKE 3.7.8 through 3.7.10, wherein in
an air-gapped environment, the addition of a second manager node could cause
cluster deployment to fail.
[FIELD-7032] Fixed an issue wherein SAML metadata was not deleted from
RethinkDB upon disablement of the SAML configuration.
[FIELD-7012] Addition of TLS configuration to node-exporter.
[MKE-11535]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state¶
Due to the upstream dependency issue in gpu-feature-discovery software,
customers may encounter nvidia-gpu-feature-discovery in
CrashLoopBackOff state with the following errors:
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes¶
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution, however it attempts without success to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
[MKE-11525] Kubelet node profiles fail to supersede global setting¶
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can rollback on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode¶
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible¶
In air-gapped swarm-only environments, upgrades fail to start if all of the
MKE images are not preloaded on the selected manager node or if the node
cannot automatically pull the required MKE images.
Workaround:
Ensure either that the manager nodes have the complete set of MKE images
preloaded before performing an upgrade or that they can pull the images from a
remote repository.
Requests is an HTTP library. Prior to 2.32.0, when making requests
through a Requests Session, if the first request is made with
verify=False to disable cert verification, all subsequent requests to
the same host will continue to ignore cert verification regardless of
changes to the value of verify. This behavior will continue for the
lifecycle of the connection in the connection pool. This vulnerability
is fixed in 2.32.0.
Moby is an open-source project created by Docker to enable software
containerization. The classic builder cache system is prone to cache
poisoning if the image is built FROM scratch. Also, changes to some
instructions (most important being HEALTHCHECK and ONBUILD) would not
cause a cache miss. An attacker with the knowledge of the Dockerfile
someone is using could poison their cache by making them pull a
specially crafted image that would be considered as a valid cache
candidate for some build steps. 23.0+ users are only affected if they
explicitly opted out of Buildkit (DOCKER_BUILDKIT=0 environment
variable) or are using the /build API endpoint. All users on versions
older than 23.0 could be impacted. Image build API endpoint (/build) and
ImageBuild function from github.com/docker/docker/client is also
affected as it uses the classic builder by default. Patches are included
in 24.0.9 and 25.0.2 releases.
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version that is configured to MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.11, the upgrade will fail, and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date
Name
Highlights
2024-JULY-8
MKE 3.7.11
Patch release for MKE 3.7 that focuses exclusively on CVE mitigation.
For detail on the specific CVEs addressed, refer to Security information.
[MKE-11535][FIELD-7110]
ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state¶
Due to the upstream dependency issue in gpu-feature-discovery software,
customers may encounter nvidia-gpu-feature-discovery in
CrashLoopBackOff state with the following errors:
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes¶
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution, however it attempts without success to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
[MKE-11525] Kubelet node profiles fail to supersede global setting¶
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[FIELD-7023] Air-gapped upgrades fail if images are inaccessible¶
In air-gapped environments, upgrades fail if the MKE images are not preloaded
on the selected manager node or the node cannot automatically pull the
required MKE images. This results in a rollback to the previous MKE version,
which in this particular scenario can inadvertently remove the etcd/RethinkDB
cluster from the MKE cluster and thus require you to restore MKE from a backup.
Workaround:
Ensure either that the manager nodes have all necessary MKE images preloaded
before performing an upgrade or that they can pull the images from
a remote repository.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources (an example sketch follows this procedure):
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
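The procedure above does not reproduce the exact deletion commands. The following is an illustrative sketch only, assuming the Ingress Controller resources live in the ingress-nginx namespace; verify the namespace and resource names in your own cluster before deleting anything:
# List what the Ingress Controller currently owns (namespace is an assumption; adjust as needed)
kubectl get all -n ingress-nginx
# Remove the leftover resources only after confirming they belong to the Ingress Controller
kubectl delete deployment,daemonset,service,configmap -n ingress-nginx --all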
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
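One way to align the values is to create the Swarm overlay network with an MTU that matches the GCP VPC, using the com.docker.network.driver.mtu driver option. A minimal sketch, with the network name as a placeholder:
# Create an overlay network whose MTU matches the default GCP VPC MTU of 1460
docker network create --driver overlay --opt com.docker.network.driver.mtu=1460 my-overlay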
The MKE 3.7.11 patch release focuses exclusively on CVE mitigation. To this
end, the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
[MKE-11542] Kubernetes 1.27.14
The following table details the specific CVEs addressed, including which images
are affected per CVE.
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
The protojson.Unmarshal function can enter an infinite loop when
unmarshaling certain forms of invalid JSON. This condition can occur
when unmarshaling into a message which contains a google.protobuf.Any
value, or when the UnmarshalOptions.DiscardUnknown option is set.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail, and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.10, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2024-JUNE-17
Name: MKE 3.7.10
Highlights:
Patch release for MKE 3.7 introducing the following enhancements:
With NodeLocalDNS, you can run a local instance of the DNS caching agent on
each node in the cluster. This results in significant performance improvement
versus relying on a centralized CoreDNS instance to resolve external DNS
records, as the local NodeLocalDNS instance is able to cache DNS results and
thus mitigate network latency and conntrack issues. For more information, refer
to Manage NodeLocalDNS.
[MKE-11480] Support for Kubelet node configurations
You can now set kubelet node profiles through Kubernetes node Labels. With
these profiles, which are a set of kubelet flags, you can customize the
settings of your kubelet agents on a node-by-node level, in addition to setting
cluster-wide flags for use by every kubelet agent. For more information, refer
to Custom kubelet profiles.
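Profiles are assigned by labeling nodes. The exact label key and profile names are defined when you create the profiles (see Custom kubelet profiles), so the values in the following sketch are placeholders only:
# Assign a previously defined kubelet profile to a node; the label key and profile name are placeholders
kubectl label node <node-name> <kubelet-profile-label>=<profile-name>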
[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes
The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a
Linux-only solution; however, it attempts, without success, to also deploy to
Windows nodes.
Workaround:
Edit the node-local-dns daemonset:
kubectl edit daemonset node-local-dns -n kube-system
Add the following under spec.template.spec:
nodeSelector:
  kubernetes.io/os: linux
Save the daemonset.
[MKE-11525] Kubelet node profiles fail to supersede global setting
Flags specified in the global custom_kubelet_flags setting and then applied
through kubelet node profiles end up being applied twice.
Workaround:
Do not define any global flags in the global custom_kubelet_flags setting
that will be used in kubelet node profiles.
[FIELD-7023] Air-gapped upgrades fail if images are inaccessible
In air-gapped environments, upgrades fail if the MKE images are not preloaded
on the selected manager node or the node cannot automatically pull the
required MKE images. This results in a rollback to the previous MKE version,
which in this particular scenario can inadvertently remove the etcd/RethinkDB
cluster from the MKE cluster and thus require you to restore MKE from a backup.
Workaround:
Ensure either that the manager nodes have all necessary MKE images preloaded
before performing an upgrade or that they can pull the images from
a remote repository.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
nscd: Stack-based buffer overflow in netgroup cache. If the Name Service
Cache Daemon’s (nscd) fixed size cache is exhausted by client requests
then a subsequent client request for netgroup data may result in a
stack-based buffer overflow. This flaw was introduced in glibc 2.15 when
the cache was added to nscd. This vulnerability is only present in the
nscd binary.
nscd: Null pointer crashes after notfound response. If the Name Service
Cache Daemon’s (nscd) cache fails to add a not-found netgroup response
to the cache, the client request can result in a null pointer
dereference. This flaw was introduced in glibc 2.15 when the cache was
added to nscd. This vulnerability is only present in the nscd binary.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.9, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2024-MAY-28
Name: MKE 3.7.9
Highlights:
Patch release for MKE 3.7 that focuses exclusively on CVE mitigation.
For detail on the specific CVEs addressed, refer to Security information.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports
Upgrades to Swarm-only clusters that were originally installed using the
--swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port
Requirements] step.
Workaround:
Include the --force-port-check upgrade option when upgrading a
Swarm-only cluster.
The MKE 3.7.9 patch release focuses exclusively on CVE mitigation. To this end,
the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
[MKE-11504] Golang 1.21.10
[MKE-11502] cri-dockerd 0.3.14
[MKE-11482] NGINX Ingress Controller 1.10.1
[MKE-11482] Gatekeeper 3.14.2
[MKE-11482] Metallb 0.14.5
DOCKER_EE_CLI 23.0.11~3
The following table details the specific CVEs addressed, including which images
are affected per CVE.
Moby is an open-source project created by Docker to enable software
containerization. The classic builder cache system is prone to cache
poisoning if the image is built FROM scratch. Also, changes to some
instructions (most important being HEALTHCHECK and ONBUILD)
would not cause a cache miss. An attacker with the knowledge of the
Dockerfile someone is using could poison their cache by making them
pull a specially crafted image that would be considered as a valid
cache candidate for some build steps. 23.0+ users are only affected if
they explicitly opted out of Buildkit (DOCKER_BUILDKIT=0
environment variable) or are using the /build API endpoint. All users
on versions older than 23.0 could be impacted. Image build API endpoint
(/build) and ImageBuild function from
github.com/docker/docker/client is also affected, as it uses the classic
builder by default. Patches are included in 24.0.9 and 25.0.2 releases.
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.8, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2024-MAY-6
Name: MKE 3.7.8
Highlights:
Patch release for MKE 3.7 introducing the following enhancements:
Addition of Kubernetes log retention configuration parameters
Customizability of audit log policies
Support for scheduling of etcd cluster cleanup and defragmentation
[MKE-11323] Addition of Kubernetes log retention configuration parameters
Audit log retention values for Kubernetes can now be customized using three new
Kubernetes apiserver parameters in the MKE configuration file:
kube_api_server_audit_log_maxage
kube_api_server_audit_log_maxbackup
kube_api_server_audit_log_maxsize
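As an illustrative sketch, the new parameters might appear in the MKE configuration file as follows. The section placement and the values shown are assumptions; consult the MKE configuration file reference for the authoritative layout:
[cluster_config]
  # Example values only: keep 30 days of audit logs, at most 10 rotated files of 100 MB each
  kube_api_server_audit_log_maxage = 30
  kube_api_server_audit_log_maxbackup = 10
  kube_api_server_audit_log_maxsize = 100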
[MKE-11265] Customizability of audit log policies
Audit log policies can now be customized, a feature that is enabled through the
KubeAPIServerCustomAuditPolicyYaml and
KubeAPIServerEnableCustomAuditPolicy settings.
[MKE-9275] Support for scheduling of etcd cluster cleanup and defragmentation
Customers can now schedule etcd cluster cleanup by way of a cron job. In
addition, defragmentation can be configured to start following a successful
cleanup operation. This new functionality is initiated through the
/api/ucp/config-toml endpoint. For more information, refer to MKE
Configuration File: etcd_cleanup_schedule_config.
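The endpoint named above can be driven with any HTTP client. The following is a hedged sketch using curl, assuming you already hold an admin bearer token in $AUTHTOKEN and that $MKE_HOST points at your MKE controller; the TOML keys under etcd_cleanup_schedule_config are not reproduced here and must be taken from the configuration reference:
# Download the current MKE configuration, edit the etcd_cleanup_schedule_config settings, then upload it again
curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$MKE_HOST/api/ucp/config-toml -o mke-config.toml
# ... edit mke-config.toml ...
curl -sk -X PUT -H "Authorization: Bearer $AUTHTOKEN" --data-binary @mke-config.toml https://$MKE_HOST/api/ucp/config-toml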
[FIELD-6901] Inclusion of Docker events in MKE support bundle
The MKE support bundle now includes Docker event information from the nodes.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports
Upgrades to Swarm-only clusters that were originally installed using the
--swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port
Requirements] step.
Workaround:
Include the --force-port-check upgrade option when upgrading a
Swarm-only cluster.
The MKE 3.7.8 patch release focuses exclusively on CVE mitigation. To this
end, the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
[MKE-11477] cri-dockerd 0.3.13
[MKE-11428] Interlock 3.3.13
[MKE-11482] Blackbox Exporter 0.25.0
[MKE-11482] Alert Manager 0.27.0
The following table details the specific CVEs addressed, including which images
are affected per CVE.
An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of
header data by sending an excessive number of CONTINUATION frames.
Maintaining HPACK state requires parsing and processing all HEADERS and
CONTINUATION frames on a connection. When a request’s headers exceed
MaxHeaderBytes, no memory is allocated to store the excess headers, but
they are still parsed. This permits an attacker to cause an HTTP/2
endpoint to read arbitrary amounts of header data, all associated with a
request which is going to be rejected. These headers can include
Huffman-encoded data which is significantly more expensive for the
receiver to decode than for an attacker to send. The fix sets a limit on
the amount of excess header frames we will process before closing a
connection.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.7, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2024-APR-15
Name: MKE 3.7.7
Highlights:
Patch release for MKE 3.7 that focuses exclusively on CVE mitigation.
For detail on the specific CVEs addressed, refer to Security information.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports
Upgrades to Swarm-only clusters that were originally installed using the
--swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port
Requirements] step.
Workaround:
Include the --force-port-check upgrade option when upgrading a
Swarm-only cluster.
The MKE 3.7.7 patch release focuses exclusively on CVE mitigation. To this end,
the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
DOCKER_EE_CLI 23.0.10
Powershell
docker/docker vendor
The following table details the specific CVEs addressed, including which images
are affected per CVE.
runc is a CLI tool for spawning and running containers on Linux
according to the OCI specification. In runc 1.1.11 and earlier, due to
an internal file descriptor leak, an attacker could cause a
newly-spawned container process (from runc exec) to have a working
directory in the host filesystem namespace, allowing for a container
escape by giving access to the host filesystem (“attack 2”). The same
attack could be used by a malicious image to allow a container process
to gain access to the host filesystem through runc run (“attack 1”).
Variants of attacks 1 and 2 could be also be used to overwrite
semi-arbitrary host binaries, allowing for complete container escapes
(“attack 3a” and “attack 3b”). runc 1.1.12 includes patches for this
issue.
Moby is an open-source project created by Docker to enable software
containerization. The classic builder cache system is prone to cache
poisoning if the image is built FROM scratch. Also, changes to some
instructions (most important being HEALTHCHECK and ONBUILD)
would not cause a cache miss. An attacker with the knowledge of the
Dockerfile someone is using could poison their cache by making them
pull a specially crafted image that would be considered as a valid
cache candidate for some build steps. 23.0+ users are only affected if
they explicitly opted out of Buildkit (DOCKER_BUILDKIT=0
environment variable) or are using the /build API endpoint. All users
on versions older than 23.0 could be impacted. Image build API endpoint
(/build) and ImageBuild function from
github.com/docker/docker/client is also affected, as it uses the classic
builder by default. Patches are included in 24.0.9 and 25.0.2 releases.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.6, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2024-MAR-20
Name: MKE 3.7.6
Highlights:
Kubernetes for GMSA now supported
Addition of ucp-cadvisor container level metrics component
Mirantis now supports GMSA on Kubernetes in MKE. Consequently, users can
generate GMSA credentials on the Kubernetes cluster and use these credentials
in Pod specifications. This allows MCR to use the specified GMSA credentials
while launching the Pods.
Kubernetes for GMSA functionality is off by default. To activate the function,
set windows_gmsa to true in the MKE configuration file.
The implementation supports the latest specification of GMSA credentials,
windows.k8s.io/v1. Before enabling this feature, ensure that there are no
existing GMSA credential specs or resources using such specs.
[MKE-11022] Addition of ucp-cadvisor container level metrics component
The new optional ucp-cadvisor component runs a standalone cadvisor instance
on each node, which provides additional container level metrics.
To enable the ucp-cadvisor component feature, set cadvisor_enabled to
true in the MKE configuration file.
Note
Currently, the ucp-cadvisor component is supported for Linux nodes only.
It is not supported for Windows nodes.
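Both of the toggles described above, windows_gmsa and cadvisor_enabled, are simple booleans in the MKE configuration file. The following is a minimal sketch, assuming they sit under the cluster_config section; verify the section name against the MKE configuration file reference:
[cluster_config]
  # Enable GMSA support for Kubernetes workloads (off by default)
  windows_gmsa = true
  # Run the standalone cAdvisor instance on each Linux node (off by default)
  cadvisor_enabled = true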
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-6785] Reinstallation can fail following cluster CA rotation
If MKE 3.7.x is uninstalled soon after rotating cluster CA, re-installing MKE
3.7.x or 3.6.x on an existing docker swarm can fail with the following error
messages:
unable to sign cert: {"code":1000,"message":"x509: provided PrivateKey doesn't match parent's PublicKey"}
[FIELD-6402] Default metric collection memory settings may be insufficient
In MKE 3.7, ucp-metrics collects more metrics than in previous versions of
MKE. As such, for large clusters with many nodes, the following ucp-metrics
component default settings may be insufficient:
memory request: 1Gi
memory limit: 2Gi
Workaround:
Administrators can modify the MKE configuration file to increase the default
memory request and memory limit setting values for the ucp-metrics
component. The settings to configure are both under the cluster section:
For memory request, modify the prometheus_memory_request setting
For memory limit, modify the prometheus_memory_limit setting
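For example, to double both defaults, the two settings could be raised as follows. The values are illustrative only, and the section shown follows the cluster section mentioned above; verify it against your own MKE configuration file:
[cluster_config]
  # Illustrative values only: raise the Prometheus memory request and limit for large clusters
  prometheus_memory_request = "2Gi"
  prometheus_memory_limit = "4Gi"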
[MKE-11281] cAdvisor Pods on Windows nodes cannot enter ‘Running’ state
When you enable cAdvisor, Pods are deployed to every node in the cluster. These
cAdvisor Pods only work on Linux nodes, however, so the Pods that are
inadvertently targeted to Windows nodes remain perpetually suspended
and never actually run.
Workaround:
Update the DaemonSet so that only Linux nodes are targeted by patching the
ucp-cadvisor DaemonSet to include a node selector for Linux:
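The patch command itself is not reproduced above. The following is a hedged sketch, assuming the ucp-cadvisor DaemonSet is deployed in the kube-system namespace; confirm the namespace with kubectl get daemonset -A before patching:
# Restrict the ucp-cadvisor DaemonSet to Linux nodes only
kubectl patch daemonset ucp-cadvisor -n kube-system \
  --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/os":"linux"}}}}}'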
[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports
Upgrades to Swarm-only clusters that were originally installed using the
--swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port
Requirements] step.
Workaround:
Include the --force-port-check upgrade option when upgrading a
Swarm-only cluster.
The HTTP/2 protocol allows a denial of service (server resource
consumption) because request cancellation can reset many streams
quickly, as exploited in the wild in August through October 2023.
Processing a maliciously formatted PKCS12 file may lead OpenSSL to
crash leading to a potential Denial of Service attack Impact summary:
Applications loading files in the PKCS12 format from untrusted sources
might terminate abruptly. A file in PKCS12 format can contain
certificates and keys and may come from an untrusted source. The
PKCS12 specification allows certain fields to be NULL, but OpenSSL
does not correctly check for this case. This can lead to a NULL
pointer dereference that results in OpenSSL crashing. If an
application processes PKCS12 files from an untrusted source using the
OpenSSL APIs then that application will be vulnerable to this issue.
OpenSSL APIs that are vulnerable to this are: PKCS12_parse(),
PKCS12_unpack_p7data(), PKCS12_unpack_p7encdata(),
PKCS12_unpack_authsafes() and PKCS12_newpass(). We have also fixed a
similar issue in SMIME_write_PKCS7(). However since this function is
related to writing data we do not consider it security significant.
The FIPS modules in 3.2, 3.1 and 3.0 are not affected by this issue.
A security issue was discovered in Kubernetes where a user that can
create pods and persistent volumes on Windows nodes may be able to
escalate to admin privileges on those nodes. Kubernetes clusters are
only affected if they are using an in-tree storage plugin for Windows
nodes.
OpenTelemetry-Go Contrib is a collection of third-party packages for
OpenTelemetry-Go. A handler wrapper out of the box adds labels
http.user_agent and http.method that have unbound cardinality. It
leads to the server’s potential memory exhaustion when many malicious
requests are sent to it. HTTP header User-Agent or HTTP method for
requests can be easily set by an attacker to be random and long. The
library internally uses httpconv.ServerRequest that records every
value for HTTP method and User-Agent. In order to be affected, a
program has to use the otelhttp.NewHandler wrapper and not filter any
unknown HTTP methods or User agents on the level of CDN, LB, previous
middleware, etc. Version 0.44.0 fixed this issue when the values
collected for attribute http.request.method were changed to be
restricted to a set of well-known values and other high cardinality
attributes were removed. As a workaround to stop being affected,
otelhttp.WithFilter() can be used, but it requires manual careful
configuration to not log certain requests entirely. For convenience and
safe usage of this library, it should by default mark with the label
unknown non-standard HTTP methods and User agents to show that such
requests were made but do not increase cardinality. In case someone
wants to stay with the current behavior, library API should allow to
enable it.
OpenTelemetry-Go Contrib is a collection of third-party packages for
OpenTelemetry-Go. Prior to version 0.46.0, the grpc Unary Server
Interceptor out of the box adds labels net.peer.sock.addr and
net.peer.sock.port that have unbound cardinality. It leads to the
server’s potential memory exhaustion when many malicious requests are
sent. An attacker can easily flood the peer address and port for
requests. Version 0.46.0 contains a fix for this issue. As a workaround
to stop being affected, a view removing the attributes can be used. The
other possibility is to disable grpc metrics instrumentation by passing
otelgrpc.WithMeterProvider option with noop.NewMeterProvider.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.5, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2024-MAR-05
Name: MKE 3.7.5
Highlights:
etcd alarms are exposed through Prometheus metrics
Augmented validation for etcd storage quota
Improved handling of larger sized etcd instances
All errors now returned from pre upgrade checks
Minimum Docker storage requirement now part of pre upgrade checks
[MKE-10834] etcd alarms are exposed through Prometheus metrics
The NOSPACE and CORRUPT alarms generated by etcd are now exposed through
Prometheus metrics. In addition, the alertmanager now sends an alert in the
event of a NOSPACE alarm.
[MKE-10833] Augmented validation for etcd storage quota
Validation for etcd storage quota has been extended in terms of minimum quota,
maximum quota, and current database size.
Minimum quota validation: The system now enforces a minimum etcd storage
quota of 2GB.
Maximum quota validation: etcd storage quotas can no longer exceed 8GB.
Current dbSize validation: Validation checks are now in place to verify
whether the current database size exceeds the specified etcd storage
quota across all etcd cluster members.
[MKE-10684] SAML CA certificate can now be reset with the DELETE eNZi endpoint
The SAML configuration can now be removed by issuing a DELETE request
to https://{cluster}/enzi/v0/config/auth/saml.
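For example, with curl, assuming an admin bearer token in $AUTHTOKEN and your cluster address in $MKE_HOST (token acquisition is not shown here):
# Remove the current SAML configuration, including the CA certificate
curl -sk -X DELETE -H "Authorization: Bearer $AUTHTOKEN" https://$MKE_HOST/enzi/v0/config/auth/saml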
[MKE-10070] All errors now returned from pre-upgrade checks
All pre-upgrade checks are now run to completion, after which a comprehensive
list of failures is returned. Previously, a failure in any sub-step would
result in the exit of the pre-upgrade check routine and the return of a single
error. As such, any issues in the environment can be triaged in a single run.
[MKE-9946] Minimum Docker storage requirement now part of pre-upgrade checks
The pre-upgrade checks now verify that the minimum Docker storage requirement
of 25GB is met.
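To check ahead of time whether a node satisfies the requirement, you can inspect the free space on the Docker data root, shown here at its default location; adjust the path if data-root has been moved:
# Verify that at least 25GB is available on the filesystem backing the Docker data root
df -h /var/lib/docker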
[FIELD-6695] Improved handling of larger sized etcd instances
Now, when etcd storage usage exceeds the quota, there are steps that allow for
easy repair and recovery of the MKE cluster. In addition, MKE web UI
banners have been added to indicate etcd alarms and to inform the user when the
storage quota setting is in excess of 40% of the node total memory.
Issues addressed in the MKE 3.7.5 release include:
[MKE-10903] Fixed an issue wherein flood messages occurred in
ucp-worker-agent.
[MKE-10835] Fixed an issue wherein Gatekeeper Pods were frequently entering
CrashLoopBackOff due to the absence of an expected CRD definition.
[MKE-10644] Fixed an issue wherein a vulnerability in etcd library
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
allowed a memory exhaustion attack on the grpc server. The patch is applied
on top of etcd version 3.5.10 and includes the fix for CVE-2023-47108. No other
changes from etcd 3.5.10 are included.
[FIELD-6835] Fixed an issue wherein ucp-cluster-agent continually restarted
during upgrade to MKE 3.7.4.
[FIELD-6695] Fixed multiple issues caused by large etcd instances:
The addition of a second manager sometimes caused etcd cluster failure when
the etcd storage size was of substantial size.
Cold starting of MKE clusters would fail whenever the etcd storage size
exceeded 2GB.
Simultaneous restarting of manager nodes would fail whenever the etcd
storage size exceeded 2GB.
The etcd storage usage indicator in the MKE web UI banner was not accurate.
[FIELD-6670] Fixed an issue wherein CNI plugin log level was not consistent
with the MKE log level setting.
[FIELD-6602] Resolved the couldn't get dbus connection: dial unix /var/run/dbus/system_bus_socket error message by removing the systemd collector, and repaired udev mount errors in Node Exporter.
[FIELD-6598] Fixed an issue wherein users without access to the /system
collection could promote worker nodes to manager nodes.
[FIELD-6573] Fixed an issue wherein kubelet failed to rotate Pod container
logs.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-6785] Reinstallation can fail following cluster CA rotation
If MKE 3.7.x is uninstalled soon after rotating cluster CA, re-installing MKE
3.7.x or 3.6.x on an existing docker swarm can fail with the following error
messages:
unable to sign cert: {"code":1000,"message":"x509: provided PrivateKey doesn't match parent's PublicKey"}
[FIELD-6402] Default metric collection memory settings may be insufficient
In MKE 3.7, ucp-metrics collects more metrics than in previous versions of
MKE. As such, for large clusters with many nodes, the following ucp-metrics
component default settings may be insufficient:
memory request: 1Gi
memory limit: 2Gi
Workaround:
Administrators can modify the MKE configuration file to increase the default
memory request and memory limit setting values for the ucp-metrics
component. The settings to configure are both under the cluster section:
For memory request, modify the prometheus_memory_request setting
For memory limit, modify the prometheus_memory_limit setting
A security issue was discovered in Kubernetes where a user that can
create pods and persistent volumes on Windows nodes may be able to
escalate to admin privileges on those nodes. Kubernetes clusters are
only affected if they are using an in-tree storage plugin for Windows
nodes.
A security issue was discovered in Kubernetes where a user that can
create pods and persistent volumes on Windows nodes may be able to
escalate to admin privileges on those nodes. Kubernetes clusters are
only affected if they are using an in-tree storage plugin for Windows
nodes.
A security issue was discovered in Kubernetes where a user that can
create pods on Windows nodes may be able to escalate to admin
privileges on those nodes. Kubernetes clusters are only affected if
they include Windows nodes.
OpenTelemetry-Go Contrib is a collection of third-party packages for
OpenTelemetry-Go. Prior to version 0.46.0, the grpc Unary Server
Interceptor out of the box adds labels net.peer.sock.addr and
net.peer.sock.port that have unbound cardinality. It leads to the
server’s potential memory exhaustion when many malicious requests are
sent. An attacker can easily flood the peer address and port for
requests. Version 0.46.0 contains a fix for this issue. As a workaround
to stop being affected, a view removing the attributes can be used. The
other possibility is to disable grpc metrics instrumentation by passing
otelgrpc.WithMeterProvider option with noop.NewMeterProvider.
OpenTelemetry-Go Contrib is a collection of third-party packages for
OpenTelemetry-Go. A handler wrapper out of the box adds labels
http.user_agent and http.method that have unbound cardinality. It
leads to the server’s potential memory exhaustion when many malicious
requests are sent to it. HTTP header User-Agent or HTTP method for
requests can be easily set by an attacker to be random and long. The
library internally uses httpconv.ServerRequest that records every
value for HTTP method and User-Agent. In order to be affected, a
program has to use the otelhttp.NewHandler wrapper and not filter any
unknown HTTP methods or User agents on the level of CDN, LB, previous
middleware, etc. Version 0.44.0 fixed this issue when the values
collected for attribute http.request.method were changed to be
restricted to a set of well-known values and other high cardinality
attributes were removed. As a workaround to stop being affected,
otelhttp.WithFilter() can be used, but it requires manual careful
configuration to not log certain requests entirely. For convenience and
safe usage of this library, it should by default mark with the label
unknown non-standard HTTP methods and User agents to show that such
requests were made but do not increase cardinality. In case someone
wants to stay with the current behavior, library API should allow to
enable it.
The HTTP/2 protocol allows a denial of service (server resource
consumption) because request cancellation can reset many streams
quickly, as exploited in the wild in August through October 2023.
An off-by-one heap-based buffer overflow was found in the
__vsyslog_internal function of the glibc library. This function is
called by the syslog and vsyslog functions. This issue occurs when
these functions are called with a message bigger than INT_MAX bytes,
leading to an incorrect calculation of the buffer size to store the
message, resulting in an application crash. This issue affects glibc
2.37 and newer.
A heap-based buffer overflow was found in the __vsyslog_internal
function of the glibc library. This function is called by the syslog
and vsyslog functions. This issue occurs when the openlog function was
not called, or called with the ident argument set to NULL, and the
program name (the basename of argv[0]) is bigger than 1024 bytes,
resulting in an application crash or local privilege escalation. This
issue affects glibc 2.36 and newer.
An integer overflow was found in the __vsyslog_internal function of the
glibc library. This function is called by the syslog and vsyslog
functions. This issue occurs when these functions are called with a
very long message, leading to an incorrect calculation of the buffer
size to store the message, resulting in undefined behavior. This issue
affects glibc 2.37 and newer.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrade to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to
MKE 3.7.2, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date: 2023-DEC-04
Name: MKE 3.7.3
Highlights:
Patch release for MKE 3.7 that focuses exclusively on CVE resolution.
For detail on the specific CVEs addressed, refer to Security information.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm only manager nodes are labeled as mixed mode
When MKE is installed in swarm only mode, manager nodes start off in mixed
mode. As Kubernetes installation is skipped altogether, however, they should be
labeled as swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-6785] Reinstallation can fail following cluster CA rotation¶
If MKE 3.7.x is uninstalled soon after rotating the cluster CA, reinstalling
MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following
error message:
unable to sign cert: {"code": 1000, "message": "x509: provided PrivateKey doesn't match parent's PublicKey"}
[FIELD-6402] Default metric collection memory settings may be insufficient¶
In MKE 3.7, ucp-metrics collects more metrics than in previous versions of
MKE. As such, for large clusters with many nodes, the following ucp-metrics
component default settings may be insufficient:
memory request: 1Gi
memory limit: 2Gi
Workaround:
Administrators can modify the MKE configuration file to increase the default
memory request and memory limit setting values for the ucp-metrics
component. The settings to configure are both under the cluster section:
For memory request, modify the prometheus_memory_request setting
For memory limit, modify the prometheus_memory_limit setting
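For example, a minimal configuration sketch follows. The values are
illustrative only, and the quoted-string format for the values is an
assumption.
[cluster_config]
  # Illustrative values only; size according to cluster scale.
  prometheus_memory_request = "2Gi"
  prometheus_memory_limit = "4Gi"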
The MKE 3.7.3 patch release focuses exclusively on CVE mitigation. To this end,
the following middleware component versions have been upgraded to resolve
vulnerabilities in MKE:
[MKE-10346] Interlock 3.3.12
[MKE-10682] Calico 3.26.4/Calico for Windows 3.26.4
[SECMKE-113] cri-dockerd 0.3.7
[FIELD-6558] NGINX Ingress Controller 1.9.4
[MKE-10340] CoreDNS 1.11.1
[MKE-10309] Prometheus 2.48.0
[SECMKE-122] NVIDIA GPU Feature Discovery 0.8.2
[MKE-10586] Gatekeeper 3.13.4
The following table details the specific CVEs addressed, including which images
are affected per CVE.
A security issue was discovered in Kubernetes where a user that can
create pods on Windows nodes may be able to escalate to admin
privileges on those nodes. Kubernetes clusters are only affected if
they include Windows nodes.
A security issue was discovered in Kubernetes where a user that can
create pods on Windows nodes may be able to escalate to admin
privileges on those nodes. Kubernetes clusters are only affected if
they include Windows nodes.
A security issue was discovered in Kubernetes where a user that can
create pods and persistent volumes on Windows nodes may be able to
escalate to admin privileges on those nodes. Kubernetes clusters are
only affected if they are using an in-tree storage plugin for Windows
nodes.
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource
consumption. While the total number of requests is bounded by the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing. With the fix applied, HTTP/2 servers now bound
the number of simultaneously executing handler goroutines to the
stream concurrency limit (MaxConcurrentStreams). New requests arriving
when at the limit (which can only happen after the client has reset an
existing, in-flight request) will be queued until a handler exits. If
the request queue grows too large, the server will terminate the
connection. This issue is also fixed in golang.org/x/net/http2 for
users manually configuring HTTP/2. The default stream concurrency
limit is 250 streams (requests) per HTTP/2 connection. This value may
be adjusted using the golang.org/x/net/http2 package; see the
Server.MaxConcurrentStreams setting and the ConfigureServer function.
The HTTP/2 protocol allows a denial of service (server resource
consumption) because request cancellation can reset many streams
quickly, as exploited in the wild in August through October 2023.
OpenTelemetry-Go Contrib is a collection of third-party packages for
OpenTelemetry-Go. A handler wrapper out of the box adds labels
http.user_agent and http.method that have unbound cardinality. It
leads to the server’s potential memory exhaustion when many malicious
requests are sent to it. HTTP header User-Agent or HTTP method for
requests can be easily set by an attacker to be random and long. The
library internally uses httpconv.ServerRequest that records every
value for HTTP method and User-Agent. In order to be affected, a
program has to use the otelhttp.NewHandler wrapper and not filter
any unknown HTTP methods or User agents on the level of CDN, LB,
previous middleware, etc. Version 0.44.0 fixed this issue when the
values collected for attribute http.request.method were changed to
be restricted to a set of well-known values and other high cardinality
attributes were removed. As a workaround to stop being affected,
otelhttp.WithFilter() can be used, but it requires manual careful
configuration to not log certain requests entirely. For convenience
and safe usage of this library, it should by default mark with the
label unknown non-standard HTTP methods and User agents to show that
such requests were made but do not increase cardinality. In case
someone wants to stay with the current behavior, library API should
allow to enable it.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrading to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.2, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date
Name
Highlights
2023-NOV-20
MKE 3.7.2
Patch release for MKE 3.7 introducing the following enhancements:
Issues addressed in the MKE 3.7.2 release include:
[FIELD-6453] Fixed an issue wherein users assigned as admins lost their admin
privileges after conducting an LDAP sync with JIT enabled.
[FIELD-6446] Fixed an issue wherein the /ucp/etcd/info API endpoint
incorrectly displayed the size of objects stored in the database, rather than
the actual size of the database on disk. This fix introduces the
DbSizeInUse field, alongside the existing DbSize field, which ensures
the accurate reporting of both the logically used size and physically
allocated size of the backend database.
[FIELD-6437] Fixed an issue with the MKE configuration setting
cluster_config.metallb_config.metallb_ip_addr_pool.name wherein the name
was not verified against RFC-1123 label names.
[FIELD-6353] Fixed an issue wherein the accounts/<org>/members API would
provide incomplete results when requesting non-admins.
[MKE-10267] Fixed three instances of CVE-2023-4911, rated
High, which were detected on glibc-related components in the
ucp-calico-node image.
[MKE-10231] Fixed an issue wherein clusters were left in an inoperable state
following either:
Upgrade of MKE 3.7.0 clusters installed with the --multus-cni
argument to MKE 3.7.1
Installation of a fresh MKE 3.7.1 cluster with the --multus-cni
argument
[MKE-10204] Fixed an issue whereby the ucp images --list command
returned all images, including those that are swarm-only. Now the swarm-only
images are only returned when the --swarm-only
flag is included.
[MKE-10202] Fixed an issue whereby in swarm-only mode workers were attempting
to run a Kubernetes component following an MKE upgrade.
[MKE-10032] Fixed an issue wherein MKE debug levels were not applied to
cri-dockerd logs.
[MKE-10031] Fixed an issue wherein Calico for Windows was continuously
writing to the cri-dockerd logs.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8662] Swarm-only manager nodes are labeled as mixed mode¶
When MKE is installed in Swarm-only mode, manager nodes start off in mixed
mode. However, as Kubernetes installation is skipped altogether, they should
be labeled as Swarm mode.
Workaround: Change the labels following installation.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-6785] Reinstallation can fail following cluster CA rotation¶
If MKE 3.7.x is uninstalled soon after rotating the cluster CA, reinstalling
MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following
error message:
unable to sign cert: {"code": 1000, "message": "x509: provided PrivateKey doesn't match parent's PublicKey"}
[FIELD-6402] Default metric collection memory settings may be insufficient¶
In MKE 3.7, ucp-metrics collects more metrics than in previous versions of
MKE. As such, for large clusters with many nodes, the following ucp-metrics
component default settings may be insufficient:
memory request: 1Gi
memory limit: 2Gi
Workaround:
Administrators can modify the MKE configuration file to increase the default
memory request and memory limit setting values for the ucp-metrics
component. The settings to configure are both under the cluster section:
For memory request, modify the prometheus_memory_request setting
For memory limit, modify the prometheus_memory_limit setting
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource
consumption. While the total number of requests is bounded by the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing. With the fix applied, HTTP/2 servers now bound
the number of simultaneously executing handler goroutines to the
stream concurrency limit (MaxConcurrentStreams). New requests arriving
when at the limit (which can only happen after the client has reset an
existing, in-flight request) will be queued until a handler exits. If
the request queue grows too large, the server will terminate the
connection. This issue is also fixed in golang.org/x/net/http2 for
users manually configuring HTTP/2. The default stream concurrency
limit is 250 streams (requests) per HTTP/2 connection. This value may
be adjusted using the golang.org/x/net/http2 package; see the
Server.MaxConcurrentStreams setting and the ConfigureServer function.
The HTTP/2 protocol allows a denial of service (server resource
consumption) because request cancellation can reset many streams
quickly, as exploited in the wild in August through October 2023.
MKE
MKE provides the disable_http2 TOML configuration setting to disable HTTP/2
on the ucp-controller and other components that may be exposed to untrusted
clients, regardless of authentication status. Setting this option to true
results in MKE HTTP servers excluding HTTP/2 from the protocol versions
offered for negotiation, which leaves clients unable to negotiate the HTTP/2
protocol even if they explicitly specify it.
The setting is disabled by default. It is provided as a defense-in-depth
mechanism for the event that an unforeseen scenario requires untrusted
clients to hit ports used by any of MKE's HTTP servers. The best defense
continues to be disallowing untrusted clients from hitting any of the ports
in use by MKE.
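A sketch of the setting follows. The placement under [cluster_config] is an
assumption, so confirm the correct TOML table for disable_http2 against the
MKE configuration file reference before applying it.
[cluster_config]
  # Exclude HTTP/2 from protocol negotiation on MKE HTTP servers.
  disable_http2 = true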
Kubernetes
Upstream Kubernetes has introduced the feature gate
UnauthenticatedHTTP2DOSMitigation to mitigate against http2 rapid
reset attacks on the kube-apiserver. The mitigation applies strictly
to scenarios where the kube-apiserver communicates with an
unauthenticated or anonymous client. It specifically does not apply
to authenticated clients of the kube-apiserver.
From upstream Kubernetes documentation: “Since this change has the
potential to cause issues, the UnauthenticatedHTTP2DOSMitigation
feature gate can be disabled to remove this protection (which is
enabled by default). For example, when the API server is fronted by an
L7 load balancer that is set up to mitigate http2 attacks,
unauthenticated clients could force disable connection reuse between
the load balancer and the API server (many incoming connections could
share the same backend connection). An API server that is on a private
network may opt to disable this protection to prevent performance
regressions for unauthenticated clients.”
To enable this feature gate, MKE has introduced the
unauthenticated_http2_dos_mitigation configuration setting. Set the setting
to true to enable the mitigation.
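The following fragment is a sketch only; the placement of the setting under
[cluster_config] is an assumption and should be verified against the MKE
configuration file reference.
[cluster_config]
  # Enable the UnauthenticatedHTTP2DOSMitigation feature gate for kube-apiserver.
  unauthenticated_http2_dos_mitigation = true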
This flaw makes curl overflow a heap based buffer in the SOCKS5 proxy
handshake. When curl is asked to pass along the host name to the
SOCKS5 proxy to allow that to resolve the address instead of it
getting done by curl itself, the maximum length that host name can be
is 255 bytes. If the host name is detected to be longer, curl switches
to local name resolving and instead passes on the resolved address
only. Due to this bug, the local variable that means “let the host
resolve the name” could get the wrong value during a slow SOCKS5
handshake, and contrary to the intention, copy the too long host name
to the target buffer instead of copying just the resolved address
there. The target buffer being a heap based buffer, and the host name
coming from the URL that curl has been told to operate with.
This flaw allows an attacker to insert cookies at will into a running
program using libcurl, if the specific series of conditions are met.
libcurl performs transfers. In its API, an application creates “easy
handles” that are the individual handles for single transfers. libcurl
provides a function call that duplicates an easy handle called
[curl_easy_duphandle](https://curl.se/libcurl/c/curl_easy_duphandle.html).
If a transfer has cookies enabled when the handle is duplicated, the
cookie-enable state is also cloned - but without cloning the actual
cookies. If the source handle did not read any cookies from a specific
file on disk, the cloned version of the handle would instead store the
file name as none (using the four ASCII letters, no quotes).
Subsequent use of the cloned handle that does not explicitly set a
source to load cookies from would then inadvertently load cookies from
a file named none - if such a file exists and is readable in the
current directory of the program using libcurl. And if using the
correct file format of course.
When curl retrieves an HTTP response, it stores the incoming headers
so that they can be accessed later via the libcurl headers API.
However, curl did not have a limit in how many or how large headers it
would accept in a response, allowing a malicious server to stream an
endless series of headers and eventually cause curl to run out of heap
memory.
When curl retrieves an HTTP response, it stores the incoming headers
so that they can be accessed later via the libcurl headers API.
However, curl did not have a limit in how many or how large headers it
would accept in a response, allowing a malicious server to stream an
endless series of headers and eventually cause curl to run out of heap
memory.
A buffer overflow was discovered in the GNU C Library’s dynamic loader
ld.so while processing the GLIBC_TUNABLES environment variable. This
issue could allow a local attacker to use maliciously crafted
GLIBC_TUNABLES environment variables when launching binaries with SUID
permission to execute code with elevated privileges.
OpenTelemetry-Go Contrib is a collection of third-party packages for
OpenTelemetry-Go. A handler wrapper out of the box adds labels
http.user_agent and http.method that have unbound cardinality. It
leads to the server’s potential memory exhaustion when many malicious
requests are sent to it. HTTP header User-Agent or HTTP method for
requests can be easily set by an attacker to be random and long. The
library internally uses httpconv.ServerRequest that records every
value for HTTP method and User-Agent. In order to be affected, a
program has to use the otelhttp.NewHandler wrapper and not filter
any unknown HTTP methods or User agents on the level of CDN, LB,
previous middleware, etc. Version 0.44.0 fixed this issue when the
values collected for attribute http.request.method were changed to
be restricted to a set of well-known values and other high cardinality
attributes were removed. As a workaround to stop being affected,
otelhttp.WithFilter() can be used, but it requires manual careful
configuration to not log certain requests entirely. For convenience
and safe usage of this library, it should by default mark with the
label unknown non-standard HTTP methods and User agents to show that
such requests were made but do not increase cardinality. In case
someone wants to stay with the current behavior, library API should
allow to enable it.
MKE is unaffected by CVE-2023-45142; however, some code scanners may
still detect the following CVE/Image combinations:
Mirantis has begun an initiative to align MKE with CIS Benchmarks, where pertinent. The
following table details the CIS Benchmark resolutions and improvements that
are introduced in MKE 3.7.2:
CIS Benchmark type/version
Recommendation designation
Ticket
Resolution/Improvement
Kubernetes 1.7
1.1.9
MKE-9909
File permissions for the CNI config file are set to 600.
Kubernetes 1.7
1.1.12
MKE-10150
The AlwaysPullImages admission control plugin, disabled by
default, can now be enabled. To do so, edit the
k8s_always_pull_images_ac_enabled parameter in the
cluster_config section of the MKE configuration file.
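A minimal sketch of that edit follows; the boolean value format is an
assumption.
[cluster_config]
  # Enable the AlwaysPullImages admission control plugin (disabled by default).
  k8s_always_pull_images_ac_enabled = true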
Kubernetes 1.7
1.1.15
MKE-9907
File permissions for the kube-scheduler configuration file are
restricted to 600.
Kubernetes 1.7
1.2.22
MKE-9902
The Kubernetes API server --request-timeout argument can be set.
Kubernetes 1.7
1.2.23
MKE-9992
The Kubernetes API server --service-account-lookup argument is set
explicitly to true.
Kubernetes 1.7
1.2.31, 4.2.12
MKE-9978
The hardening setting use_strong_tls_ciphers allows for limiting
the list of accepted ciphers for
cipher_suites_for_kube_api_server, cipher_suites_for_kubelet,
and cipher_suites_for_etcd_server to the ciphers considered to be
strong.
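For illustration, a hedged fragment is shown below. The cipher suite names
are common strong TLS suites chosen for the example, not values taken from
MKE documentation, and the exact value format for these settings is an
assumption.
[cluster_config]
  # Illustrative hardening fragment; verify the accepted cipher names and
  # value format against the MKE configuration reference before use.
  use_strong_tls_ciphers = true
  cipher_suites_for_kube_api_server = "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"
  cipher_suites_for_kubelet = "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
  cipher_suites_for_etcd_server = "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"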
Kubernetes 1.7
1.3.1
MKE-9990
MKE now supports the kube_manager_terminated_pod_gc_threshold
configuration parameter. Using this parameter, users can set the
threshold for the terminated Pod garbage collector in Kube Controller
Manager according to their cluster-specific requirement.
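For example, the following fragment sets a lower threshold than the upstream
kube-controller-manager default of 12500; the value shown is illustrative
only.
[cluster_config]
  # Number of terminated Pods to retain before the garbage collector removes them.
  kube_manager_terminated_pod_gc_threshold = 1000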
Kubernetes 1.7
2.7
MKE-10012
A separate unique certificate authority is now in place for etcd, with
MKE components using certificates issued by it to connect to the
component. In line with this new CA:
A new internal TCP port, 12392, is required on manager nodes.
Admin client bundles now include etcd_cert.pem and
etcd_key.pem for connecting directly to etcd. The ca.pem file
includes the etcd CA in addition to the Cluster and Client CAs.
Users upgrading to MKE 3.7.2 must rotate the etcd CA after doing
so, to ensure the uniqueness of the etcd CA. For more
information, refer to MKE etcd Root CA.
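As a hedged illustration of using the new bundle files, the etcdctl
invocation below assumes it is run from the extracted admin client bundle
directory; the endpoint address and port are placeholders to be replaced
with your cluster's etcd client endpoint.
# Illustrative only: check etcd health with the certificates from an admin
# client bundle. Replace the placeholder endpoint with your own.
etcdctl --endpoints=https://<manager-node>:<etcd-client-port> \
  --cacert=ca.pem --cert=etcd_cert.pem --key=etcd_key.pem \
  endpoint health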
Kubernetes 1.7
4.1.3
MKE-9911
File permissions for the kubeconfig file are set to 600.
Kubernetes 1.7
4.1.5
MKE-9910
File permissions for the kubelet.conf file are set to 600 or
fewer.
Kubernetes 1.7
4.1.9
MKE-9912
File permissions for the kubelet_daemon.conf file are restricted
to 600 or fewer.
Kubernetes 1.7
5.1.1
MKE-10138
To ensure that the cluster-admin role is only used when necessary,
a special ucp-metrics cluster role that has only the necessary
permissions is now used by Prometheus.
Kubernetes 1.7
5.1.2
MKE-10197, MKE-10114
Multus and Calico service accounts no longer use the cluster-admin
role, and as such no longer require access to secrets.
Kubernetes 1.7
5.1.3
MKE-10117
Replaced the Multus CNI template wildcards with the exact resources
needed for the CNI Roles and ClusterRoles.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrading to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.1, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date
Name
Highlights
2023-SEPT-26
MKE 3.7.1
Patch release for MKE 3.7 introducing the following enhancements:
Support bundle metrics additions for new MKE 3.7 features
Added ability to filter organizations by name in MKE web UI
Increased Docker and Kubernetes CIS benchmark compliance
MetalLB supports MKE-specific loglevel
Improved Kubernetes role creation error handling in MKE web UI
Increased SAML proxy feedback detail
Upgrade verifies that cluster nodes have minimum required MCR
kube-proxy now binds only to localhost
Enablement of read-only rootfs for specific containers
Support for cgroup v2
Added MKE web UI capability to add OS constraints to swarm services
Added ability to set support bundle collection windows
Added ability to set line limit of log files in support bundles
Addition of search function to Grants > Swarm in MKE web UI
[MKE-9105] Added MKE web UI capability to add OS constraints to swarm services¶
A new helper in the MKE web UI allows users to add OS constraints to swarm
services by selecting the OS from a dropdown. Thereafter, the necessary
constraints are automatically applied.
[FIELD-6026] Added ability to set support bundle collection windows¶
Added an MKE web UI function that allows users to specify the time period in
which MKE support bundle data collection is to take place.
[FIELD-6024] Added ability to set line limit of log files in support bundles¶
In the MKE web UI, the Download support bundle dialog that opens
when you navigate to <user name> and click Support Bundle now has
a control that you can use to set a line limit for log files. The valid line
limit range for the Log lines limit control is from 1 to 999999.
[FIELD-5936] Addition of search function to Grants > Swarm in MKE web UI¶
MKE web UI users can now use a Search function in Grants
> Swarm to filter the list of Swarm grants.
Issues addressed in the MKE 3.7.1 release include:
[FIELD-6318] Fixed an issue wherein MKE authorization decisions for OIDC
tokens were at times inconsistent across manager nodes.
[MKE-10028] Fixed an issue wherein the CoreDNS Pod became stuck following a
restore from an MKE backup.
[MKE-10022] Fixed an issue wherein clientSecret was not returned in GET
TOML requests. It now returns as <redacted>, indicating that it is set. In
addition, the <redacted> clientSecret value returned by GET TOML requests
can be reused in PUT TOML requests.
[MKE-10021] Fixed an issue wherein whenever a user fixed a malformed SAML
proxy setting, error messages would continue to propagate.
[MKE-9508] Fixed an issue wherein the PUT request to /api/ucp/config-toml
was missing the text field needed to provide the MKE configuration file in
the live API.
[FIELD-6335] Fixed an issue wherein MKE nodes at times showed Pending
status, which is a false positive.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-6785] Reinstallation can fail following cluster CA rotation¶
If MKE 3.7.x is uninstalled soon after rotating the cluster CA, reinstalling
MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following
error message:
unable to sign cert: {"code": 1000, "message": "x509: provided PrivateKey doesn't match parent's PublicKey"}
[FIELD-6402] Default metric collection memory settings may be insufficient¶
In MKE 3.7, ucp-metrics collects more metrics than in previous versions of
MKE. As such, for large clusters with many nodes, the following ucp-metrics
component default settings may be insufficient:
memory request: 1Gi
memory limit: 2Gi
Workaround:
Administrators can modify the MKE configuration file to increase the default
memory request and memory limit setting values for the ucp-metrics
component. The settings to configure are both under the cluster section:
For memory request, modify the prometheus_memory_request setting
For memory limit, modify the prometheus_memory_limit setting
Updated the following middleware component versions to resolve
vulnerabilities in MKE:
[MKE-10159] NGINX Ingress Controller 1.8.2
[FIELD-6356] AlertManager 0.26.0
[MKE-10050] CoreDNS 1.11.0
Mirantis has begun an initiative to align MKE with CIS Benchmarks, where pertinent. The
following table details the CIS Benchmark resolutions and improvements that
are introduced in MKE 3.7.1:
CIS Benchmark type/version
Recommendation designation
Ticket
Resolution/Improvement
Docker 1.6
4.9
MKE-9960
The MKE Dockerfiles were improved and no longer contain ADD
instructions, with only COPY in use.
Kubernetes 1.7
1.1.17
MKE-9906
The permission for
/ucp-volume-mounts/ucp-node-certs/controller-manager.conf is now
set to 600.
Kubernetes 1.7
1.2.9
MKE-10149
Support for the EventRateLimit admission controller has been added to
MKE. By default, the admission controller remains disabled; however, it
can be enabled with a TOML configuration, as exemplified below:
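The fragment below is a sketch only. Of the keys shown, only
limit_cache_size is named in the note that follows; the enable key, the
table name, and the limit_type, qps, and burst keys are assumptions
patterned on the upstream EventRateLimit admission configuration and must be
verified against the MKE configuration reference.
[cluster_config]
  # Assumed key name for enabling the admission controller.
  event_rate_limit_enabled = true
[[cluster_config.event_rate_limit]]
  # Assumed keys; upstream limit types are Server, Namespace, User,
  # and SourceAndObject.
  limit_type = "Namespace"
  qps = 50
  burst = 100
  limit_cache_size = 4096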
MKE will not validate the individual values for individual limits
specified, except to employ a default value of 4096 for
limit_cache_size when a value is provided.
Ensure that you validate your configuration on a test cluster
before applying it in production, as a misconfigured admission
controller can make
kube-apiserver unavailable for the cluster.
Kubernetes 1.7
1.3.7
MKE-9904
The --bind-address argument is set to 127.0.0.1 in
ucp-kube-controllermanager.
Kubernetes 1.7
4.1.8
MKE-10011, MKE-9917
The kubelet Client Certificate Authority file ownership is now
root:root, changed from its previous nobody:nogroup setting.
Kubernetes 1.7
4.2.5
MKE-9913
The kubelet streamingConnectionIdleTimeout argument is set explicitly
to 4h.
Kubernetes 1.7
4.2.6
MKE-9914
The kubelet make-iptables-util-chains argument is set explicitly to
true.
Kubernetes 1.7
4.2.8
MKE-10006
The kubelet_event_record_qps parameter can now be configured in the
MKE configuration file, as exemplified below:
[cluster_config]
  kubelet_event_record_qps = 50
Kubernetes 1.7
5.1.5
MKE-10005
The MKE install process now sets default service accounts in control
plane namespaces to specifically not automount service account tokens.
Kubernetes 1.7
5.1.6
MKE-9921
The use of service account tokens is restricted, allowing for mounting
only where necessary in MKE system namespaces.
Kubernetes 1.7
5.2.2
MKE-9923
Work was done to minimize the admission of privileged containers.
Kubernetes 1.7
5.2.8
MKE-9924
NET_RAW capability has been removed from all unprivileged system
containers.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrading to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or
later, as these versions support a transition pathway to an alternative
external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Release date
Name
Highlights
2023-AUG-31
MKE 3.7.0
Initial MKE 3.7.0 release introducing the following key features and
enhancements:
ZeroOps: certificate management
ZeroOps: upgrade rollback
ZeroOps: metrics
Prometheus memory resources
etcd event cleanup
Ingress startup options: TLS, TCP/UDP, HTTP/HTTPS
Additional NGINX Ingress Controller options
Setting for NGINX Ingress Controller default ports
Bare metal Kubernetes clusters can leverage MetalLB to create Load Balancer
services, offering features such as address allocation and external
announcement.
[MKE-10152] Upgrading large Windows clusters can initiate a rollback¶
Upgrades can roll back on a cluster with a large number of Windows worker nodes.
Workaround:
Invoke the --manual-worker-upgrade option and then manually upgrade
the workers.
[MKE-9699] Ingress Controller with external load balancer can enter crashloop¶
Due to the upstream Kubernetes issue
73140, rapid
toggling of the Ingress Controller with an external load
balancer in use can cause the resource to become stuck in a crashloop.
Workaround:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Ingress.
Click the Kubernetes tab to display the
HTTP Ingress Controller for Kubernetes pane.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the left to disable the Ingress Controller.
Use the CLI to delete the Ingress Controller resources:
Return to the HTTP Ingress Controller for Kubernetes pane in
the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and
TCP Port.
Toggle the HTTP Ingress Controller for Kubernetes enabled
control to the right to re-enable the Ingress Controller.
[MKE-9358] cgroup v2 (unsupported) is enabled in RHEL 9.0 by default¶
As MKE does not support cgroup v2 on Linux platforms, RHEL 9.0 users cannot use
the software due to cgroup v2 default enablement.
As a workaround, RHEL 9.0 users must disable cgroup v2.
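One common way to do this on RHEL 9 is to boot with the legacy cgroup
hierarchy, as sketched below. This is a general RHEL procedure rather than
an MKE-documented one, so validate it against your own operating procedures
before applying it.
# Switch the host back to cgroup v1 (legacy hierarchy) and reboot.
sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"
sudo reboot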
[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶
The use of Windows ServerCore with Containers images will prevent kubelet
from starting up, as these images are not compatible with GCP.
As a workaround, use Windows Server or Windows Server Core images.
[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶
Communication between GCP VPCs and Docker networks that use Swarm overlay
networks will fail if their MTU values are not manually aligned. By default,
the MTU value for GCP VPCs is 1460, while the default MTU value for Docker
networks is 1500.
[FIELD-6785] Reinstallation can fail following cluster CA rotation¶
If MKE 3.7.x is uninstalled soon after rotating the cluster CA, reinstalling
MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following
error message:
unable to sign cert: {"code": 1000, "message": "x509: provided PrivateKey doesn't match parent's PublicKey"}
[FIELD-6402] Default metric collection memory settings may be insufficient¶
In MKE 3.7, ucp-metrics collects more metrics than in previous versions of
MKE. As such, for large clusters with many nodes, the following ucp-metrics
component default settings may be insufficient:
memory request: 1Gi
memory limit: 2Gi
Workaround:
Administrators can modify the MKE configuration file to increase the default
memory request and memory limit setting values for the ucp-metrics
component. The settings to configure are both under the cluster section:
For memory request, modify the prometheus_memory_request setting
For memory limit, modify the prometheus_memory_limit setting
Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes
1.27.4, which is the version configured for MKE 3.7.0, does not
support the AWS in-tree cloud provider. As such, if your MKE cluster is
using the AWS in-tree cloud provider, you must defer upgrading to a
later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider
to MKE 3.7.0, the upgrade will fail and you will receive the following error
message:
Your MKE cluster is currently using the AWS in-tree cloud provider, which Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a version that supports migration to an alternative external AWS cloud provider is released.
Taking into account continuous reorganization and enhancement of Mirantis
Kubernetes Engine (MKE), certain components are deprecated and eventually
removed from the product. This section provides the following details about the
deprecated and removed functionality that may potentially impact existing
MKE deployments:
The MKE release version in which deprecation is announced
The final MKE release version in which a deprecated component is present
The MKE release version in which a deprecated component is removed
Kubernetes 1.27.4, which is the version configured for MKE 3.7.0,
does not support the AWS in-tree cloud provider. Thus, defer upgrading to
a later version of MKE 3.7 that supports a transition pathway to an
alternative external AWS cloud provider.
Custom log drivers
3.6.0
3.6.0
3.6.0
Removed due to Dockershim deprecation from Kubernetes.
Mirantis Kubernetes Engine (MKE, and formerly Docker Enterprise/UCP) provides
enterprises with the easiest and fastest way to deploy cloud native
applications at scale in any environment.
MKE 3.7.4 was discontinued shortly after release due to issues
encountered when upgrading to it from previous versions of the product.
Important
RHEL 9, Rocky 9, Oracle 9, and Ubuntu 22.04 all default to cgroup v2. MKE
3.7.0 only supports cgroup v1, whereas all later versions support cgroup v2.
Thus, if you are running any of the aforementioned OS versions, you must
either upgrade to MKE 3.7.1 or later, or you must downgrade to cgroup v1.
The Mirantis Kubernetes Engine (MKE) and Mirantis Secure Registry (MSR) web
user interfaces (UIs) both run in the browser, separate from any backend
software. As such, Mirantis aims to support browsers separately from
the backend software in use.
Mirantis currently supports the following web browsers:
Browser
Supported version
Release date
Operating systems
Google Chrome
96.0.4664 or newer
15 November 2021
MacOS, Windows
Microsoft Edge
95.0.1020 or newer
21 October 2021
Windows only
Firefox
94.0 or newer
2 November 2021
MacOS, Windows
To ensure the best user experience, Mirantis recommends that you use the
latest version of any of the supported browsers. The use of other browsers
or older versions of the browsers we support can result in rendering issues,
and can even lead to glitches and crashes in the event that some JavaScript
language features or browser web APIs are not supported.
Important
Mirantis does not tie browser support to any particular MKE or MSR software
release.
Mirantis strives to leverage the latest in browser technology to build more
performant client software, as well as ensuring that our customers benefit from
the latest browser security updates. To this end, our strategy is to regularly
move our supported browser versions forward, while also lagging behind the
latest releases by approximately one year to give our customers a
sufficient upgrade buffer.
The MKE, MSR, and MCR platform subscription provides software, support, and
certification to enterprise development and IT teams that build and manage
critical apps in production at scale. It provides a trusted platform for all
apps, supplying integrated management and security across the app lifecycle,
and is comprised primarily of Mirantis Kubernetes Engine (MKE), Mirantis
Secure Registry (MSR), and Mirantis Container Runtime (MCR).
Mirantis validates the MKE, MSR, and MCR platform for the operating system
environments specified in the mcr-23.0-compatibility-matrix, adhering
to the Maintenance Lifecycle detailed here. Support for the MKE, MSR, and MCR
platform is defined in the Mirantis Cloud Native Platform Subscription
Services agreement.
Detailed here are all currently supported product versions, as well as the
product versions most recently deprecated. It can be assumed that all earlier
product versions are at End of Life (EOL).
Important Definitions
“Major Releases” (X.y.z): Vehicles for delivering major and minor feature
development and enhancements to existing features. They incorporate all
applicable Error corrections made in prior Major Releases, Minor Releases,
and Maintenance Releases.
“Minor Releases” (x.Y.z): Vehicles for delivering minor feature
developments, enhancements to existing features, and defect corrections. They
incorporate all applicable Error corrections made in prior Minor Releases,
and Maintenance Releases.
“Maintenance Releases” (x.y.Z): Vehicles for delivering Error corrections
that are severely affecting a number of customers and cannot wait for the
next major or minor release. They incorporate all applicable defect
corrections made in prior Maintenance Releases.
“End of Life” (EOL): Versions are no longer supported by Mirantis;
updating to a later version is recommended.
With the intent of improving the customer experience, Mirantis strives to offer
maintenance releases for the Mirantis Kubernetes Engine (MKE) software every
six to eight weeks. Primarily, these maintenance releases will aim to resolve
known issues and issues reported by customers, quash CVEs, and reduce technical
debt. The version of each MKE maintenance release is reflected in the third
digit position of the version number (as an example, for MKE 3.7 the most
current maintenance release is MKE 3.7.16).
In parallel with our maintenance MKE release work, each year Mirantis will
develop and release a new major version of MKE, the Mirantis support lifespan
of which will adhere to our legacy two year standard.
End of Life Date
The End of Life (EOL) date for MKE 3.6 is 2024-OCT-13.
The MKE team will make every effort to hold to the release cadence stated here.
Customers should be aware, though, that development and release cycles can
change, and without advance notice.
A Technology Preview feature provides early access to upcoming product
innovations, allowing customers to experiment with the functionality and
provide feedback.
Technology Preview features may be privately or publicly available and neither
are intended for production use. While Mirantis will provide assistance with
such features through official channels, normal Service Level Agreements do not
apply.
As Mirantis considers making future iterations of Technology Preview features
generally available, we will do our best to resolve any issues that customers
experience when using these features.
During the development of a Technology Preview feature, additional components
may become available to the public for evaluation. Mirantis cannot guarantee
the stability of such features. As a result, if you are using Technology
Preview features, you may not be able to seamlessly upgrade to subsequent
product releases.
Mirantis makes no guarantees that Technology Preview features will graduate to
generally available features.