Introduction ¶

Reference Architecture¶

The MKE Reference Architecture provides a technical overview of Mirantis Kubernetes Engine (MKE). It is your source for the product hardware and software specifications, standards, component information, and configuration detail.

Introduction to MKE¶

Mirantis Kubernetes Engine (MKE) allows you to adopt modern application development and delivery models that are cloud-first and cloud-ready. With MKE you get a centralized place with a graphical UI to manage and monitor your Kubernetes and/or Swarm cluster instance.

The core MKE components are:

ucp-cluster-agent
Reconciles the cluster-wide state, including Kubernetes addons such as Kubecompose and KubeDNS, managing replication configurations for the etcd and RethinkDB clusters, and syncing the node inventories of SwarmKit and Swarm Classic. This component is a single-replica service that runs on any manager node in the cluster.
ucp-manager-agent
Reconciles the node-local state on manager nodes, including the configuration of the local Docker daemon, local data volumes, certificates, and local container components. Each manager node in the cluster runs a task from this service.
ucp-worker-agent
Performs the same reconciliation operations as ucp-manager-agent but on worker nodes. This component runs a task on each worker node.

The following MKE component names differ based on the node’s operating system:

Component name on Linux	Component name on Windows
`ucp-worker-agent`	`ucp-worker-agent-win`
`ucp-containerd-shim-process`	`ucp-containerd-shim-process-win`
`ucp-dsinfo`	`ucp-dsinfo-win`
No equivalent	`ucp-kube-binaries-win`
`ucp-pause`	`ucp-pause-win`

MKE hardware requirements¶

Take careful note of the minimum and recommended hardware requirements for MKE manager and worker nodes prior to deployment.

Note

High availability (HA) installations require transferring files between hosts.
On manager nodes, MKE only supports the workloads it requires to run.
Windows container images are typically larger than Linux container images. As such, provision more local storage for Windows nodes and for any MSR repositories that store Windows container images.

Minimum and recommended hardware requirements¶
	Manager nodes	Worker nodes
Minimum hardware requirements	16 GB of RAM; 2 vCPUs; 79 GB available storage, either unpartitioned for the `/var` partition OR partitioned as follows: 25 GB for a single `/var/` partition, 25 GB for `/var/lib/kubelet/` (for installations and future upgrades), 25 GB for `/var/lib/docker/`, and 4 GB for `/var/lib/containerd/`	4 GB of RAM; 15 GB storage for the `/var/` partition
Recommended hardware requirements	24 - 32 GB of RAM; 4 vCPUs; at least 79 GB available storage, partitioned as follows: 25 GB for a single `/var/` partition, 25 GB for `/var/lib/kubelet/` (for installations and future upgrades), 25 GB for `/var/lib/docker/`, and 4 GB for `/var/lib/containerd/`	Recommendations vary depending on the workloads.
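
As a rough pre-deployment aid, the minimum values in the table can be encoded and compared against a node's specifications. The following is a minimal sketch only: the key names (`ram_gb`, `vcpus`, `var_storage_gb`) and the `check_node` helper are hypothetical, and the sketch covers only the unpartitioned `/var` storage case.

```python
# Minimum values copied from the table above. The key names are illustrative
# and are not part of any MKE tooling.
MINIMUMS = {
    "manager": {"ram_gb": 16, "vcpus": 2, "var_storage_gb": 79},
    "worker": {"ram_gb": 4, "var_storage_gb": 15},
}


def check_node(role, specs):
    """Return a list of shortfalls for a node; an empty list means it passes."""
    problems = []
    for key, minimum in MINIMUMS[role].items():
        actual = specs.get(key, 0)  # a missing value counts as zero
        if actual < minimum:
            problems.append(f"{key}: have {actual}, need at least {minimum}")
    return problems


if __name__ == "__main__":
    print(check_node("manager", {"ram_gb": 8, "vcpus": 2, "var_storage_gb": 100}))
```

A node that meets the minimums still may not suit production workloads, so treat a clean result as a lower bound rather than a sizing recommendation.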

MKE software requirements¶

Prior to MKE deployment, consider the following software requirements:

Run the same MCR version (20.10.0 or later) on all nodes.
Run Linux kernel 3.10 or higher on all nodes.

For debugging purposes, the host OS kernel versions should match as closely as possible.
Use a static IP address for each node in the cluster.
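
The version requirements above can be checked mechanically once the MCR and kernel versions have been collected from each node. The sketch below assumes a hypothetical inventory format (a list of dictionaries with `name`, `mcr`, and `kernel` keys); how you gather those values is outside its scope.

```python
import re


def parse_version(text):
    """Extract leading numeric components, e.g. '5.15.0-91-generic' -> (5, 15, 0)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if not match:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def check_cluster(nodes):
    """Return a list of requirement violations for the given node inventory."""
    problems = []
    # All nodes must run the same MCR version.
    mcr_versions = {node["mcr"] for node in nodes}
    if len(mcr_versions) > 1:
        problems.append(f"mixed MCR versions: {sorted(mcr_versions)}")
    for node in nodes:
        if parse_version(node["mcr"]) < (20, 10):
            problems.append(f"{node['name']}: MCR {node['mcr']} is older than 20.10.0")
        if parse_version(node["kernel"]) < (3, 10):
            problems.append(f"{node['name']}: kernel {node['kernel']} is older than 3.10")
    return problems


if __name__ == "__main__":
    inventory = [
        {"name": "mgr1", "mcr": "20.10.17", "kernel": "5.15.0-91-generic"},
        {"name": "wrk1", "mcr": "20.10.17", "kernel": "3.10.0-1160.el7.x86_64"},
    ]
    print(check_cluster(inventory))
```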

Manager nodes¶

Manager nodes manage a swarm and persist the swarm state. Using several containers per node, the ucp-manager-agent automatically deploys all MKE components on manager nodes, including the MKE web UI and the data stores that MKE uses.

Note

Some Kubernetes components are run as Swarm services because the MKE control plane is itself a Docker Swarm cluster.

The following tables detail the MKE services that run on manager nodes:

Swarm services¶
MKE component	Description
`ucp-auth-api`	The centralized service for identity and authentication used by MKE and MSR.
`ucp-auth-store`	A container that stores authentication configurations and data for users, organizations, and teams.
`ucp-auth-worker`	A container that performs scheduled LDAP synchronizations and cleans authentication and authorization data.
`ucp-client-root-ca`	A certificate authority to sign client bundles.
`ucp-cluster-agent`	The agent that monitors the cluster-wide MKE components. Runs on only one manager node.
`ucp-cluster-root-ca`	A certificate authority used for TLS communication between MKE components.
`ucp-controller`	The MKE web server.
`ucp-hardware-info`	A container for collecting disk/hardware information about the host.
`ucp-interlock`	A container that monitors Swarm workloads configured to use layer 7 routing. Only runs when you enable layer 7 routing.
`ucp-interlock-config`	A service that manages Interlock configuration.
`ucp-interlock-extension`	A service that verifies the run status of the Interlock extension.
`ucp-interlock-proxy`	A service that provides load balancing and proxying for Swarm workloads. Runs only when layer 7 routing is enabled.
`ucp-kube-apiserver`	A master component that serves the Kubernetes API. It persists its state in `etcd` directly, and all other components communicate directly with the API server. The Kubernetes API server is configured to encrypt Secrets using AES-CBC with a 256-bit key. The encryption key is never rotated and is stored in a file on disk on manager nodes.
`ucp-kube-controller-manager`	A master component that manages the desired state of controllers and other Kubernetes objects. It monitors the API server and performs background tasks when needed.
`ucp-kubelet`	The Kubernetes node agent running on every node, which is responsible for running Kubernetes pods, reporting the health of the node, and monitoring resource usage.
`ucp-kube-proxy`	The networking proxy running on every node, which enables pods to contact Kubernetes services and other pods by way of cluster IP addresses.
`ucp-kube-scheduler`	A master component that manages Pod scheduling, which communicates with the API server only to obtain workloads that need to be scheduled.
`ucp-kv`	A container used to store the MKE configurations. Do not use it in your applications, as it is for internal use only. Also used by Kubernetes components.
`ucp-manager-agent`	The agent that monitors the manager node and ensures that the right MKE services are running.
`ucp-proxy`	A TLS proxy that allows secure access from the local Mirantis Container Runtime to MKE components.
`ucp-sf-notifier`	A Swarm service that sends notifications to Salesforce when alerts are configured by OpsCare, and later when they are triggered.
`ucp-swarm-manager`	A container used to provide backward compatibility with Docker Swarm.

Kubernetes components¶
MKE component	Description
`cri-dockerd-mke`	An MKE service that accounts for the removal of dockershim from Kubernetes as of version 1.24, thus enabling MKE to continue using Docker as the container runtime.
`k8s_calico-kube-controllers`	A cluster-scoped Kubernetes controller used to coordinate Calico networking. Runs on one manager node only.
`k8s_calico-node`	The Calico node agent, which coordinates networking fabric according to the cluster-wide Calico configuration. Part of the `calico-node` DaemonSet. Runs on all nodes. Configure the container network interface (CNI) plugin using the `--cni-installer-url` flag. If this flag is not set, MKE uses Calico as the default CNI plugin.
`k8s_enable-strictaffinity`	An init container for Calico controller that sets the StrictAffinity in Calico networking according to the configured boolean value.
`k8s_firewalld-policy_calico-node`	An init container for `calico-node` that verifies whether systems with firewalld are compatible with Calico.
`k8s_install-cni_calico-node`	A container in which the Calico CNI plugin binaries are installed and configured on each host. Part of the calico-node DaemonSet. Runs on all nodes.
`k8s_ucp-coredns_coredns`	The CoreDNS plugin, which provides service discovery for Kubernetes services and Pods.
`k8s_ucp-gatekeeper_gatekeeper-controller-manager`	The Gatekeeper manager controller for Kubernetes that provides policy enforcement. Only runs when OPA Gatekeeper is enabled in MKE.
`k8s_ucp-gatekeeper-audit_gatekeeper-audit`	The audit controller for Kubernetes that provides audit functionality of OPA Gatekeeper. Only runs when OPA Gatekeeper is enabled in MKE.
`k8s_ucp-kube-compose`	A custom Kubernetes resource component that translates Compose files into Kubernetes constructs. Part of the Compose deployment. Runs on one manager node only.
`k8s_ucp-kube-compose-api`	The API server for Kube Compose, which is part of the compose deployment. Runs on one manager node only.
`k8s_ucp-kube-ingress-controller`	The Ingress controller for Kubernetes, which provides layer 7 routing for Kubernetes services. Only runs with Ingress for Kubernetes enabled.
`k8s_ucp-metrics-inventory`	A container that generates the inventory targets for Prometheus server. Part of the Kubernetes Prometheus Metrics plugin.
`k8s_ucp-metrics-prometheus`	A container used to collect and process metrics for a node. Part of the Kubernetes Prometheus Metrics plugin.
`k8s_ucp-metrics-proxy`	A container that runs a proxy for the metrics server. Part of the Kubernetes Prometheus Metrics plugin.
`k8s_ucp-node-feature-discovery-master`	A container that provides node feature discovery labels for Kubernetes nodes.
`k8s_ucp-node-feature-discovery-worker`	A container that provides node feature discovery labels for Kubernetes nodes.
`k8s_ucp-nvidia-device-partitioner`	A container that provides support for Multi Instance GPU (MIG) on NVIDIA GPUs.
`k8s_ucp-secureoverlay-agent`	A container that provides a per-node service that manages the encryption state of the data plane.
`k8s_POD_ucp-secureoverlay-mgr`	A container that provides the key management process that configures and periodically rotates the encryption keys.

Kubernetes pause containers¶
MKE component	Description
`k8s_POD_calico-node`	The pause container for the `calico-node` pod.
`k8s_POD_calico-kube-controllers`	The pause container for the `calico-kube-controllers` pod.
`k8s_POD_compose`	The pause container for the `compose` pod.
`k8s_POD_compose-api`	The pause container for `ucp-kube-compose-api`.
`k8s_POD_coredns`	The pause container for the ucp-coredns Pod.
`k8s_POD_ingress-nginx-controller`	The pause container for `ucp-kube-ingress-controller`.
`k8s_POD_gatekeeper-audit`	The pause container for `ucp-gatekeeper-audit`.
`k8s_POD_gatekeeper-controller-manager`	The pause container for `ucp-gatekeeper`.
`k8s_POD_ucp-metrics`	The pause container for `ucp-metrics`.
`k8s_POD_ucp-node-feature-discovery`	The pause container for the node feature discovery labels on Kubernetes nodes.
`k8s_POD_ucp-nvidia-device-partitioner`	A pause container for `ucp-nvidia-device-partitioner`.
`k8s_ucp-pause_ucp-nvidia-device-partitioner`	A pause container for `ucp-nvidia-device-partitioner`.

Worker nodes¶

Worker nodes are instances of MCR that participate in a swarm for the purpose of executing containers. Such nodes receive and execute tasks dispatched from manager nodes. Worker nodes require at least one manager node in the swarm, as they do not participate in the Raft distributed state, perform scheduling, or serve the swarm mode HTTP API.

Note

Some Kubernetes components are run as Swarm services because the MKE control plane is itself a Docker Swarm cluster.

The following tables detail the MKE services that run on worker nodes.

Swarm services¶
MKE component	Description
`ucp-hardware-info`	A container for collecting host information regarding disks and hardware.
`ucp-interlock-config`	A service that manages Interlock configuration.
`ucp-interlock-extension`	A helper service that reconfigures the `ucp-interlock-proxy` service, based on the Swarm workloads that are running.
`ucp-interlock-proxy`	A service that provides load balancing and proxying for swarm workloads. Only runs when you enable layer 7 routing.
`ucp-kube-proxy`	The networking proxy running on every node, which enables Pods to contact Kubernetes services and other Pods through cluster IP addresses. Named `ucp-kube-proxy-win` in Windows systems.
`ucp-kubelet`	The Kubernetes node agent running on every node, which is responsible for running Kubernetes Pods, reporting the health of the node, and monitoring resource usage. Named `ucp-kubelet-win` in Windows systems.
`ucp-pod-cleaner-win`	A service that removes all the Kubernetes Pods that remain once Kubernetes components are removed from Windows nodes. Runs only on Windows nodes.
`ucp-proxy`	A TLS proxy that allows secure access from the local Mirantis Container Runtime to MKE components.
`ucp-tigera-node-win`	The Calico node agent that coordinates networking fabric for Windows nodes according to the cluster-wide Calico configuration. Runs on Windows nodes when Kubernetes is set as the orchestrator.
`ucp-tigera-felix-win`	A Calico component that runs on every machine that provides endpoints. Runs on Windows nodes when Kubernetes is set as the orchestrator.
`ucp-worker-agent-x` and `ucp-worker-agent-y`	A service that monitors the worker node and ensures that the correct MKE services are running. The `ucp-worker-agent` service ensures that only authorized users and other MKE services can run Docker commands on the node. The `ucp-worker-agent-<x/y>` deploys a set of containers onto worker nodes, which is a subset of the containers that `ucp-manager-agent` deploys onto manager nodes. This component is named `ucp-worker-agent-win-<x/y>` on Windows nodes.

Kubernetes components¶
MKE component	Description
`cri-dockerd-mke`	An MKE service that accounts for the removal of dockershim from Kubernetes as of version 1.24, thus enabling MKE to continue using Docker as the container runtime.
`k8s_calico-node`	The Calico node agent that coordinates networking fabric according to the cluster-wide Calico configuration. Part of the calico-node DaemonSet. Runs on all nodes.
`k8s_firewalld-policy_calico-node`	An init container for `calico-node` that verifies whether systems with firewalld are compatible with Calico.
`k8s_install-cni_calico-node`	A container that installs the Calico CNI plugin binaries and configuration on each host. Part of the `calico-node` DaemonSet. Runs on all nodes.
`k8s_ucp-node-feature-discovery-master`	A container that provides node feature discovery labels for Kubernetes nodes.
`k8s_ucp-node-feature-discovery-worker`	A container that provides node feature discovery labels for Kubernetes nodes.
`k8s_ucp-nvidia-device-partitioner`	A container that provides support for Multi Instance GPU (MIG) on NVIDIA GPUs.
`k8s_ucp-secureoverlay-agent`	A container that provides a per-node service that manages the encryption state of the data plane.

Kubernetes pause containers¶
MKE component	Description
`k8s_POD_calico-node`	The pause container for the `calico-node` Pod. This container is hidden by default, but you can see it by running the following command: `docker ps -a`
`k8s_POD_ucp-node-feature-discovery`	The pause container for the node feature discovery labels on Kubernetes nodes.
`k8s_POD_ucp-nvidia-device-partitioner`	The pause container for `ucp-nvidia-device-partitioner`.
`k8s_ucp-pause_ucp-nvidia-device-partitioner`	The pause container for `ucp-nvidia-device-partitioner`.

Admission controllers¶

Admission controllers are plugins that govern and enforce cluster usage. There are two types of admission controllers: default and custom. The tables below list the available admission controllers. For more information, see Kubernetes documentation: Using Admission Controllers.

Note

You cannot enable or disable custom admission controllers.

Default admission controllers¶
Name	Description
`DefaultStorageClass`	Adds a default storage class to `PersistentVolumeClaim` objects that do not request a specific storage class.
`DefaultTolerationSeconds`	Sets the default forgiveness toleration for pods that do not already have a toleration for the `node.kubernetes.io/not-ready:NoExecute` or `node.kubernetes.io/unreachable:NoExecute` taints, so that they tolerate the `notready:NoExecute` and `unreachable:NoExecute` taints for the durations set by the `default-not-ready-toleration-seconds` and `default-unreachable-toleration-seconds` Kubernetes API server input parameters. The default value for both input parameters is five minutes.
`LimitRanger`	Ensures that incoming requests do not violate the constraints in a namespace `LimitRange` object.
`MutatingAdmissionWebhook`	Calls any mutating webhooks that match the request.
`NamespaceLifecycle`	Ensures that users cannot create new objects in namespaces undergoing termination and that MKE rejects requests in nonexistent namespaces. It also prevents users from deleting the reserved `default`, `kube-system`, and `kube-public` namespaces.
`NodeRestriction`	Limits the `Node` and `Pod` objects that a kubelet can modify.
`PersistentVolumeLabel` (deprecated)	Attaches region or zone labels automatically to `PersistentVolumes` as defined by the cloud provider.
`PodNodeSelector`	Limits which node selectors can be used within a namespace by reading a namespace annotation and a global configuration.
`PodSecurityPolicy`	Determines whether a new or modified pod should be admitted based on the requested security context and the available Pod Security Policies.
`ResourceQuota`	Observes incoming requests and ensures they do not violate any of the constraints in a namespace `ResourceQuota` object.
`ServiceAccount`	Implements automation for `ServiceAccount` resources.
`ValidatingAdmissionWebhook`	Calls any validating webhooks that match the request.

Custom admission controllers¶
Name	Description
`UCPAuthorization`	Annotates Docker Compose-on-Kubernetes `Stack` resources with the identity of the user performing the request so that the Docker Compose-on-Kubernetes resource controller can manage `Stacks` with correct user authorization. Detects the deleted `ServiceAccount` resources to correctly remove them from the scheduling authorization backend of an MKE node. Simplifies creation of the `RoleBindings` and `ClusterRoleBindings` resources by automatically converting user, organization, and team Subject names into their corresponding unique identifiers. Prevents users from deleting the built-in `cluster-admin`, `ClusterRole`, or `ClusterRoleBinding` resources. Prevents under-privileged users from creating or updating `PersistentVolume` resources with host paths. Works in conjunction with the built-in `PodSecurityPolicies` admission controller to prevent under-privileged users from creating `Pods` with privileged options. To grant non-administrators and non-cluster-admins access to privileged attributes, refer to Use admission controllers for access in the MKE Operations Guide.
`CheckImageSigning`	Enforces the MKE Docker Content Trust policy which, if enabled, requires that all pods use container images that have been digitally signed by trusted and authorized users who are members of one or more teams in MKE.
`UCPNodeSelector`	Adds a `com.docker.ucp.orchestrator.kubernetes:*` toleration to pods in the kube-system namespace and removes the `com.docker.ucp.orchestrator.kubernetes` tolerations from pods in other namespaces. This ensures that user workloads do not run on swarm-only nodes, which MKE taints with `com.docker.ucp.orchestrator.kubernetes:NoExecute`. It also adds a node affinity to prevent pods from running on manager nodes depending on MKE settings.

Pause containers¶

Every Kubernetes Pod includes an empty pause container, which bootstraps the Pod to establish all of the cgroups, reservations, and namespaces before its individual containers are created. The pause container image is always present, so the pod resource allocation happens instantaneously as containers are created.

To display pause containers:

When using the client bundle, pause containers are hidden by default.

To display pause containers when using the client bundle:
```
docker ps -a | grep -i pause
```

To display pause containers when not using the client bundle:

Log in to a manager or worker node.
Display pause containers:
```
docker ps | grep -i pause
```

Example output on a manager node:

5aeeafb80e8f   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_calico-kube-controllers-86565cb444-rwlrd_kube-system_fdd491cc-94e4-4510-a080-396454f2798c_0
ea4a1263398d   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-node-feature-discovery-59btp_node-feature-discovery_ef7a6f29-e3d4-4430-9c75-22940208f616_0
951f6622f8de   d50ea4c05222               "/pause"   2 hours ago   Up 2 hours   k8s_ucp-pause_ucp-nvidia-device-partitioner-77qq5_kube-system_59d95409-721e-48f3-9524-97f1d30e63a4_0
f99ab238282e   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-nvidia-device-partitioner-77qq5_kube-system_59d95409-721e-48f3-9524-97f1d30e63a4_0
eec3d297e7a2   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-metrics-6sf2z_kube-system_de4f67d3-99cc-4d00-a4f1-ccad66c31ebc_0
5a40fdc669b1   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_compose-api-cb58448cc-xfb5g_kube-system_d1c8c8d2-9b81-475f-9cd4-9f486d3ace97_0
8e5897a13cd6   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_coredns-9d5479b97-gmwct_kube-system_9c89d798-ff47-4194-b5d3-1ba3368698bc_0
d308274689a4   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_coredns-9d5479b97-sxnb2_kube-system_74a70909-771c-4dce-9518-d49129e3645c_0
c45bf83d032a   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_compose-69d4dc8c69-f56ql_kube-system_64646ec3-f9e8-4cce-aeb1-37636e1858ce_0
c32ea1407b28   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_calico-node-j9fmw_kube-system_0939bae3-0659-4608-8547-0b4095d99cc5_0

Example output on a worker node:

c5e836c38435   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-node-feature-discovery-wztkl_node-feature-discovery_efe87dc1-349e-47f2-a98f-67d4675f6d9b_0
0f66550f654e   d50ea4c05222               "/pause"   2 hours ago   Up 2 hours   k8s_ucp-pause_ucp-nvidia-device-partitioner-bq5th_kube-system_873f6045-8f61-4a55-9d00-d55b27e8f2c9_0
753efca985ef   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_calico-node-xx28v_kube-system_6c024ae0-8d27-4d89-a327-8ca635b37f79_0
7f2eda992ea6   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-nvidia-device-partitioner-bq5th_kube-system_873f6045-8f61-4a55-9d00-d55b27e8f2c9_0
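
The container names in the example output appear to follow the layout `k8s_<container>_<pod>_<namespace>_<uid>_<attempt>`, where the container field is `POD` for the Pod-level pause container. The sketch below splits a name on that assumption; the `parse_k8s_name` helper and the `K8sContainerName` type are hypothetical, and the layout is inferred from the output above rather than from a documented MKE interface.

```python
from typing import NamedTuple


class K8sContainerName(NamedTuple):
    container: str  # "POD" for the Pod-level pause container
    pod: str
    namespace: str
    uid: str
    attempt: int


def parse_k8s_name(name):
    """Split a Docker container name of the assumed k8s_ layout into its fields."""
    parts = name.split("_")
    if len(parts) != 6 or parts[0] != "k8s":
        raise ValueError(f"unexpected container name layout: {name!r}")
    _, container, pod, namespace, uid, attempt = parts
    return K8sContainerName(container, pod, namespace, uid, int(attempt))


if __name__ == "__main__":
    sample = ("k8s_POD_calico-node-xx28v_kube-system_"
              "6c024ae0-8d27-4d89-a327-8ca635b37f79_0")
    print(parse_k8s_name(sample))
```

Splitting on underscores works here because Kubernetes Pod, namespace, and container names cannot themselves contain underscores.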

See also

Kubernetes Pods

Volumes¶

MKE uses named volumes to persist data on all nodes on which it runs.

Volumes used by MKE manager nodes¶
Volume name	Contents
`ucp-auth-api-certs`	Certificate and keys for the authentication and authorization service.
`ucp-auth-store-certs`	Certificate and keys for the authentication and authorization store.
`ucp-auth-store-data`	Data of the authentication and authorization store, replicated across managers.
`ucp-auth-worker-certs`	Certificate and keys for authentication worker.
`ucp-auth-worker-data`	Data of the authentication worker.
`ucp-client-root-ca`	Root key material for the MKE root CA that issues client certificates.
`ucp-cluster-root-ca`	Root key material for the MKE root CA that issues certificates for swarm members.
`ucp-controller-client-certs`	Certificate and keys that the MKE web server uses to communicate with other MKE components.
`ucp-controller-server-certs`	Certificate and keys for the MKE web server running in the node.
`ucp-kv`	MKE configuration data, replicated across managers.
`ucp-kv-certs`	Certificates and keys for the key-value store.
`ucp-metrics-data`	Monitoring data that MKE gathers.
`ucp-metrics-inventory`	Configuration file that the `ucp-metrics` service uses.
`ucp-node-certs`	Certificate and keys for node communication.
`ucp-backup`	Backup artifacts that are created while processing a backup. The artifacts persist on the volume for the duration of the backup and are cleaned up when the backup completes, though the volume itself remains.
`mke-containers`	Symlinks to MKE component log files, created by `ucp-agent`.
`ucp-kube-apiserver-audit`	Audit logs streamed by `kube-apiserver` container.

Volumes used by MKE worker nodes¶
Volume name	Contents
`ucp-node-certs`	Certificate and keys for node communication.
`mke-containers`	Symlinks to MKE component log files, created by `ucp-agent`.

You can customize the volume driver for the volumes by creating the volumes prior to installing MKE. During installation, MKE determines which volumes do not yet exist on the node and creates those volumes using the default volume driver.

By default, MKE stores the data for these volumes at /var/lib/docker/volumes/<volume-name>/_data.
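
To illustrate the behavior described above, the following sketch computes which worker-node volumes would still need to be created on a node and where their data would live by default. The helper names are hypothetical, and the list of existing volumes is assumed to have been gathered separately (for example, from the output of `docker volume ls`).

```python
from pathlib import PurePosixPath

# Volume names taken from the worker-node table above.
WORKER_VOLUMES = ["ucp-node-certs", "mke-containers"]


def volumes_to_create(existing, expected=WORKER_VOLUMES):
    """Names from `expected` that are not yet present, in table order."""
    present = set(existing)
    return [name for name in expected if name not in present]


def default_data_path(volume_name, docker_root="/var/lib/docker"):
    """Default on-disk location of the data for a named volume."""
    return PurePosixPath(docker_root) / "volumes" / volume_name / "_data"


if __name__ == "__main__":
    for name in volumes_to_create(["ucp-node-certs"]):
        print(name, default_data_path(name))
```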

Configuration¶

The table below presents the configuration files in use by MKE:

Configuration files in use by MKE¶
Configuration file name	Description
`com.docker.interlock.extension`	Configuration of the Interlock extension service that monitors and configures the proxy service
`com.docker.interlock.proxy`	Configuration of the service that handles and routes user requests
`com.docker.license`	MKE license
`com.docker.ucp.interlock.conf`	Configuration of the core Interlock service

Web UI and CLI¶

You can interact with MKE either through the web UI or the CLI.

With the MKE web UI you can manage your swarm, grant and revoke user permissions, and deploy, configure, manage, and monitor your applications.

In addition, MKE exposes the standard Docker API, so you can continue using such existing tools as the Docker CLI client. As MKE secures your cluster with RBAC, you must configure your Docker CLI client and other client tools to authenticate your requests using client certificates that you can download from your MKE profile page.

Role-based access control¶

MKE allows administrators to authorize users to view, edit, and use cluster resources by granting role-based permissions for specific resource sets.

To authorize access to cluster resources across your organization, high-level actions that MKE administrators can take include the following:

Add and configure subjects (users, teams, organizations, and service accounts).
Define custom roles (or use defaults) by adding permitted operations per resource type.
Group cluster resources into resource sets of Swarm collections or Kubernetes namespaces.
Create grants by combining subject, role, and resource set.

Note

Only administrators can manage role-based access control (RBAC).

The following table describes the core elements used in RBAC:

Element	Description
Subjects	Subjects are granted roles that define the permitted operations for one or more resource sets. Subject types include: User: a person authenticated by the authentication backend. Users can belong to more than one team and more than one organization. Team: a group of users that share permissions defined at the team level. A team can be in only one organization. Organization: a group of teams that share a specific set of permissions, defined by the roles of the organization. Service account: a Kubernetes object that enables a workload to access cluster resources assigned to a namespace.
Roles	Roles define what operations can be done by whom. A role is a set of permitted operations for a type of resource, such as a container or volume. It is assigned to a user or a team with a grant. For example, the built-in `Restricted Control` role includes permissions to view and schedule but not to update nodes, whereas a custom role may include permissions to read, write, and execute (`r-w-x`) volumes and secrets. Most organizations use multiple roles to fine-tune the appropriate access for different subjects. Users and teams may have different roles for the different resources they access.
Resource sets	Users can group resources into two types of resource sets to control user access: Docker Swarm collections and Kubernetes namespaces. Docker Swarm collections: Collections have a directory-like structure that holds Swarm resources. You can create collections in MKE by defining a directory path and moving resources into it. Alternatively, you can use `labels` in your YAML file to assign application resources to the path. Resource types that users can access in Swarm collections include containers, networks, nodes, services, secrets, and volumes. Each Swarm resource can be in only one collection at a time, but collections can be nested inside one another to a maximum depth of two layers. Collection permission includes permission for child collections. For child collections and users belonging to more than one team, the system concatenates permissions from multiple roles into an `effective role` for the user, which specifies the operations that are allowed for the target. Kubernetes namespaces: Namespaces are virtual clusters that allow multiple teams to access a given cluster with different permissions. Kubernetes automatically sets up four namespaces, and users can add more as necessary, though unlike Swarm collections they cannot be nested. Resource types that users can access in Kubernetes namespaces include pods, deployments, network policies, nodes, services, secrets, and more. Note: MKE uses two default security policies: privileged and unprivileged. To prevent users from bypassing the MKE security model, only administrators and service accounts granted the `cluster-admin` `ClusterRole` for all Kubernetes namespaces through a `ClusterRoleBinding` can deploy pods with privileged options. Refer to Default Pod security policies in MKE for more information.
Grants	Grants consist of a subject, role, and resource set, and define how specific users can access specific resources. All the grants of an organization taken together constitute an access control list (ACL), which is a comprehensive access policy for the organization.
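
The relationship between subjects, roles, resource sets, and grants can be pictured as a small data model. The sketch below is illustrative only and is not how MKE implements RBAC: the class names, the operation strings, and the path-prefix rule for child collections are all assumptions made for the example. It combines the roles from every matching grant into the set of operations that make up an effective role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    operations: frozenset  # permitted operations (hypothetical strings)


@dataclass(frozen=True)
class Grant:
    subject: str       # user, team, organization, or service account
    role: Role
    resource_set: str  # Swarm collection path or Kubernetes namespace


def covers(granted, target):
    """A grant on a collection also applies to its child collections."""
    return target == granted or target.startswith(granted.rstrip("/") + "/")


def effective_operations(grants, subjects, resource_set):
    """Union of operations from every grant that matches the subjects and target."""
    operations = set()
    for grant in grants:
        if grant.subject in subjects and covers(grant.resource_set, resource_set):
            operations |= grant.role.operations
    return operations


if __name__ == "__main__":
    view = Role("view-only", frozenset({"view"}))
    schedule = Role("scheduler", frozenset({"schedule"}))
    grants = [
        Grant("team-ops", view, "/prod"),
        Grant("team-dev", schedule, "/prod/web"),
    ]
    # A user who belongs to both teams, acting on the child collection.
    print(sorted(effective_operations(grants, {"team-ops", "team-dev"}, "/prod/web")))
```

Because Kubernetes namespaces cannot be nested, the prefix rule only ever matters for Swarm collection paths in this model.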

For complete information on how to configure and use role-based access control in MKE, refer to Authorize role-based access.

MKE limitations¶

MKE does not support user namespaces for nodes.
Due to a Kubernetes limitation, MKE containers do not run in Hyper-V isolation mode on Windows.

For more information on Hyper-V isolation and Kubernetes, refer to:
- Windows Container Essentials: Isolation Modes.
- Intro to Windows support in Kubernetes: Hyper-V isolation.

Installation Guide¶

The MKE Installation Guide provides everything you need to install and configure Mirantis Kubernetes Engine (MKE). The guide offers detailed information, procedures, and examples that are specifically designed to help DevOps engineers and administrators install and configure the MKE container orchestration platform.

Plan the deployment¶

Default install directories¶

The following table details the default MKE install directories:

Path	Description
`/var/lib/docker`	Docker data root directory
`/var/lib/kubelet`	kubelet data root directory (created with `ftype = 1`)
`/var/lib/containerd`	containerd data root directory (created with `ftype = 1`)

Host name strategy¶

Before installing MKE, plan a single host name strategy to use consistently throughout the cluster, keeping in mind that MKE and MCR both use host names.

There are two general strategies for creating host names: short host names and fully qualified domain names (FQDN). Consider the following examples:

Short host name: engine01
Fully qualified domain name: node01.company.example.com
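
One simple way to confirm that a planned set of host names follows a single strategy is to classify each name and look for a mix. The sketch below uses a deliberately naive rule, treating any name that contains a dot as an FQDN; the function names are hypothetical.

```python
def hostname_style(name):
    """Classify a host name as 'fqdn' if it contains a dot, otherwise 'short'."""
    return "fqdn" if "." in name.strip(".") else "short"


def mixed_styles(hostnames):
    """Return True if the host names use more than one naming strategy."""
    return len({hostname_style(name) for name in hostnames}) > 1


if __name__ == "__main__":
    print(mixed_styles(["engine01", "node01.company.example.com"]))
```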

MCR considerations¶

A number of MCR considerations must be taken into account when deploying any MKE cluster.

default-address-pools¶

MCR uses three separate IP ranges for the docker0, docker_gwbridge, and ucp-bridge interfaces. By default, MCR assigns the first available subnet in default-address-pools (172.17.0.0/16) to docker0, the second (172.18.0.0/16) to docker_gwbridge, and the third (172.19.0.0/16) to ucp-bridge.

Note

The ucp-bridge bridge network specifically supports MKE component containers.

You can reassign the docker0, docker_gwbridge, and ucp-bridge subnets in default-address-pools. To do so, replace the relevant values in default-address-pools in the /etc/docker/daemon.json file, making sure that the setting includes at least three IP pools. Be aware that you must restart the docker.service to activate your daemon.json file edits.

By default, default-address-pools contains the following values:

{
  "default-address-pools": [
   {"base":"172.17.0.0/16","size":16}, <-- docker0
   {"base":"172.18.0.0/16","size":16}, <-- docker_gwbridge
   {"base":"172.19.0.0/16","size":16}, <-- ucp-bridge
   {"base":"172.20.0.0/16","size":16},
   {"base":"172.21.0.0/16","size":16},
   {"base":"172.22.0.0/16","size":16},
   {"base":"172.23.0.0/16","size":16},
   {"base":"172.24.0.0/16","size":16},
   {"base":"172.25.0.0/16","size":16},
   {"base":"172.26.0.0/16","size":16},
   {"base":"172.27.0.0/16","size":16},
   {"base":"172.28.0.0/16","size":16},
   {"base":"172.29.0.0/16","size":16},
   {"base":"172.30.0.0/16","size":16},
   {"base":"192.168.0.0/16","size":20}
   ]
 }
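As a rough pre-flight check for the three-pool requirement described above, the following Python sketch inspects a `default-address-pools` setting before you restart docker.service. The function name `check_pools` is hypothetical and is not part of MCR or MKE; the sketch assumes the text passed in is valid JSON, without the `<--` annotations shown in the listing.

```python
import ipaddress
import json


def check_pools(daemon_json_text):
    """Return a list of problems found in the default-address-pools setting."""
    config = json.loads(daemon_json_text)
    pools = config.get("default-address-pools", [])
    problems = []

    # MCR needs one pool each for docker0, docker_gwbridge, and ucp-bridge.
    if len(pools) < 3:
        problems.append(f"only {len(pools)} pool(s) defined, at least 3 required")

    for pool in pools:
        base = ipaddress.ip_network(pool["base"])
        # "size" is the prefix length of each subnet carved out of "base",
        # so it cannot be shorter than the prefix length of the base itself.
        if pool["size"] < base.prefixlen:
            problems.append(
                f"size /{pool['size']} is shorter than the prefix of base {base}"
            )
    return problems
```

An empty list means that the sketch found nothing to report; it does not confirm that the chosen ranges are free of conflicts with the rest of your network.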

The default-address-pools parameters¶
Parameter	Description
`default-address-pools`	The list of CIDR ranges used to allocate subnets for local bridge networks.
`base`	The CIDR range allocated for bridge networks in each IP address pool.
`size`	The CIDR netmask that determines the subnet size to allocate from the base pool. If the `size` matches the netmask of the `base`, then the pool contains one subnet. For example, `{"base":"172.17.0.0/16","size":16}` creates the subnet: `172.17.0.0/16` (`172.17.0.1` - `172.17.255.255`).

For example, {"base":"192.168.0.0/16","size":20} allocates /20 subnets from 192.168.0.0/16, including the following subnets for bridge networks:

192.168.0.0/20 (192.168.0.1 - 192.168.15.255)

192.168.16.0/20 (192.168.16.1 - 192.168.31.255)

192.168.32.0/20 (192.168.32.1 - 192.168.47.255)

192.168.48.0/20 (192.168.48.1 - 192.168.63.255)

192.168.64.0/20 (192.168.64.1 - 192.168.79.255)

…

192.168.240.0/20 (192.168.240.1 - 192.168.255.255)
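The subnet list above can be reproduced with the Python standard library. This is a minimal sketch for checking your own `base` and `size` values; the helper name `pool_subnets` is hypothetical.

```python
import ipaddress


def pool_subnets(pool):
    """List the subnets that a {"base": ..., "size": ...} pool can provide."""
    base = ipaddress.ip_network(pool["base"])
    # Split the base range into subnets whose prefix length equals "size".
    return [str(subnet) for subnet in base.subnets(new_prefix=pool["size"])]


subnets = pool_subnets({"base": "192.168.0.0/16", "size": 20})
print(len(subnets))   # number of /20 subnets available in the pool
print(subnets[0])     # first subnet
print(subnets[-1])    # last subnet
```

When `size` equals the netmask of `base`, as in `{"base":"172.17.0.0/16","size":16}`, the same helper returns a single subnet.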

docker0¶

MCR creates and configures the host system with the docker0 virtual network interface, an ethernet bridge through which all traffic between MCR and the container moves. MCR uses docker0 to handle all container routing. You can specify an alternative network interface when you start the container.

MCR allocates IP addresses from the docker0 configurable IP range to the containers that connect to docker0. The default IP range, or subnet, for docker0 is 172.17.0.0/16.

You can change the docker0 subnet in /etc/docker/daemon.json using the settings in the following table. Be aware that you must restart the docker.service to activate your daemon.json file edits.

Parameter

Description

default-address-pools

Modify the first pool in default-address-pools.

Caution

By default, MCR assigns the second pool to docker_gwbridge. If you modify the first pool such that the size does not match the base netmask, it can affect docker_gwbridge.

{
   "default-address-pools": [
         {"base":"172.17.0.0/16","size":16}, <-- Modify this value
         {"base":"172.18.0.0/16","size":16},
         {"base":"172.19.0.0/16","size":16},
         {"base":"172.20.0.0/16","size":16},
         {"base":"172.21.0.0/16","size":16},
         {"base":"172.22.0.0/16","size":16},
         {"base":"172.23.0.0/16","size":16},
         {"base":"172.24.0.0/16","size":16},
         {"base":"172.25.0.0/16","size":16},
         {"base":"172.26.0.0/16","size":16},
         {"base":"172.27.0.0/16","size":16},
         {"base":"172.28.0.0/16","size":16},
         {"base":"172.29.0.0/16","size":16},
         {"base":"172.30.0.0/16","size":16},
         {"base":"192.168.0.0/16","size":20}
   ]
}

fixed-cidr

Configures a CIDR range.

Customize the subnet for docker0 using standard CIDR notation. The default subnet is 172.17.0.0/16, the network gateway is 172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254 for your containers.

{
  "fixed-cidr": "172.17.0.0/16"
}

bip

Configures a gateway IP address and CIDR netmask of the docker0 network.

Customize the subnet for docker0 using the <gateway IP>/<CIDR netmask> notation. The default subnet is 172.17.0.0/16, the network gateway is 172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254 for your containers.

{
  "bip": "172.17.0.1/16"
}
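To see how a value in the `<gateway IP>/<CIDR netmask>` notation maps to a subnet, gateway, and usable address range, consider the following Python sketch. The helper name `describe_bip` is hypothetical, and the sketch only performs the address arithmetic; it does not read or write daemon.json.

```python
import ipaddress


def describe_bip(bip):
    """Derive the subnet, gateway, and usable range from a bip-style value."""
    iface = ipaddress.ip_interface(bip)
    net = iface.network
    return {
        "subnet": str(net),
        "gateway": str(iface.ip),
        # The first and last addresses of the subnet (network and broadcast)
        # are not usable as host addresses.
        "first_usable": str(net.network_address + 1),
        "last_usable": str(net.broadcast_address - 1),
    }


print(describe_bip("172.17.0.1/16"))
```

Because the gateway takes the first usable address in this example, the addresses left for containers start at the next one, which matches the 172.17.0.2 - 172.17.255.254 range given above.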

docker_gwbridge¶

The docker_gwbridge is a virtual network interface that connects overlay networks (including ingress) to individual MCR container networks. Initializing a Docker swarm or joining a Docker host to a swarm automatically creates docker_gwbridge in the kernel of the Docker host. The default docker_gwbridge subnet (172.18.0.0/16) is the second available subnet in default-address-pools.

To change the docker_gwbridge subnet, open daemon.json and modify the second pool in default-address-pools:

{
    "default-address-pools": [
       {"base":"172.17.0.0/16","size":16},
       {"base":"172.18.0.0/16","size":16}, <-- Modify this value
       {"base":"172.19.0.0/16","size":16},
       {"base":"172.20.0.0/16","size":16},
       {"base":"172.21.0.0/16","size":16},
       {"base":"172.22.0.0/16","size":16},
       {"base":"172.23.0.0/16","size":16},
       {"base":"172.24.0.0/16","size":16},
       {"base":"172.25.0.0/16","size":16},
       {"base":"172.26.0.0/16","size":16},
       {"base":"172.27.0.0/16","size":16},
       {"base":"172.28.0.0/16","size":16},
       {"base":"172.29.0.0/16","size":16},
       {"base":"172.30.0.0/16","size":16},
       {"base":"192.168.0.0/16","size":20}
   ]
}

Caution

Modifying the first pool to customize the docker0 subnet can affect the default docker_gwbridge subnet. Refer to docker0 for more information.
You can only customize the docker_gwbridge settings before you join the host to the swarm or after temporarily removing it.

Docker swarm¶

The default address pool that Docker Swarm uses for its overlay network is 10.0.0.0/8. If this pool conflicts with your current network implementation, you must use a custom IP address pool. Prior to installing MKE, specify your custom address pool using the --default-addr-pool option when initializing swarm.

Note

The Swarm default-addr-pool and MCR default-address-pools settings define two separate IP address ranges used for different purposes.

A node.Status.Addr of 0.0.0.0 can cause unexpected problems. To prevent any such issues, add the --advertise-addr flag to the docker swarm join command.

To resolve the 0.0.0.0 situation, initiate the following workaround:

Stop the docker daemon that has .Status.Addr 0.0.0.0.
In the /var/lib/docker/swarm/docker-state.json file, apply the correct node IP to AdvertiseAddr and LocalAddr.
Start the docker daemon.

Example result:

`{"LocalAddr":"","RemoteAddr":"10.200.200.10:2377","ListenAddr":"0.0.0.0:2377","AdvertiseAddr":"","DataPathAddr":"","DefaultAddressPool":null,"SubnetSize":0,"DataPathPort":0,"JoinInProgress":false,"FIPS":false}`

`{"LocalAddr":"10.200.200.13","RemoteAddr":"","ListenAddr":"0.0.0.0:2377","AdvertiseAddr":"10.200.200.13:2377","DataPathAddr":"","DefaultAddressPool":null,"SubnetSize":0,"DataPathPort":0,"JoinInProgress":false,"FIPS":false}`
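The file edit in the workaround can be sketched in Python as follows. This is an illustration only: the helper name `fix_swarm_addr` is hypothetical, the sketch assumes that the state file holds a single JSON object as in the example, and the port 2377 default is taken from the example rather than from any guarantee about your cluster. Stop the docker daemon and back up the file before changing it.

```python
import json


def fix_swarm_addr(state_text, node_ip, port=2377):
    """Return the state JSON with LocalAddr and AdvertiseAddr set to the node IP."""
    state = json.loads(state_text)
    state["LocalAddr"] = node_ip
    # AdvertiseAddr carries the port as well, as in the example result.
    state["AdvertiseAddr"] = f"{node_ip}:{port}"
    # All other keys are left untouched.
    return json.dumps(state, separators=(",", ":"))
```

The function works on text rather than on the file itself, so you can review the result before writing it back to /var/lib/docker/swarm/docker-state.json.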

Kubernetes¶

Kubernetes uses two internal IP ranges, either of which can overlap and conflict with the underlying infrastructure, thus requiring custom IP ranges.

The pod network: Either Calico or Azure IPAM services give each Kubernetes pod an IP address in the default 192.168.0.0/16 range. To customize this range, use the --pod-cidr flag with the ucp install command during MKE installation.
The services network: You can access Kubernetes services with a VIP in the default 10.96.0.0/16 Cluster IP range. To customize this range, during MKE installation, use the --service-cluster-ip-range flag with the ucp install command.

docker data-root¶

The storage path for such persisted data as images, volumes, and cluster state is docker data root (data-root in /etc/docker/daemon.json).

MKE clusters require that all nodes have the same docker data-root for the Kubernetes network to function correctly. In addition, if you change the data-root on all nodes, you must recreate the Kubernetes network configuration in MKE by running the following commands:

kubectl -n kube-system delete configmap/calico-config
kubectl -n kube-system delete ds/calico-node deploy/calico-kube-controllers

no-new-privileges¶

The no-new-privileges setting prevents the container application processes from gaining new privileges during the execution process.

For most Linux distributions, MKE supports setting no-new-privileges to true in the /etc/docker/daemon.json file. The parameter is not, however, supported on RHEL 7.9, CentOS 7.9, Oracle Linux 7.8, and Oracle Linux 7.9.

This option is not supported on Windows. It is a Linux kernel feature.

Device Mapper storage driver¶

MCR hosts that run the devicemapper storage driver use the loop-lvm configuration mode by default. This mode uses sparse files to build the thin pool used by image and container snapshots and is designed to work without any additional configuration.

Note

Mirantis recommends that you use direct-lvm mode in production environments in lieu of loop-lvm mode. direct-lvm mode is more efficient in its use of system resources than loop-lvm mode, and you can scale it as necessary.

For information on how to configure direct-lvm mode, refer to the Docker documentation, Use the Device Mapper storage driver.

Memory metrics reporting¶

To report accurate memory metrics, MCR requires that you enable specific kernel settings that are often disabled on Ubuntu and Debian systems. For detailed instructions on how to do this, refer to the Docker documentation, Your kernel does not support cgroup swap limit capabilities.

Perform pre-deployment configuration¶

Configure networking¶

A well-configured network is essential for the proper functioning of your MKE deployment. Pay particular attention to such key factors as IP address provisioning, port management, and traffic enablement.

IP considerations¶

Before installing MKE, adopt the following practices when assigning IP addresses:

Ensure that your network and nodes support using a static IPv4 address and assign one to every node.

Avoid IP range conflicts. The following table lists the recommended addresses you can use to avoid IP range conflicts:

Component	Subnet	Range	Recommended IP address
MCR	`default-address-pools`	CIDR range for interface and bridge networks	172.17.0.0/16 - 172.30.0.0/16, 192.168.0.0/16
Swarm	`default-addr-pool`	CIDR range for Swarm overlay networks	10.0.0.0/8
Kubernetes	`pod-cidr`	CIDR range for Kubernetes pods	192.168.0.0/16
Kubernetes	`service-cluster-ip-range`	CIDR range for Kubernetes services	10.96.0.0/16 Minimum: 10.96.0.0/24
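To check whether any of the default ranges in the table collide with networks that already exist in your infrastructure, you can run a quick overlap test. The following Python sketch is one way to do so; the names `MKE_DEFAULT_RANGES` and `find_conflicts` are hypothetical, and the infrastructure CIDRs in the usage lines are made-up examples.

```python
import ipaddress

# Default ranges transcribed from the table above.
MKE_DEFAULT_RANGES = {
    "MCR default-address-pools": (
        [f"172.{n}.0.0/16" for n in range(17, 31)] + ["192.168.0.0/16"]
    ),
    "Swarm default-addr-pool": ["10.0.0.0/8"],
    "Kubernetes pod-cidr": ["192.168.0.0/16"],
    "Kubernetes service-cluster-ip-range": ["10.96.0.0/16"],
}


def find_conflicts(infrastructure_cidrs):
    """Return (setting, default range, infrastructure range) for each overlap."""
    conflicts = []
    for setting, defaults in MKE_DEFAULT_RANGES.items():
        for default in defaults:
            default_net = ipaddress.ip_network(default)
            for cidr in infrastructure_cidrs:
                if default_net.overlaps(ipaddress.ip_network(cidr)):
                    conflicts.append((setting, default, cidr))
    return conflicts


# Hypothetical infrastructure networks, for illustration only.
for conflict in find_conflicts(["10.50.0.0/16", "203.0.113.0/24"]):
    print(conflict)
```

Any setting that the sketch reports is a candidate for a custom range, set with the option named in the Subnet column.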

See also

Kubernetes official documentation

Open ports to incoming traffic¶

When installing MKE on a host, you need to open specific ports to incoming traffic. Each port listens for incoming traffic from a particular set of hosts, known as the port scope.

MKE uses the following scopes:

Scope	Description
External	Traffic arrives from outside the cluster through end-user interaction.
Internal	Traffic arrives from other hosts in the same cluster.
Self	Traffic arrives only from processes on the same host. Self ports do not need to be open to outside traffic.

Open the following ports for incoming traffic on each host type:

Hosts	Port	Scope	Purpose
Managers, workers	TCP 179	Internal	BGP peers, used for Kubernetes networking
Managers	TCP 443 (configurable)	External, internal	MKE web UI and API
Managers	TCP 2376 (configurable)	Internal	Docker swarm manager, used for backwards compatibility
Managers	TCP 2377 (configurable)	Internal	Control communication between swarm nodes
Managers, workers	UDP 4789	Internal	Overlay networking
Managers	TCP 6443 (configurable)	External, internal	Kubernetes API server endpoint
Managers, workers	TCP 6444	Self	Kubernetes API reverse proxy
Managers, workers	TCP, UDP 7946	Internal	Gossip-based clustering
Managers, workers	TCP 9091	Self	Felix Prometheus `calico-node` metrics
Managers	TCP 9094	Self	Felix Prometheus `kube-controller` metrics
Managers, workers	TCP 9099	Self	Calico health check
Managers, workers	TCP 10248	Self	Kubelet health check
Managers, workers	TCP 10250	Internal	Kubelet
Managers, workers	TCP 12376	Internal	TLS authentication proxy that provides access to MCR
Managers, workers	TCP 12378	Self	etcd reverse proxy
Managers	TCP 12379	Internal	etcd Control API
Managers	TCP 12380	Internal	etcd Peer API
Managers	TCP 12381	Internal	MKE cluster certificate authority
Managers	TCP 12382	Internal	MKE client certificate authority
Managers	TCP 12383	Internal	Authentication storage backend
Managers	TCP 12384	Internal	Authentication storage backend for replication across managers
Managers	TCP 12385	Internal	Authentication service API
Managers	TCP 12386	Internal	Authentication worker
Managers	TCP 12387	Internal	Prometheus server
Managers	TCP 12388	Internal	Kubernetes API server
Managers, workers	TCP 12389	Self	Hardware Discovery API
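Once the firewall rules are in place, you may want to confirm from another host that a given TCP port actually accepts connections. The following Python sketch is a minimal probe using the standard library; the function name `tcp_port_open` is hypothetical. It only applies to TCP ports that have a listening service, so it cannot verify the UDP ports in the table, and Self-scoped ports can only be probed from the host itself.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `tcp_port_open("manager01.example.com", 6443)` from a worker node (the host name is a placeholder) indicates whether the Kubernetes API server endpoint on that manager is reachable.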

Cluster and service networking options¶

MKE supports the following cluster and service networking options:

Kube-proxy with iptables proxier, and either the managed CNI or an unmanaged alternative
Kube-proxy with ipvs proxier, and either the managed CNI or an unmanaged alternative
eBPF mode with either the managed CNI or an unmanaged alternative

You can configure cluster and service networking options at install time or in existing clusters. For detail on reconfiguring existing clusters, refer to Configure cluster and service networking in an existing cluster in the MKE Operations Guide.

Caution

Swarm workloads that require the use of encrypted overlay networks must use iptables proxier with either the managed CNI or an unmanaged alternative. Be aware that the other networking options detailed here automatically disable Docker Swarm encrypted overlay networks.

Mirantis partner integrations¶
Solution component	Develop and maintain	Test and integrate with MKE	First line support	Product support
Calico Open Source	Community	Mirantis	Mirantis	Tigera for Linux, Mirantis for Windows
Calico Enterprise	Tigera	Tigera, for every major MKE release	Mirantis	Tigera, with customers paying for additional features
Cilium Open Source	Community	Planned	Mirantis	Community or Isovalent
Cilium Enterprise	Isovalent	Isovalent	Mirantis	Isovalent

To enable kube-proxy with iptables proxier while using the managed CNI:

Using the default option, kube-proxy with iptables proxier, is equivalent to specifying --kube-proxy-mode=iptables at install time. To verify that the option is operational, confirm the presence of the following line in the ucp-kube-proxy container logs:

I1027 05:35:27.798469        1 server_others.go:212] Using iptables Proxier.

To enable kube-proxy with ipvs proxier while using the managed CNI:

Prior to MKE installation, verify that the following kernel modules are available on all Linux manager and worker nodes:
- ip_vs
- ip_vs_rr
- ip_vs_wrr
- ip_vs_sh
- nf_conntrack_ipv4
Specify --kube-proxy-mode=ipvs at install time.
Optional. Once installation is complete, configure the following ipvs-related parameters in the MKE configuration file (otherwise, MKE will use the Kubernetes default parameter settings):
- ipvs_exclude_cidrs = ""
- ipvs_min_sync_period = ""
- ipvs_scheduler = ""
- ipvs_strict_arp = false
- ipvs_sync_period = ""
- ipvs_tcp_timeout = ""
- ipvs_tcpfin_timeout = ""
- ipvs_udp_timeout = ""
For more information on using these parameters, refer to kube-proxy in the Kubernetes documentation.

Note

The ipvs-related parameters have no install time counterparts and therefore must only be configured once MKE installation is complete.

Verify that kube-proxy with ipvs proxier is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:

I1027 05:14:50.868486     1 server_others.go:274] Using ipvs Proxier.
W1027 05:14:50.868822     1 proxier.go:445] IPVS scheduler not specified, use rr by default

To enable eBPF mode while using the managed CNI:

Verify that the prerequisites for eBPF use have been met, including kernel compatibility, for all Linux manager and worker nodes. Refer to the Calico documentation Enable the eBPF dataplane for more information.
Specify --calico-ebpf-enabled at install time.
Verify that eBPF mode is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:
```
KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED true
"Sleeping forever...."
```

To enable kube-proxy with iptables proxier while using an unmanaged CNI:

Specify --unmanaged-cni at install time.
Verify that kube-proxy with iptables proxier is operational by confirming the presence of the following line in the ucp-kube-proxy container logs:
```
I1027 05:35:27.798469     1 server_others.go:212] Using iptables Proxier.
```

To enable kube-proxy with ipvs proxier while using an unmanaged CNI:

Specify the following parameters at install time:
- --unmanaged-cni
- --kube-proxy-mode=ipvs

Verify that kube-proxy with ipvs proxier is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:

I1027 05:14:50.868486     1 server_others.go:274] Using ipvs Proxier.
W1027 05:14:50.868822     1 proxier.go:445] IPVS scheduler not specified, use rr by default

To enable eBPF mode while using an unmanaged CNI:

Verify that the prerequisites for eBPF use have been met, including kernel compatibility, for all Linux manager and worker nodes. Refer to the Calico documentation Enable the eBPF dataplane for more information.
Specify the following parameters at install time:
- --unmanaged-cni
- --kube-proxy-mode=disabled
- --kube-default-drop-masq-bits
Verify that eBPF mode is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:
```
KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED true
"Sleeping forever...."
```
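The log checks in the procedures above all look for a small set of marker strings, so they are easy to script. The following Python sketch classifies log text that you have already captured from the ucp-kube-proxy container; the function name `detect_proxy_mode` is hypothetical, and the markers are copied from the log excerpts in this section.

```python
def detect_proxy_mode(log_text):
    """Return 'ipvs', 'iptables', 'disabled', or None based on known log markers."""
    if "Using ipvs Proxier" in log_text:
        return "ipvs"
    if "Using iptables Proxier" in log_text:
        return "iptables"
    # In eBPF mode the kube-proxy mode is reported as disabled.
    if "KUBE_PROXY_MODE (disabled)" in log_text:
        return "disabled"
    return None
```

A return value of `None` means that none of the markers was found, which can also happen if the log output was truncated before the relevant line.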

Calico networking¶

Calico is the default networking plugin for MKE. The default Calico encapsulation setting for MKE is VXLAN; however, the plugin also supports IP-in-IP encapsulation. Refer to the Calico documentation on Overlay networking for more information.

Important

NetworkManager can impair the Calico agent routing function. To resolve this issue, you must create a file called /etc/NetworkManager/conf.d/calico.conf with the following content:

[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:wireguard.cali

Enable ESP traffic¶

For overlay networks with encryption to function, you must allow IP protocol 50 Encapsulating Security Payload (ESP) traffic.

If you are running RHEL 8.x, Rocky Linux 8.x, or CentOS 8, install kernel module xt_u32:

sudo dnf install kernel-modules-extra

Avoid firewall conflicts¶

Avoid firewall conflicts in the following Linux distributions:

Linux distribution

Procedure

SUSE Linux Enterprise Server 12 SP2

Installations have the FW_LO_NOTRACK flag turned on by default in the openSUSE firewall. It speeds up packet processing on the loopback interface but breaks certain firewall setups that redirect outgoing packets via custom rules on the local machine.

To turn off the FW_LO_NOTRACK option:

In /etc/sysconfig/SuSEfirewall2, set FW_LO_NOTRACK="no".
Either restart the firewall or reboot the system.

SUSE Linux Enterprise Server 12 SP3

No change is required, as installations have the FW_LO_NOTRACK flag turned off by default.

DNS entry in hosts file¶

MKE adds the proxy.local DNS entry to the following files at install time:

Linux: /etc/hosts
Windows: c:\Windows\System32\Drivers\etc\hosts

To configure MCR to connect to the Internet using HTTP_PROXY you must set the value of proxy.local to NOPROXY.

Preconfigure an SLES installation¶

Before performing SUSE Linux Enterprise Server (SLES) installations, consider the following prerequisite steps:

For SLES 15 installations, disable CLOUD_NETCONFIG_MANAGE prior to installing MKE:
1. Set CLOUD_NETCONFIG_MANAGE="no" in the /etc/sysconfig/network/ifcfg-eth0 network interface configuration file.
2. Run the service network restart command.
By default, SLES disables connection tracking. To allow Kubernetes controllers in Calico to reach the Kubernetes API server, enable connection tracking on the loopback interface for SLES by running the following commands for each node in the cluster:
```
sudo mkdir -p /etc/sysconfig/SuSEfirewall2.d/defaults
echo FW_LO_NOTRACK=no | sudo tee \
/etc/sysconfig/SuSEfirewall2.d/defaults/99-docker.cfg
sudo SuSEfirewall2 start
```

Verify the timeout settings¶

Confirm that MKE components have the time they require to effectively communicate.

Default timeout settings¶
Component	Timeout (ms)	Configurable
Raft consensus between manager nodes	3000	no
Gossip protocol for overlay networking	5000	no
etcd	500	yes
RethinkDB	10000	no
Stand-alone cluster	90000	no

Network lag of more than two seconds between MKE manager nodes can cause problems in your MKE cluster. For example, such a lag can indicate to MKE components that the other nodes are down, triggering unnecessary leadership elections that cause temporary outages and reduced performance. To resolve the issue, decrease the latency of the MKE node communication network.

Configure time synchronization¶

Configure all containers in an MKE cluster to regularly synchronize with a Network Time Protocol (NTP) server, to ensure consistency between all containers in the cluster and to circumvent unexpected behavior that can lead to poor performance.

Install NTP on every machine in your cluster:

Ubuntu

sudo apt-get update && sudo apt-get install ntp ntpdate

CentOS/RHEL

sudo yum install ntp ntpdate
sudo systemctl start ntpd
sudo systemctl enable ntpd
sudo systemctl status ntpd
sudo ntpdate -u -s 0.centos.pool.ntp.org
sudo systemctl restart ntpd

SUSE

sudo zypper ref && sudo zypper install ntp

In addition to installing NTP, the command sequence starts ntpd, a daemon that periodically syncs the machine clock to a central server.

Sync the machine clocks:
```
sudo ntpdate pool.ntp.org
```

Verify that the time of each machine is in sync with the NTP servers:

sudo ntpq -p

Example output, which illustrates how much the machine clock is out of sync with the NTP servers:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 45.35.50.61     139.78.97.128    2 u   24   64    1   60.391  4623378   0.004
 time-a.timefreq .ACTS.           1 u   23   64    1   51.849  4623377   0.004
 helium.constant 128.59.0.245     2 u   22   64    1   71.946  4623379   0.004
 tock.usshc.com  .GPS.            1 u   21   64    1   59.576  4623379   0.004
 golem.canonical 17.253.34.253    2 u   20   64    1  145.356  4623378   0.004
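To compare offsets across many machines, you can extract the offset column from the `ntpq -p` output programmatically. The following Python sketch assumes the ten-column layout shown above, with the unmodified output passed in so that the first character of each peer line is the tally code column; the function name `parse_ntpq_offsets` is hypothetical. `ntpq` reports the offset in milliseconds.

```python
def parse_ntpq_offsets(output):
    """Map each remote peer to its clock offset in milliseconds."""
    offsets = {}
    for line in output.splitlines():
        # Drop the one-character tally code column, then split the rest.
        fields = line[1:].split()
        if len(fields) != 10:
            continue  # separator line or unexpected layout
        try:
            offsets[fields[0]] = float(fields[8])
        except ValueError:
            continue  # header line, where the offset column holds text
    return offsets
```

A large absolute offset on every peer, as in the example output, suggests that the local clock is far out of sync rather than that a single NTP server is wrong.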

Configure a load balancer¶

Though MKE does not include a load balancer, you can configure your own to balance user requests across all manager nodes. Before that, decide whether you will add nodes to the load balancer using their IP address or their fully qualified domain name (FQDN), and then use that strategy consistently throughout the cluster. Take note of all IP addresses or FQDNs before you start the installation.

If you plan to deploy both MKE and MSR, your load balancer must be able to differentiate between the two: either by IP address or port number. Because both MKE and MSR use port 443 by default, your options are as follows:

Configure your load balancer to expose either MKE or MSR on a port other than 443.
Configure your load balancer to listen on port 443 with separate virtual IP addresses for MKE and MSR.
Configure separate load balancers for MKE and MSR, both listening on port 443.

If you want to install MKE in a high-availability configuration with a load balancer in front of your MKE controllers, include the appropriate IP address and FQDN for the load balancer VIP. To do so, use one or more --san flags either with the ucp install command or in interactive mode when MKE requests additional SANs.

Configure IPVS¶

MKE supports setting values for all IPVS-related parameters that kube-proxy exposes.

Kube-proxy runs on each cluster node and load-balances traffic destined for services (through cluster IPs and node ports) to the correct backend pods. Of the modes in which kube-proxy can run, IPVS (IP Virtual Server) offers the widest choice of load balancing algorithms and superior scalability.

Refer to the Calico documentation, Comparing kube-proxy modes: iptables or IPVS? for detailed information on IPVS.

Caution

You can only enable IPVS for MKE at installation, and it persists throughout the life of the cluster. Thus, you cannot switch to iptables at a later stage or switch over existing MKE clusters to use IPVS proxier.

For full parameter details, refer to the Kubernetes documentation for kube-proxy.

Use the kube-proxy-mode parameter at install time to enable IPVS proxier. The two valid values are iptables (default) and ipvs.

You can specify the following ipvs parameters for kube-proxy:

ipvs_exclude_cidrs
ipvs_min_sync_period
ipvs_scheduler
ipvs_strict_arp = false
ipvs_sync_period
ipvs_tcp_timeout
ipvs_tcpfin_timeout
ipvs_udp_timeout

To set these values at the time of bootstrap/installation:

Add the required values under [cluster_config] in a TOML file (for example, config.toml).
Create a config named com.docker.ucp.config from this TOML file:
```
docker config create com.docker.ucp.config config.toml
```
Use the --existing-config parameter when installing MKE. You can also change these values post-install using the MKE ucp/config-toml endpoint.
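The TOML file used in the first step can also be generated by a script. The following Python sketch renders a `[cluster_config]` table from a dictionary; the helper name `render_cluster_config` is hypothetical, and the two values shown are illustrative rather than recommended settings. String values are written in quotes and booleans as `true` or `false`, matching the parameter forms listed earlier in this guide.

```python
def render_cluster_config(params):
    """Render IPVS parameters as a [cluster_config] TOML table."""
    lines = ["[cluster_config]"]
    for key, value in params.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        else:
            rendered = '"' + str(value) + '"'
        lines.append(f"  {key} = {rendered}")
    return "\n".join(lines) + "\n"


# Illustrative values only.
print(render_cluster_config({"ipvs_scheduler": "rr", "ipvs_strict_arp": False}))
```

You can write the returned text to config.toml and then create the com.docker.ucp.config object from it as described in the steps above.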

Caution

If you are using MKE 3.3.x with IPVS proxier and plan to upgrade to MKE 3.4.x, you must upgrade to MKE 3.4.3 or later as earlier versions of MKE 3.4.x do not support IPVS proxier.

Use an External Certificate Authority¶

You can customize MKE to use certificates signed by an External Certificate Authority (ECA). When using your own certificates, include a certificate bundle with the following:

ca.pem file with the root CA public certificate.
cert.pem file with the server certificate and any intermediate CA public certificates. This certificate should also have Subject Alternative Names (SANs) for all addresses used to reach the MKE manager.
key.pem file with a server private key.

You can either use separate certificates for every manager node or one certificate for all managers. If you use separate certificates, you must use a common SAN throughout. For example, MKE permits the following on a three-node cluster:

node1.company.example.org with the SAN mke.company.org
node2.company.example.org with the SAN mke.company.org
node3.company.example.org with the SAN mke.company.org
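If you use separate certificates, a quick consistency check is to confirm that at least one SAN appears in every certificate. The following Python sketch assumes that you have already extracted the SAN lists from the certificates by other means; the function name `common_sans` is hypothetical.

```python
def common_sans(cert_sans):
    """Return the SANs shared by every certificate.

    cert_sans maps a node name to the list of SANs in that node's certificate.
    """
    san_sets = [set(sans) for sans in cert_sans.values()]
    if not san_sets:
        return set()
    return set.intersection(*san_sets)


shared = common_sans({
    "node1.company.example.org": ["mke.company.org"],
    "node2.company.example.org": ["mke.company.org"],
    "node3.company.example.org": ["mke.company.org"],
})
print(shared)
```

An empty result indicates that the certificates do not share a common SAN.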

If you use a single certificate for all manager nodes, MKE automatically copies the certificate files both to new manager nodes and to those promoted to a manager role.

Customize named volumes¶

Note

Skip this step if you want to use the default named volumes.

MKE uses named volumes to persist data. If you want to customize the drivers that manage such volumes, create the volumes before installing MKE. During the installation process, the installer will automatically detect the existing volumes and start using them. Otherwise, MKE will create the default named volumes.

Configure kernel parameters¶

MKE uses a number of kernel parameters in its deployment.

Note

The MKE parameter values are not set by MKE, but by either MCR or an upstream component.

kernel.<subtree>
net.bridge.bridge-nf-<subtree>
net.fan.<subtree>
net.ipv4.<subtree>

net.netfilter.nf_conntrack_<subtree>
net.nf_conntrack_<subtree>
vm.overcommit_<subtree>

kernel.<subtree>¶

Parameter	Values	Description
`panic`	Default: Distribution dependent MKE: `1`	Sets the number of seconds the kernel waits to reboot following a panic. Note The `kernel.panic` parameter is not modified when the `kube_protect_kernel_defaults` parameter is enabled.
`panic_on_oops`	Default: Distribution dependent MKE: `1`	Sets whether the kernel should panic on an oops rather than continuing to attempt operations. Note The `kernel.panic_on_oops` parameter is not modified when the `kube_protect_kernel_defaults` parameter is enabled.
`keys.root_maxkeys`	Default: `1000000` MKE: `1000000`	Sets the maximum number of keys that the root user (`UID 0` in the root user namespace) can own. Note The `kernel.keys.root_maxkeys` parameter is not modified when the `kube_protect_kernel_defaults` parameter is enabled.
`keys.root_maxbytes`	Default: `25000000` MKE: `25000000`	Sets the maximum number of bytes of data that the root user (`UID 0` in the root user namespace) can hold in the payloads of the keys owned by root. Allocate 25 bytes per key multiplied by the value of `kernel.keys.root_maxkeys`. Note The `kernel.keys.root_maxbytes` parameter is not modified when the `kube_protect_kernel_defaults` parameter is enabled.
`pty.nr`	Default: Dependent on number of logins. Not user-configurable. MKE: `1`	Sets the number of open PTYs.
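The relationship between `keys.root_maxkeys` and `keys.root_maxbytes` in the table above is simple arithmetic, shown here as a short Python sketch. The constant and function names are hypothetical.

```python
BYTES_PER_KEY = 25  # allocation per key, as stated in the table


def root_maxbytes_for(root_maxkeys):
    """Compute the root_maxbytes value that corresponds to root_maxkeys."""
    return BYTES_PER_KEY * root_maxkeys


print(root_maxbytes_for(1000000))
```

With the `root_maxkeys` value of `1000000`, the result matches the `25000000` value listed for `root_maxbytes`.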

net.bridge.bridge-nf-<subtree>¶

Parameter	Values	Description
`call-arptables`	Default: No default MKE: `1`	Sets whether `arptables` rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.
`call-ip6tables`	Default: No default MKE: `1`	Sets whether `ip6tables` rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.
`call-iptables`	Default: No default MKE: `1`	Sets whether `iptables` rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.
`filter-pppoe-tagged`	Default: No default MKE: `0`	Sets whether netfilter rules apply to bridged PPPOE network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.
`filter-vlan-tagged`	Default: No default MKE: `0`	Sets whether netfilter rules apply to bridged VLAN network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.
`pass-vlan-input-dev`	Default: No default MKE: `0`	Sets whether netfilter strips the incoming VLAN interface name from bridged traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.
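
Because these keys exist only while the module that provides them is loaded, it can help to check for their presence before reading them. The following is a minimal sketch, assuming the keys are exposed under `/proc/sys/net/bridge`. The `bridge_nf_report` function and the `PROC_SYS` override are hypothetical helpers for illustration and are not part of MKE.

```shell
# Hypothetical helper: print the bridge netfilter call-* keys, or report that
# they are absent. PROC_SYS can be overridden to point at a test directory.
bridge_nf_report() {
  local dir="${PROC_SYS:-/proc/sys}/net/bridge" key
  if [ ! -d "$dir" ]; then
    echo "net.bridge keys are absent (module not loaded)"
    return 1
  fi
  for key in call-arptables call-ip6tables call-iptables; do
    # Print only the keys that are present and readable.
    if [ -r "$dir/bridge-nf-$key" ]; then
      echo "bridge-nf-$key=$(cat "$dir/bridge-nf-$key")"
    fi
  done
}
```

Calling `bridge_nf_report` on a node returns a non-zero status when the directory is missing, which makes it usable as a precondition in a provisioning script.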

net.fan.<subtree>¶

Parameter	Values	Description
`vxlan`	Default: No default MKE: `4`	Sets the version of the VXLAN module on older kernels, not present on kernel version 5.x. If the VXLAN module is not loaded this key is not present.

net.ipv4.<subtree>¶

Note

The *.vs.* default values persist, changing only when the ipvs kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter	Values	Description
`conf.all.accept_redirects`	Default: `1` MKE: `0`	Sets whether ICMP redirects are permitted. This key affects all interfaces.
`conf.all.forwarding`	Default: `0` MKE: `1`	Sets whether network traffic is forwarded. This key affects all interfaces.
`conf.all.route_localnet`	Default: `0` MKE: `1`	Sets `127/8` for local routing. This key affects all interfaces.
`conf.default.forwarding`	Default: `0` MKE: `1`	Sets whether network traffic is forwarded. This key affects new interfaces.
`conf.lo.forwarding`	Default: `0` MKE: `1`	Sets forwarding for localhost traffic.
`ip_forward`	Default: `0` MKE: `1`	Sets whether traffic forwards between interfaces. For Kubernetes to run, this parameter must be set to `1`.
`vs.am_droprate`	Default: `10` MKE: `10`	Sets the always mode drop rate used in mode 3 of the `drop_rate` defense.
`vs.amemthresh`	Default: `1024` MKE: `1024`	Sets the available memory threshold in pages, which is used in the automatic modes of defense. When there is not enough available memory, this enables the strategy and the variable is set to `2`. Otherwise, the strategy is disabled and the variable is set to `1`.
`vs.backup_only`	Default: `0` MKE: `0`	Sets whether the director function is disabled while the server is in back-up mode, to avoid packet loops for DR/TUN methods.
`vs.cache_bypass`	Default: `0` MKE: `0`	Sets whether packets forward directly to the original destination when no cache server is available and the destination address is not local (`iph->daddr is RTN_UNICAST`). This mostly applies to transparent web cache clusters.
`vs.conn_reuse_mode`	Default: `1` MKE: `1`	Sets how IPVS handles connections detected on port reuse. It is a bitmap with the following values: `0` disables any special handling on port reuse. The new connection is delivered to the same real server that was servicing the previous connection, effectively disabling `expire_nodest_conn`. `bit 1` enables rescheduling of new connections when it is safe. That is, whenever `expire_nodest_conn` and for TCP sockets, when the connection is in `TIME_WAIT` state (which is only possible if you use NAT mode). `bit 2` is `bit 1` plus, for TCP connections, when connections are in `FIN_WAIT` state, as this is the last state seen by load balancer in Direct Routing mode. This bit helps when adding new real servers to a very busy cluster.
`vs.conntrack`	Default: `0` MKE: `0`	Sets whether connection-tracking entries are maintained for connections handled by IPVS. Enable if connections handled by IPVS are to be subject to stateful firewall rules. That is, `iptables` rules that make use of connection tracking. Otherwise, disable this setting to optimize performance. Connections handled by the IPVS FTP application module have connection tracking entries regardless of this setting, which is only available when IPVS is compiled with `CONFIG_IP_VS_NFCT` enabled.
`vs.drop_entry`	Default: `0` MKE: `0`	Sets whether entries are randomly dropped in the connection hash table, to collect memory back for new connections. In the current code, the `drop_entry` procedure can be activated every second, then it randomly scans 1/32 of the whole and drops entries that are in the SYN-RECV/SYNACK state, which should be effective against syn-flooding attack. The valid values of `drop_entry` are 0 to 3, where 0 indicates that the strategy is always disabled, 1 and 2 indicate automatic modes (when there is not enough available memory, the strategy is enabled and the variable is automatically set to 2, otherwise the strategy is disabled and the variable is set to 1), and 3 indicates that the strategy is always enabled.
`vs.drop_packet`	Default: `0` MKE: `0`	Sets whether 1/rate packets are dropped prior to being forwarded to real servers. A rate of 1 drops all incoming packets. The value definition is the same as that for `drop_entry`. In automatic mode, the following formula determines the rate: rate = amemthresh / (amemthresh - available_memory) when available memory is less than the available memory threshold. When mode 3 is set, the always mode drop rate is controlled by `/proc/sys/net/ipv4/vs/am_droprate`.
`vs.expire_nodest_conn`	Default: `0` MKE: `0`	Sets whether the load balancer silently drops packets when its destination server is not available. This can be useful when the user-space monitoring program deletes the destination server (due to server overload or wrong detection) and later adds the server back, and the connections to the server can continue. If this feature is enabled, the load balancer terminates the connection immediately whenever a packet arrives and its destination server is not available, after which the client program will be notified that the connection is closed. This is equivalent to the feature that is sometimes required to flush connections when the destination is not available.
`vs.ignore_tunneled`	Default: `0` MKE: `0`	Sets whether IPVS configures the `ipvs_property` on all packets of unrecognized protocols. This prevents users from routing such tunneled protocols as IPIP, which is useful in preventing the rescheduling of packets that have been tunneled to the IPVS host (that is, to prevent IPVS routing loops when IPVS is also acting as a real server).
`vs.nat_icmp_send`	Default: `0` MKE: `0`	Sets whether ICMP error messages (`ICMP_DEST_UNREACH`) are sent for VS/NAT when the load balancer receives packets from real servers but the connection entries do not exist.
`vs.pmtu_disc`	Default: `0` MKE: `0`	Sets whether all DF packets that exceed the PMTU are rejected with `FRAG_NEEDED`, irrespective of the forwarding method. For the TUN method, the flag can be disabled to fragment such packets.
`vs.schedule_icmp`	Default: `0` MKE: `0`	Sets whether scheduling ICMP packets in IPVS is enabled.
`vs.secure_tcp`	Default: `0` MKE: `0`	Sets the use of a more complicated TCP state transition table. For VS/NAT, the `secure_tcp` defense delays entering the `TCP ESTABLISHED` state until the three-way handshake completes. The value definition is the same as that of `drop_entry` and `drop_packet`.
`vs.sloppy_sctp`	Default: `0` MKE: `0`	Sets whether IPVS is permitted to create a connection state on any packet, rather than an SCTP INIT only.
`vs.sloppy_tcp`	Default: `0` MKE: `0`	Sets whether IPVS is permitted to create a connection state on any packet, rather than a TCP SYN only.
`vs.snat_reroute`	Default: `0` MKE: `1`	Sets whether the route of SNATed packets is recalculated from real servers as if they originate from the director. If disabled, SNATed packets are routed as if they have been forwarded by the director. If policy routing is in effect, then it is possible that the route of a packet originating from a director is routed differently to a packet being forwarded by the director. If policy routing is not in effect, then the recalculated route will always be the same as the original route. It is an optimization to disable snat_reroute and avoid the recalculation.
`vs.sync_persist_mode`	Default: `0` MKE: `0`	Sets the synchronization of connections when using persistence. The possible values are defined as follows: `0` means all types of connections are synchronized. `1` attempts to reduce the synchronization traffic depending on the connection type. For persistent services, avoid synchronization for normal connections, do it only for persistence templates. In such case, for TCP and SCTP it may need enabling `sloppy_tcp` and `sloppy_sctp` flags on back-up servers. For non-persistent services such optimization is not applied, mode `0` is assumed.
`vs.sync_ports`	Default: `1` MKE: `1`	Sets the number of threads that the master and back-up servers can use for sync traffic. Every thread uses a single UDP port, thread 0 uses the default port `8848`, and the last thread uses port `8848+sync_ports-1`.
`vs.sync_qlen_max`	Default: Calculated MKE: Calculated	Sets a hard limit for queued sync messages that are not yet sent. It defaults to 1/32 of the memory pages but actually represents the number of messages. It protects against allocating large parts of memory when the sending rate is lower than the queuing rate.
`vs.sync_refresh_period`	Default: `0` MKE: `0`	Sets (in seconds) the difference in the reported connection timer that triggers new sync messages. It can be used to avoid sync messages for the specified period (or half of the connection timeout if it is lower) if the connection state has not changed since last sync. This is useful for normal connections with high traffic, to reduce sync rate. Additionally, retry `sync_retries` times with period of `sync_refresh_period/8`.
`vs.sync_retries`	Default: `0` MKE: `0`	Sets sync retries with period of `sync_refresh_period/8`. Useful to protect against loss of sync messages. The range of the `sync_retries` is 0 to 3.
`vs.sync_sock_size`	Default: `0` MKE: `0`	Sets the configuration of SNDBUF (master) or RCVBUF (slave) socket limit. Default value is 0 (preserve system defaults).
`vs.sync_threshold`	Default: `3 50` MKE: `3 50`	Sets the synchronization threshold, which is the minimum number of incoming packets that a connection must receive before the connection is synchronized. A connection will be synchronized every time the number of its incoming packets modulus `sync_period` equals the threshold. The range of the threshold is 0 to `sync_period`. When `sync_period` and `sync_refresh_period` are 0, send sync only for state changes or only once when the packet count matches `sync_threshold`.
`vs.sync_version`	Default: `1` MKE: `1`	Sets the version of the synchronization protocol to use when sending synchronization messages. The possible values are: `0` selects the original synchronization protocol (version 0). This should be used when sending synchronization messages to a legacy system that only understands the original synchronization protocol. `1` selects the current synchronization protocol (version 1). This should be used whenever possible. Kernels with this `sync_version` entry are able to receive messages of both version 1 and version 2 of the synchronization protocol.

net.netfilter.nf_conntrack_<subtree>¶

Note

The net.netfilter.nf_conntrack_<subtree> default values persist, changing only when the nf_conntrack kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter	Values	Description
`acct`	Default: `0` MKE: `0`	Sets whether connection-tracking flow accounting is enabled. Adds 64-bit byte and packet counter per flow.
`buckets`	Default: Calculated MKE: Calculated	Sets the size of the hash table. If not specified during module loading, the default size is calculated by dividing total memory by 16384 to determine the number of buckets. The hash table will never have fewer than 1024 and never more than 262144 buckets. This sysctl is only writeable in the initial net namespace.
`checksum`	Default: `0` MKE: `0`	Sets whether the checksum of incoming packets is verified. Packets with bad checksums are in an invalid state. If this is enabled, such packets are not considered for connection tracking.
`dccp_loose`	Default: `0` MKE: `1`	Sets whether picking up already established connections for Datagram Congestion Control Protocol (DCCP) is permitted.
`dccp_timeout_closereq`	Default: Distribution dependent MKE: `64`	The parameter description is not yet available in the Linux kernel documentation.
`dccp_timeout_closing`	Default: Distribution dependent MKE: `64`	The parameter description is not yet available in the Linux kernel documentation.
`dccp_timeout_open`	Default: Distribution dependent MKE: `43200`	The parameter description is not yet available in the Linux kernel documentation.
`dccp_timeout_partopen`	Default: Distribution dependent MKE: `480`	The parameter description is not yet available in the Linux kernel documentation.
`dccp_timeout_request`	Default: Distribution dependent MKE: `240`	The parameter description is not yet available in the Linux kernel documentation.
`dccp_timeout_respond`	Default: Distribution dependent MKE: `480`	The parameter description is not yet available in the Linux kernel documentation.
`dccp_timeout_timewait`	Default: Distribution dependent MKE: `240`	The parameter description is not yet available in the Linux kernel documentation.
`events`	Default: `0` MKE: `1`	Sets whether the connection tracking code provides userspace with connection-tracking events through ctnetlink.
`expect_max`	Default: Calculated MKE: `1024`	Sets the maximum size of the expectation table. The default value is nf_conntrack_buckets / 256. The minimum is 1.
`frag6_high_thresh`	Default: Calculated MKE: `4194304`	Sets the maximum memory used to reassemble IPv6 fragments. When `nf_conntrack_frag6_high_thresh` bytes of memory is allocated for this purpose, the fragment handler tosses packets until `nf_conntrack_frag6_low_thresh` is reached. The size of this parameter is calculated based on system memory.
`frag6_low_thresh`	Default: Calculated MKE: `3145728`	See `nf_conntrack_frag6_high_thresh`. The size of this parameter is calculated based on system memory.
`frag6_timeout`	Default: `60` MKE: `60`	Sets the time to keep an IPv6 fragment in memory.
`generic_timeout`	Default: `600` MKE: `600`	Sets the default for a generic timeout. This refers to layer 4 unknown and unsupported protocols.
`gre_timeout`	Default: `30` MKE: `30`	Sets the GRE timeout from the conntrack table.
`gre_timeout_stream`	Default: `180` MKE: `180`	Sets the GRE timeout for streamed connections. This extended timeout is used when a GRE stream is detected.
`helper`	Default: `0` MKE: `0`	Sets whether the automatic conntrack helper assignment is enabled. If disabled, you must set up `iptables` rules to assign helpers to connections. See the CT target description in the `iptables-extensions(8)` man page for more information.
`icmp_timeout`	Default: `30` MKE: `30`	Sets the default for ICMP timeout.
`icmpv6_timeout`	Default: `30` MKE: `30`	Sets the default for ICMP6 timeout.
`log_invalid`	Default: `0` MKE: `0`	Sets whether invalid packets of a type specified by value are logged.
`max`	Default: Calculated MKE: `131072`	Sets the maximum number of allowed connection tracking entries. This value is set to `nf_conntrack_buckets` by default. Connection-tracking entries are added to the table twice, once for the original direction and once for the reply direction (that is, with the reversed address). Thus, with default settings a maxed-out table will have an average hash chain length of 2, not 1.
`sctp_timeout_closed`	Default: Distribution dependent MKE: `10`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_cookie_echoed`	Default: Distribution dependent MKE: `3`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_cookie_wait`	Default: Distribution dependent MKE: `3`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_established`	Default: Distribution dependent MKE: `432000`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_heartbeat_acked`	Default: Distribution dependent MKE: `210`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_heartbeat_sent`	Default: Distribution dependent MKE: `30`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_shutdown_ack_sent`	Default: Distribution dependent MKE: `3`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_shutdown_recd`	Default: Distribution dependent MKE: `0`	The parameter description is not yet available in the Linux kernel documentation.
`sctp_timeout_shutdown_sent`	Default: Distribution dependent MKE: `0`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_be_liberal`	Default: `0` MKE: `0`	Sets whether only out of window RST segments are marked as `INVALID`.
`tcp_loose`	Default: `0` MKE: `1`	Sets whether already established connections are picked up.
`tcp_max_retrans`	Default: `3` MKE: `3`	Sets the maximum number of packets that can be retransmitted without receiving an acceptable ACK from the destination. If this number is reached, a shorter timer is started.
`tcp_timeout_close`	Default: Distribution dependent MKE: `10`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_close_wait`	Default: Distribution dependent MKE: `3600`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_fin_wait`	Default: Distribution dependent MKE: `120`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_last_ack`	Default: Distribution dependent MKE: `30`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_max_retrans`	Default: Distribution dependent MKE: `300`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_syn_recv`	Default: Distribution dependent MKE: `60`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_syn_sent`	Default: Distribution dependent MKE: `120`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_time_wait`	Default: Distribution dependent MKE: `120`	The parameter description is not yet available in the Linux kernel documentation.
`tcp_timeout_unacknowledged`	Default: Distribution dependent MKE: `30`	The parameter description is not yet available in the Linux kernel documentation.
`timestamp`	Default: `0` MKE: `0`	Sets whether connection-tracking flow timestamping is enabled.
`udp_timeout`	Default: `30` MKE: `30`	Sets the UDP timeout.
`udp_timeout_stream`	Default: `120` MKE: `120`	Sets the extended timeout that is used whenever a UDP stream is detected.

net.nf_conntrack_<subtree>¶

Note

The net.nf_conntrack_<subtree> default values persist, changing only when the nf_conntrack kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter	Values	Description
`max`	Default: Calculated MKE: `131072`	Sets the maximum number of connections to track. The size of this parameter is calculated based on system memory.

vm.overcommit_<subtree>¶

Parameter	Values	Description
`memory`	Default: Distribution dependent MKE: `1`	Sets whether the kernel permits memory overcommitment from `malloc()` calls. Note The `vm.overcommit_memory` parameter is not modified when the `kube_protect_kernel_defaults` parameter is enabled.

vm.panic_<subtree>¶

Parameter	Values	Description
`on_oom`	Default: `0` MKE: `0`	Sets whether the kernel should panic on an out-of-memory condition, rather than continuing to attempt operations. When set to `0` the kernel invokes the oom_killer, which kills the rogue processes and thus preserves the system. Note The `vm.panic_on_oom` parameter is not modified when the `kube_protect_kernel_defaults` parameter is enabled.

Set up kernel default protections¶

To protect kernel parameters from being overridden by kubelet, you can either pass the --kube-protect-kernel-defaults option when you install MKE or, after installation, adjust the cluster_config | kube_protect_kernel_defaults parameter in the MKE configuration file.

Important

When enabled, kubelet can fail to start if the kernel parameters on the nodes are not properly set. You must set those kernel parameters on the nodes before you install MKE or before adding a new node to an existing cluster.

Create a configuration file called /etc/sysctl.d/90-kubelet.conf and add the following snippet to it:

vm.panic_on_oom=0
vm.overcommit_memory=1
kernel.panic=10
kernel.panic_on_oops=1
kernel.keys.root_maxkeys=1000000
kernel.keys.root_maxbytes=25000000

Run sysctl -p /etc/sysctl.d/90-kubelet.conf
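
As a rough way to confirm that the settings took effect on a node, the following sketch compares the live kernel values with the contents of the configuration file. The `check_kernel_defaults` function and the `SYSCTL_GET` override are hypothetical helpers for illustration and are not part of MKE. The sketch assumes one `key=value` pair per line with no surrounding spaces, as in the snippet above.

```shell
# Hypothetical helper: compare live kernel parameters with a key=value file.
# SYSCTL_GET can be overridden for a dry run; it defaults to "sysctl -n".
check_kernel_defaults() {
  local conf="$1" getter="${SYSCTL_GET:-sysctl -n}" status=0 key expected actual
  while IFS='=' read -r key expected || [ -n "$key" ]; do
    # Skip blank lines and comments.
    case "$key" in ''|'#'*) continue ;; esac
    actual="$($getter "$key" 2>/dev/null)"
    if [ "$actual" != "$expected" ]; then
      echo "MISMATCH $key: expected $expected, found ${actual:-<unset>}"
      status=1
    fi
  done < "$conf"
  return "$status"
}

# Example:
#   check_kernel_defaults /etc/sysctl.d/90-kubelet.conf && echo "kernel defaults OK"
```

A non-zero exit status indicates at least one parameter that differs from the file, which is the condition under which kubelet can fail to start when the protection is enabled.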

Install the MKE image¶

To install MKE:

Log in to the target host using Secure Shell (SSH).
Pull the latest version of MKE:
```
docker image pull mirantis/ucp:3.6.16
```
Install MKE:
```
docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 install \
--host-address <node-ip-address> \
--interactive
```
The ucp install command runs in interactive mode, prompting you for the necessary configuration values. For more information about the ucp install command, including how to install MKE on a system with SELinux enabled, refer to the MKE Operations Guide: mirantis/ucp install.

Note

MKE installs Project Calico for Kubernetes container-to-container communication. However, you may install an alternative CNI plugin, such as Cilium, Weave, or Flannel. For more information, refer to the MKE Operations Guide: Installing an unmanaged CNI plugin.

Obtain the license¶

After you install the MKE image, proceed with downloading your MKE license as described below. This section also contains steps to apply your new license using the MKE web UI.

Warning

Users are not authorized to run MKE without a valid license. For more information, refer to Mirantis Agreements and Terms.

To download your MKE license:

Open an email from Mirantis Support with the subject Welcome to Mirantis’ CloudCare Portal and follow the instructions for logging in.

If you did not receive the CloudCare Portal email, it is likely that you have not yet been added as a Designated Contact. To remedy this, contact your Designated Administrator.
In the top navigation bar, click Environments.
Click the Cloud Name associated with the license you want to download.
Scroll down to License Information and click the License File URL. A new tab opens in your browser.
Click View file to download your license file.

To update your license settings in the MKE web UI:

Log in to your MKE instance using an administrator account.
In the left navigation, click Settings.
On the General tab, click Apply new license. A file browser dialog displays.
Navigate to where you saved the license key (.lic) file, select it, and click Open. MKE automatically updates with the new settings.

Note

Though MKE is generally a subscription-only service, Mirantis offers a free trial license by request. Use our contact form to request a free trial license.

Install MKE on AWS¶

This section describes how to customize your MKE installation on AWS. It is for those deploying Kubernetes workloads while leveraging the AWS Kubernetes cloud provider, which provides dynamic volume and load balancer provisioning.

Note

You may skip this topic if you plan to install MKE on AWS with no customizations or if you will only deploy Docker Swarm workloads. Refer to Install the MKE image for the appropriate installation instructions.

Prerequisites¶

Complete the following prerequisites prior to installing MKE on AWS.

Log in to the AWS Management Console.
Assign a host name to your instance. To determine the host name, run the following curl command within the EC2 instance:
```
curl http://169.254.169.254/latest/meta-data/hostname
```
Tag your instance, VPC, security groups, and subnets by specifying kubernetes.io/cluster/<unique-cluster-id> in the Key field and <cluster-type> in the Value field. Possible <cluster-type> values are as follows:
- owned, if the cluster owns and manages the resources that it creates
- shared, if the cluster shares its resources between multiple clusters
For example, Key: kubernetes.io/cluster/1729543642a6 and Value: owned.
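
If you prefer to tag the resources from the command line rather than in the console, the tag can be sketched as follows. The `cluster_tag` function is a hypothetical helper that only builds the tag string. The commented `aws ec2 create-tags` call assumes that the AWS CLI is installed and configured, and the resource IDs shown are placeholders.

```shell
# Hypothetical helper: build the --tags argument for the cluster tag.
cluster_tag() {
  # $1: unique cluster ID, $2: cluster type (owned or shared)
  case "$2" in
    owned|shared) ;;
    *) echo "cluster type must be 'owned' or 'shared'" >&2; return 1 ;;
  esac
  printf 'Key=kubernetes.io/cluster/%s,Value=%s\n' "$1" "$2"
}

# Example invocation (placeholders must be replaced with real resource IDs):
#   aws ec2 create-tags \
#     --resources <instance-id> <vpc-id> <subnet-id> <security-group-id> \
#     --tags "$(cluster_tag 1729543642a6 owned)"
```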

To enable introspection and resource provisioning, specify an instance profile with appropriate policies for manager nodes. The following is an example of a very permissive instance profile:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [ "ec2:*" ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Action": [ "elasticloadbalancing:*" ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Action": [ "route53:*" ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [ "arn:aws:s3:::kubernetes-*" ]
    }
  ]
}

To enable access to dynamically provisioned resources, specify an instance profile with appropriate policies for worker nodes. The following is an example of a very permissive instance profile:

{
  "Version": "2012-10-17",
  "Statement": [{
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::kubernetes-*"]
    },
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:AttachVolume",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:DetachVolume",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["route53:*"],
      "Resource": ["*"]
    }
  ]
}

Install MKE¶

After you perform the steps described in Prerequisites, run the following command to install MKE on a manager node. Replace <ucp-ip> with the private IP address of the manager node.

Warning

If your cluster includes Kubernetes Windows worker nodes, you must omit the --cloud-provider aws flag from the following command, as its inclusion causes the Kubernetes Windows worker nodes never to enter a healthy state.

docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 install \
--host-address <ucp-ip> \
--cloud-provider aws \
--interactive

Install MKE on Azure¶

Mirantis Kubernetes Engine (MKE) closely integrates with Microsoft Azure for its Kubernetes Networking and Persistent Storage feature set. MKE deploys the Calico CNI provider. In Azure, the Calico CNI leverages the Azure networking infrastructure for data path networking and the Azure IPAM for IP address management.

Prerequisites¶

To avoid significant issues during the installation process, you must meet the following infrastructure prerequisites to successfully deploy MKE on Azure.

Deploy all MKE nodes (managers and workers) into the same Azure resource group. You can deploy the Azure networking components (virtual network, subnets, security groups) in a second Azure resource group.
Size the Azure virtual network and subnet appropriately for your environment, because addresses from this pool will be consumed by Kubernetes Pods.
Attach all MKE worker and manager nodes to the same Azure subnet.
Set internal IP addresses for all nodes to Static rather than the Dynamic default.
Match the Azure virtual machine object name to the Azure virtual machine computer name and the node operating system hostname that is the FQDN of the host (including domain names). All characters in the names must be in lowercase.
Ensure the presence of an Azure Service Principal with Contributor access to the Azure resource group hosting the MKE nodes. Kubernetes uses this Service Principal to communicate with the Azure API. The Service Principal ID and Secret Key are MKE prerequisites.

If you are using a separate resource group for the networking components, the same Service Principal must have Network Contributor access to this resource group.
Ensure that the NSG is open between all IP addresses on the Azure subnet that is passed into MKE during installation. Kubernetes Pods integrate into the underlying Azure networking stack, from an IPAM and routing perspective, with the Azure CNI IPAM module. As such, Azure network security groups (NSG) impact pod-to-pod communication. End users may expose containerized services on a range of underlying ports, which would otherwise require manually opening an NSG port every time a new containerized service deploys on the platform. This affects only workloads that deploy on the Kubernetes orchestrator.

To limit exposure, restrict the use of the Azure subnet to container host VMs and Kubernetes Pods. Additionally, you can leverage Kubernetes Network Policies to provide micro segmentation for containerized applications and services.

The MKE installation requires the following information:

subscriptionId: Azure Subscription ID in which to deploy the MKE objects
tenantId: Azure Active Directory Tenant ID in which to deploy the MKE objects
aadClientId: Azure Service Principal ID
aadClientSecret: Azure Service Principal Secret Key
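
One way to obtain these values is to create the Service Principal with the Azure CLI. The following is a sketch under stated assumptions: `resource_group_scope` is a hypothetical helper that only formats the scope string, and the commented `az ad sp create-for-rbac` call assumes an installed Azure CLI with a logged-in session. All names and IDs are placeholders.

```shell
# Hypothetical helper: build the scope of the resource group that hosts the MKE nodes.
resource_group_scope() {
  # $1: subscription ID, $2: resource group name
  printf '/subscriptions/%s/resourceGroups/%s\n' "$1" "$2"
}

# Example invocation (placeholders must be replaced):
#   az ad sp create-for-rbac \
#     --name <service-principal-name> \
#     --role Contributor \
#     --scopes "$(resource_group_scope <subscription-id> <resource-group>)"
#
# In the command output, appId corresponds to aadClientId, password to
# aadClientSecret, and tenant to tenantId.
```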

Networking¶

MKE configures the Azure IPAM module for Kubernetes so that it can allocate IP addresses for Kubernetes Pods. Per Azure IPAM module requirements, the configuration of each Azure VM that is part of the Kubernetes cluster must include a pool of IP addresses.

You can use automatic or manual IP provisioning for the Kubernetes cluster on Azure.

Automatic provisioning
Allows for IP pool configuration and maintenance for standalone Azure virtual machines (VMs). This service runs within the calico-node daemonset and provisions 128 IP addresses for each node by default.

Note

If you are using a VXLAN data plane, MKE automatically uses Calico IPAM. It is not necessary to do anything specific for Azure IPAM.

New MKE installations use Calico VXLAN as the default data plane (the MKE configuration calico_vxlan is set to true). MKE does not use Calico VXLAN if the MKE version is lower than 3.3.0 or if you upgrade MKE from lower than 3.3.0 to 3.3.0 or higher.
Manual provisioning
Manual provisioning of additional IP addresses for each Azure VM can be done through the Azure Portal, the Azure CLI az network nic ip-config create, or an ARM template.
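
For the Azure CLI route, the repeated calls can be sketched as a small generator. The `print_ipconfig_commands` function and the `ipconfig-pod-<n>` naming scheme are hypothetical. The function only prints the commands so that you can review them first, and depending on your environment you may need further options, such as the virtual network and subnet.

```shell
# Hypothetical helper: print (not run) one az command per additional IP configuration.
print_ipconfig_commands() {
  # $1: resource group, $2: NIC name, $3: number of additional IP configurations
  local i=1
  while [ "$i" -le "$3" ]; do
    echo "az network nic ip-config create --resource-group $1 --nic-name $2 --name ipconfig-pod-$i"
    i=$((i + 1))
  done
}

# Review the output, then pipe it to a shell to apply it, for example:
#   print_ipconfig_commands <resource-group> <nic-name> 32 | sh
```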

Azure configuration file¶

For MKE to integrate with Microsoft Azure, the azure.json configuration file must be identical across all manager and worker nodes in your cluster. For Linux nodes, place the file in /etc/kubernetes on each host. For Windows nodes, place the file in C:\k on each host. Because root owns the configuration file, set its permissions to 0644 to ensure that the container user has read access.

The following is an example template for azure.json.

{
    "cloud":"AzurePublicCloud",
    "tenantId": "<parameter_value>",
    "subscriptionId": "<parameter_value>",
    "aadClientId": "<parameter_value>",
    "aadClientSecret": "<parameter_value>",
    "resourceGroup": "<parameter_value>",
    "location": "<parameter_value>",
    "subnetName": "<parameter_value>",
    "securityGroupName": "<parameter_value>",
    "vnetName": "<parameter_value>",
    "useInstanceMetadata": true
}

Optional parameters are available for Azure deployments:

primaryAvailabilitySetName: Worker nodes availability set
vnetResourceGroup: Virtual network resource group if your Azure network objects live in a separate resource group
routeTableName: Applicable if you have defined multiple route tables within an Azure subnet
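
Because the file must be identical and complete on every node, it can help to check it before distribution. The following is a minimal Python sketch that reports keys from the example template that are absent or still hold the placeholder value. Treating every key in the template as mandatory is an assumption of this sketch, and the missing_keys helper is hypothetical, not part of MKE.

```python
import json

# Keys that appear in the example azure.json template. Treating all of them
# as mandatory is an assumption of this sketch, not a documented rule.
EXPECTED_KEYS = [
    "cloud", "tenantId", "subscriptionId", "aadClientId", "aadClientSecret",
    "resourceGroup", "location", "subnetName", "securityGroupName",
    "vnetName", "useInstanceMetadata",
]


def missing_keys(config_text):
    """Return the expected keys that are absent or still set to a placeholder."""
    config = json.loads(config_text)
    missing = []
    for key in EXPECTED_KEYS:
        value = config.get(key)
        if value is None or value == "<parameter_value>":
            missing.append(key)
    return missing


# A deliberately incomplete configuration, used only for illustration.
sample = '{"cloud": "AzurePublicCloud", "tenantId": "<parameter_value>", "useInstanceMetadata": true}'
print(missing_keys(sample))
```

An empty list means that every key from the template has been given a value.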

Guidelines for IPAM configuration¶

Warning

To avoid significant issues during the installation process, follow these guidelines to either use an appropriately sized network in Azure or take the necessary actions to fit within the subnet.

Configure the subnet and the virtual network associated with the primary interface of the Azure VMs with an adequate address prefix/range. The number of required IP addresses depends on the workload and the number of nodes in the cluster.

For example, for a cluster of 256 nodes, make sure that the address space of the subnet and the virtual network can allocate at least 128 * 256 = 32768 IP addresses, in order to run a maximum of 128 pods concurrently on each node. This is in addition to the initial IP allocations to VM network interface cards (NICs) during Azure resource creation.

Accounting for the allocation of IP addresses to NICs that occur during VM bring-up, set the address space of the subnet and virtual network to 10.0.0.0/16. This ensures that the network can dynamically allocate at least 32768 addresses, plus a buffer for initial allocations for primary IP addresses.
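
The sizing arithmetic above can be sketched in Python with the standard ipaddress module. The function names are hypothetical, and the check is intentionally simplified: it counts one primary address per node and ignores the addresses that Azure reserves in each subnet.

```python
import ipaddress


def required_pod_addresses(node_count, ips_per_node=128):
    """Addresses needed for pods: one pool of ips_per_node for every node."""
    return node_count * ips_per_node


def subnet_can_hold(cidr, node_count, ips_per_node=128):
    """Check that the subnet holds the pod pools plus one primary IP per node."""
    subnet = ipaddress.ip_network(cidr)
    needed = required_pod_addresses(node_count, ips_per_node) + node_count
    return subnet.num_addresses >= needed


print(required_pod_addresses(256))           # 256 nodes * 128 addresses each
print(subnet_can_hold("10.0.0.0/16", 256))   # a /16 contains 65536 addresses
print(subnet_can_hold("10.0.0.0/18", 256))   # a /18 contains 16384 addresses
```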

Note

The Azure IPAM module queries the metadata of an Azure VM to obtain a list of the IP addresses that are assigned to the VM NICs. The IPAM module allocates these IP addresses to Kubernetes pods. You configure the IP addresses as ipConfigurations in the NICs associated with a VM or scale set member, so that Azure IPAM can provide the addresses to Kubernetes on request.

Manually provision IP address pools as part of an Azure VM scale set¶

Configure IP pools for each member of the VM scale set during provisioning by associating multiple ipConfigurations with the scale set’s networkInterfaceConfigurations.

The following example networkProfile configuration for an ARM template configures pools of 32 IP addresses for each VM in the VM scale set.

"networkProfile": {
  "networkInterfaceConfigurations": [
    {
      "name": "[variables('nicName')]",
      "properties": {
        "ipConfigurations": [
          {
            "name": "[variables('ipConfigName1')]",
            "properties": {
              "primary": "true",
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              },
              "loadBalancerBackendAddressPools": [
                {
                  "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/backendAddressPools/', variables('bePoolName'))]"
                }
              ],
              "loadBalancerInboundNatPools": [
                {
                  "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/inboundNatPools/', variables('natPoolName'))]"
                }
              ]
            }
          },
          {
            "name": "[variables('ipConfigName2')]",
            "properties": {
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              }
            }
          }
          .
          .
          .
          {
            "name": "[variables('ipConfigName32')]",
            "properties": {
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              }
            }
          }
        ],
        "primary": "true"
      }
    }
  ]
}
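
Writing 32 near-identical ipConfigurations entries by hand is error-prone, so one option is to generate them. The following Python sketch builds such a list under stated assumptions: the entry names (ipconfig1 through ipconfig32) and the subnetRef variable are hypothetical, and the load balancer settings of the primary entry in the example above are omitted for brevity.

```python
import json


def build_ip_configurations(count, subnet_id):
    """Build `count` ipConfigurations entries; only the first one is primary."""
    configs = []
    for index in range(1, count + 1):
        properties = {"subnet": {"id": subnet_id}}
        if index == 1:
            properties["primary"] = True
        configs.append({"name": f"ipconfig{index}", "properties": properties})
    return configs


# Hypothetical ARM expression; substitute the subnet reference from your template.
SUBNET_ID = "[variables('subnetRef')]"

configs = build_ip_configurations(32, SUBNET_ID)
print(len(configs))
print(json.dumps(configs[1], indent=2))
```

The resulting list can be serialized with json.dumps and pasted into the ipConfigurations array of the template.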

Adjust the IP count value¶

During an MKE installation, you can alter the number of Azure IP addresses that MKE automatically provisions for pods.

By default, MKE provisions 128 addresses for each VM in the cluster, from the same Azure subnet as the hosts. If, however, you have manually attached additional IP addresses to the VMs (by way of an ARM template, the Azure CLI, or the Azure Portal) or you are deploying into a small Azure subnet (smaller than /16), you can use the --azure-ip-count flag at install time.

Note

Do not set the --azure-ip-count flag to a value of less than 6 if you have not manually provisioned additional IP addresses for each VM. The MKE installation needs at least 6 IP addresses to allocate to the core MKE components that run as Kubernetes pods (in addition to the VM’s private IP address).

The following example scenarios require that you set the --azure-ip-count flag.

Scenario 1: Manually provisioned addresses

If you have manually provisioned additional IP addresses for each VM and want to prevent MKE from dynamically provisioning more IP addresses, you must pass --azure-ip-count 0 into the MKE installation command.

Scenario 2: Reducing the number of provisioned addresses

Pass --azure-ip-count <custom_value> into the MKE installation command to reduce the number of IP addresses dynamically allocated from 128 to a custom value due to:

Primary use of the Swarm Orchestrator
Deployment of MKE on a small Azure subnet (for example, /24)
Plans to run a small number of Kubernetes pods on each node

To adjust this value post-installation, refer to the instructions on how to download the MKE configuration file, change the value, and update the configuration via the API.

Note

If you reduce the value post-installation, existing VMs will not reconcile and you will need to manually edit the IP count in Azure.
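
For Scenario 2, a rough upper bound for the custom value can be estimated from the subnet size and the planned number of VMs. The following Python sketch is an approximation under stated assumptions: it reserves one primary address per VM, subtracts the five addresses that Azure reserves in every subnet, and assumes that nothing else consumes addresses in the subnet. The max_ip_count helper is hypothetical.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5   # Azure reserves five addresses in every subnet
MIN_IP_COUNT = 6                # minimum stated in the note on --azure-ip-count


def max_ip_count(cidr, node_count):
    """Largest per-VM IP count that fits, leaving one primary IP per VM."""
    usable = ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET
    return (usable - node_count) // node_count


count = max_ip_count("10.0.1.0/24", 10)
print(count, count >= MIN_IP_COUNT)
```

If the result is lower than 6, the subnet is too small for the planned number of VMs unless you provision the addresses manually.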

Run the following command to install MKE on a manager node.

docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.6.16 install \
  --host-address <ucp-ip> \
  --pod-cidr <ip-address-range> \
  --cloud-provider Azure \
  --interactive

The --pod-cidr option maps to the IP address range that you configured for the Azure subnet.

Note

The pod-cidr range must match the Azure virtual network’s subnet attached to the hosts. For example, if the Azure virtual network had the range 172.0.0.0/16 with VMs provisioned on an Azure subnet of 172.0.1.0/24, then the Pod CIDR should also be 172.0.1.0/24.

This requirement applies only when MKE does not use the VXLAN data plane. If MKE uses the VXLAN data plane, the pod-cidr range must differ from the node IP subnet.

The --host-address option maps to the private IP address of the manager node.
The --azure-ip-count option adjusts the number of IP addresses provisioned to each VM.
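
The pod-cidr rule in the note above depends on the data plane, which makes it easy to get wrong. The following Python sketch expresses the rule as a check. The function name is hypothetical, and for the VXLAN case the sketch is stricter than the documented requirement: it rejects any overlap between the two ranges, not only identical ranges.

```python
import ipaddress


def pod_cidr_is_valid(pod_cidr, node_subnet, vxlan):
    """Apply the pod-cidr rule for the chosen data plane."""
    pods = ipaddress.ip_network(pod_cidr)
    nodes = ipaddress.ip_network(node_subnet)
    if vxlan:
        # Stricter than "different": overlapping ranges are rejected as well.
        return not pods.overlaps(nodes)
    # Without VXLAN, the pod CIDR must match the subnet attached to the hosts.
    return pods == nodes


print(pod_cidr_is_valid("172.0.1.0/24", "172.0.1.0/24", vxlan=False))
print(pod_cidr_is_valid("192.168.0.0/16", "172.0.1.0/24", vxlan=True))
```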

Azure custom roles¶

You can create your own Azure custom roles for use with MKE. You can assign these roles to users, groups, and service principals at management group (in preview only), subscription, and resource group scopes.

Deploy an MKE cluster into a single resource group¶

A resource group is a container that holds resources for an Azure solution. These resources are the virtual machines (VMs), networks, and storage accounts that are associated with the swarm.

To create a custom all-in-one role with permissions to deploy an MKE cluster into a single resource group:

Create the role permissions JSON file.

For example:

{
  "Name": "Docker Platform All-in-One",
  "IsCustom": true,
  "Description": "Can install and manage Docker platform.",
  "Actions": [
    "Microsoft.Authorization/*/read",
    "Microsoft.Authorization/roleAssignments/write",
    "Microsoft.Compute/availabilitySets/read",
    "Microsoft.Compute/availabilitySets/write",
    "Microsoft.Compute/disks/read",
    "Microsoft.Compute/disks/write",
    "Microsoft.Compute/virtualMachines/extensions/read",
    "Microsoft.Compute/virtualMachines/extensions/write",
    "Microsoft.Compute/virtualMachines/read",
    "Microsoft.Compute/virtualMachines/write",
    "Microsoft.Network/loadBalancers/read",
    "Microsoft.Network/loadBalancers/write",
    "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
    "Microsoft.Network/networkInterfaces/read",
    "Microsoft.Network/networkInterfaces/write",
    "Microsoft.Network/networkInterfaces/join/action",
    "Microsoft.Network/networkSecurityGroups/read",
    "Microsoft.Network/networkSecurityGroups/write",
    "Microsoft.Network/networkSecurityGroups/join/action",
    "Microsoft.Network/networkSecurityGroups/securityRules/read",
    "Microsoft.Network/networkSecurityGroups/securityRules/write",
    "Microsoft.Network/publicIPAddresses/read",
    "Microsoft.Network/publicIPAddresses/write",
    "Microsoft.Network/publicIPAddresses/join/action",
    "Microsoft.Network/virtualNetworks/read",
    "Microsoft.Network/virtualNetworks/write",
    "Microsoft.Network/virtualNetworks/subnets/read",
    "Microsoft.Network/virtualNetworks/subnets/write",
    "Microsoft.Network/virtualNetworks/subnets/join/action",
    "Microsoft.Resources/subscriptions/resourcegroups/read",
    "Microsoft.Resources/subscriptions/resourcegroups/write",
    "Microsoft.Security/advancedThreatProtectionSettings/read",
    "Microsoft.Security/advancedThreatProtectionSettings/write",
    "Microsoft.Storage/*/read",
    "Microsoft.Storage/storageAccounts/listKeys/action",
    "Microsoft.Storage/storageAccounts/write"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
  ]
}

Create the Azure RBAC role.

az role definition create --role-definition all-in-one-role.json

Deploy MKE compute resources¶

Compute resources act as servers for running containers.

To create a custom role to deploy MKE compute resources only:

Create the role permissions JSON file.

For example:

{
  "Name": "Docker Platform",
  "IsCustom": true,
  "Description": "Can install and run Docker platform.",
  "Actions": [
    "Microsoft.Authorization/*/read",
    "Microsoft.Authorization/roleAssignments/write",
    "Microsoft.Compute/availabilitySets/read",
    "Microsoft.Compute/availabilitySets/write",
    "Microsoft.Compute/disks/read",
    "Microsoft.Compute/disks/write",
    "Microsoft.Compute/virtualMachines/extensions/read",
    "Microsoft.Compute/virtualMachines/extensions/write",
    "Microsoft.Compute/virtualMachines/read",
    "Microsoft.Compute/virtualMachines/write",
    "Microsoft.Network/loadBalancers/read",
    "Microsoft.Network/loadBalancers/write",
    "Microsoft.Network/networkInterfaces/read",
    "Microsoft.Network/networkInterfaces/write",
    "Microsoft.Network/networkInterfaces/join/action",
    "Microsoft.Network/publicIPAddresses/read",
    "Microsoft.Network/virtualNetworks/read",
    "Microsoft.Network/virtualNetworks/subnets/read",
    "Microsoft.Network/virtualNetworks/subnets/join/action",
    "Microsoft.Resources/subscriptions/resourcegroups/read",
    "Microsoft.Resources/subscriptions/resourcegroups/write",
    "Microsoft.Security/advancedThreatProtectionSettings/read",
    "Microsoft.Security/advancedThreatProtectionSettings/write",
    "Microsoft.Storage/storageAccounts/read",
    "Microsoft.Storage/storageAccounts/listKeys/action",
    "Microsoft.Storage/storageAccounts/write"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
  ]
}

Create the Docker Platform RBAC role.

az role definition create --role-definition platform-role.json

Deploy MKE network resources¶

Network resources are services inside your cluster. These resources can include virtual networks, security groups, address pools, and gateways.

To create a custom role to deploy MKE network resources only:

Create the role permissions JSON file.

For example:

{
  "Name": "Docker Networking",
  "IsCustom": true,
  "Description": "Can install and manage Docker platform networking.",
  "Actions": [
    "Microsoft.Authorization/*/read",
    "Microsoft.Network/loadBalancers/read",
    "Microsoft.Network/loadBalancers/write",
    "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
    "Microsoft.Network/networkInterfaces/read",
    "Microsoft.Network/networkInterfaces/write",
    "Microsoft.Network/networkInterfaces/join/action",
    "Microsoft.Network/networkSecurityGroups/read",
    "Microsoft.Network/networkSecurityGroups/write",
    "Microsoft.Network/networkSecurityGroups/join/action",
    "Microsoft.Network/networkSecurityGroups/securityRules/read",
    "Microsoft.Network/networkSecurityGroups/securityRules/write",
    "Microsoft.Network/publicIPAddresses/read",
    "Microsoft.Network/publicIPAddresses/write",
    "Microsoft.Network/publicIPAddresses/join/action",
    "Microsoft.Network/virtualNetworks/read",
    "Microsoft.Network/virtualNetworks/write",
    "Microsoft.Network/virtualNetworks/subnets/read",
    "Microsoft.Network/virtualNetworks/subnets/write",
    "Microsoft.Network/virtualNetworks/subnets/join/action",
    "Microsoft.Resources/subscriptions/resourcegroups/read",
    "Microsoft.Resources/subscriptions/resourcegroups/write"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
  ]
}

Create the Docker Networking RBAC role.

az role definition create --role-definition networking-role.json

See also

Compatibility Matrix

Install MKE on Google Cloud Platform¶

MKE supports installation and operation on Google Cloud Platform (GCP). This section describes how to prepare your system for MKE installation on GCP, how to perform the installation, and the limitations of GCP support in MKE.

To learn how to deploy MKE on GCP using Launchpad, see Bootstrapping MKE cluster on GCP.

Prerequisites¶

Verify the following prerequisites before you install MKE on GCP:

MTU (maximum transmission unit) is set to at least 1500 on the VPC where you want to create your instances. For more information, refer to Google Cloud official documentation: Change the MTU setting of a VPC network.
All MKE instances have the necessary authorization for managing cloud resources.

GCP defines authorization through the use of service accounts, roles, and access scopes. For information on how to best configure the authorization required for your MKE instances, refer to Google Cloud official documentation: Service accounts.

An example of a permissible role for a service account is roles/owner, and an example of an access scope that provides access to most Google services is https://www.googleapis.com/auth/cloud-platform. As a best practice, define a broad access scope such as this to an instance and then restrict access using roles.

Refer to Google Identity official documentation: OAuth 2.0 Scopes for Google APIs for a list of available scopes, and to Google Cloud official documentation: Understanding roles for a list of available roles.
All of your MKE instances include the same prefix.
Each instance is tagged with the prefix of its associated instance names. For example, if the instance names are testcluster-m1 and testcluster-m2, tag the associated instances with testcluster.
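
The naming prerequisite can be checked with a few lines of Python before installation. This is only a sketch: the names_match_tag helper is hypothetical, and it assumes that the prefix is separated from the rest of the instance name by a hyphen, as in the example names above.

```python
def names_match_tag(instance_names, tag):
    """True when every instance name starts with the tag followed by a hyphen."""
    return all(name.startswith(tag + "-") for name in instance_names)


print(names_match_tag(["testcluster-m1", "testcluster-m2"], "testcluster"))
```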

Install MKE¶

To install MKE on GCP, run the following command:

docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 install \
--host-address <ucp-ip> \
--cloud-provider gce \
--interactive

Note

Do not use the --cloud-provider gce flag if you do not require cloud provider integration.

Google Cloud Platform support limitations¶

Be aware of the following limitations in the MKE support for GCP:

[MKE-8914] Windows Server Core with Containers images incompatible with GCP¶

The use of Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP¶

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

Create a new VPC and set the MTU value to 1500.
Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

Install MKE offline¶

To install MKE on an offline host, you must first use a separate computer with an Internet connection to download a single package with all the images and then copy that package to the host where you will install MKE. Once the package is on the host and loaded, you can install MKE offline as described in Install the MKE image.

Note

During the offline installation, both manager and worker nodes must be offline.

To install MKE offline:

Download the required MKE package.

Use one of the following links:

3.6.16 Linux	3.6.16 Windows Server 2019 LTSC	3.6.16 Windows Server 2022 LTSC
3.6.15 Linux	3.6.15 Windows Server 2019 LTSC	3.6.15 Windows Server 2022 LTSC
3.6.14 Linux	3.6.14 Windows Server 2019 LTSC	3.6.14 Windows Server 2022 LTSC
3.6.13 Linux	3.6.13 Windows Server 2019 LTSC	3.6.13 Windows Server 2022 LTSC
3.6.12 Linux	3.6.12 Windows Server 2019 LTSC	3.6.12 Windows Server 2022 LTSC
3.6.11 Linux	3.6.11 Windows Server 2019 LTSC	3.6.11 Windows Server 2022 LTSC
3.6.9 Linux	3.6.9 Windows Server 2019 LTSC	3.6.9 Windows Server 2022 LTSC
3.6.8 Linux	3.6.8 Windows Server 2019 LTSC	3.6.8 Windows Server 2022 LTSC
3.6.7 Linux	3.6.7 Windows Server 2019 LTSC	3.6.7 Windows Server 2022 LTSC
3.6.6 Linux	3.6.6 Windows Server 2019 LTSC	3.6.6 Windows Server 2022 LTSC
3.6.5 Linux	3.6.5 Windows Server 2019 LTSC	3.6.5 Windows Server 2022 LTSC
3.6.4 Linux	3.6.4 Windows Server 2019 LTSC	3.6.4 Windows Server 2022 LTSC
3.6.3 Linux	3.6.3 Windows Server 2019 LTSC	3.6.3 Windows Server 2022 LTSC
3.6.2 Linux	3.6.2 Windows Server 2019 LTSC	3.6.2 Windows Server 2022 LTSC
3.6.1 Linux	3.6.1 Windows Server 2019 LTSC	3.6.1 Windows Server 2022 LTSC
3.6.0 Linux	3.6.0 Windows Server 2019 LTSC	3.6.0 Windows Server 2022 LTSC

Use the following command with the package URL:
```
wget <mke-package-url> -O ucp.tar.gz
```

Copy the MKE package to the host machine:
```
scp ucp.tar.gz <user>@<host>:
```
Use SSH to log in to the host where you transferred the package.
Load the MKE images from the .tar.gz file:
```
docker load -i ucp.tar.gz
```
Install the MKE image.

See also

Compatibility Matrix

Uninstall MKE¶

This topic describes how to uninstall MKE from your cluster. After uninstalling MKE, your instances of MCR will continue running in swarm mode and your applications will run normally. You will not, however, be able to do the following unless you reinstall MKE:

Enforce role-based access control (RBAC) to the cluster.
Monitor and manage the cluster from a central place.
Join new nodes using docker swarm join.

Note

You cannot join new nodes to your cluster after uninstalling MKE because your cluster will be in swarm mode, and swarm mode relies on MKE to provide the CA certificates that allow nodes to communicate with each other. After the certificates expire, the nodes will not be able to communicate at all. Either reinstall MKE before the certificates expire, or disable swarm mode by running docker swarm leave --force on every node.

To uninstall MKE:

Note

If SELinux is enabled, you must temporarily disable it prior to running the uninstall-ucp command.

Log in to a manager node using SSH.
Run the uninstall-ucp command in interactive mode, which prompts you for the necessary configuration values:
```
docker container run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/log:/var/log \
  --name ucp \
  mirantis/ucp:3.6.16 uninstall-ucp --interactive
```
Note

The uninstall-ucp command completely removes MKE from every node in the cluster. You do not need to run the command from multiple nodes.

If the uninstall-ucp command fails, manually uninstall MKE.
1. On any manager node, remove the remaining MKE services:
```
docker service rm $(docker service ls -f name=ucp- -q)
```
2. On each manager node, remove the remaining MKE containers:
```
docker container rm -f $(docker container ps -a -f name=ucp- -f name=k8s_ -q)
```
3. On each manager node, remove the remaining MKE volumes:
```
docker volume rm $(docker volume ls -f name=ucp -q)
```
Note

For more information about the uninstall-ucp failure, refer to the logs in /var/log on any manager node. Be aware that you will not be able to access the logs if the volume /var/log:/var/log is not mounted while running the ucp container.
Optional. Delete the MKE configuration:
```
docker container run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/log:/var/log \
  --name ucp \
  mirantis/ucp:3.6.16 uninstall-ucp \
  --purge-config --interactive
```
MKE keeps the configuration by default in case you want to reinstall MKE later with the same configuration. For all available uninstall-ucp options, refer to mirantis/ucp uninstall-ucp.
Optional. Restore the host IP tables to their pre-MKE installation values by restarting the node.

Note

The Calico network plugin changed the host IP tables from their original values during MKE installation.

Deploy Swarm-only mode¶

Swarm-only mode is an MKE configuration that supports only Swarm orchestration. Because it lacks Kubernetes and its operational and health-check dependencies, the resulting highly stable installation is smaller than a typical mixed-orchestration MKE installation.

You can only enable or disable Swarm-only mode at the time of MKE installation. MKE preserves the Swarm-only setting through upgrades, backups, and system restoration. Installing MKE in Swarm-only mode pulls only the images required to run MKE in this configuration. Refer to Swarm-only images for more information.

Note

Installing MKE in Swarm-only mode removes all Kubernetes options from the MKE web UI.

To install MKE in Swarm-only mode:

Complete the steps and recommendations in Plan the deployment and Perform pre-deployment configuration.

Add the --swarm-only flag to the install command in Install the MKE image:

docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 install \
--host-address <node-ip-address> \
--interactive \
--swarm-only

Note

In addition, MKE includes the --swarm-only flag with the bootstrapper images command, which you can use to pull or to check the required images on manager nodes.

Caution

To restore Swarm-only clusters, invoke the ucp restore command with the --swarm-only option.

Swarm-only images¶

Installing MKE in Swarm-only mode pulls the following set of images, which is smaller than that of a typical MKE installation:

ucp-agent (ucp-agent-win on Windows)
ucp-auth-store
ucp-auth
ucp-azure-ip-allocator
ucp-cfssl
ucp-compose
ucp-containerd-shim-process (ucp-containerd-shim-process-win on Windows)
ucp-controller
ucp-dsinfo (ucp-dsinfo-win on Windows)
ucp-etcd
ucp-interlock-config
ucp-interlock-extension
ucp-interlock-proxy
ucp-interlock
ucp-metrics
ucp-sf-notifier
ucp-swarm

Prometheus¶

In Swarm-only mode, MKE runs the Prometheus server and the authenticating proxy in a single container on each manager node. Thus, unlike in conventional MKE installations, you cannot configure Prometheus server placement. Prometheus does not collect Kubernetes metrics in Swarm-only mode, and it requires an additional reserved port on manager nodes: 12387.

See also

Kubernetes official documentation

Operations Guide¶

The MKE Operations Guide provides the comprehensive information you need to run the MKE container orchestration platform. The guide is intended for anyone who needs to effectively develop and securely administer applications at scale, on private clouds, public clouds, and on bare metal.

Access an MKE cluster¶

You can access an MKE cluster in a variety of ways including through the MKE web UI, Docker CLI, and kubectl (the Kubernetes CLI). To use the Docker CLI and kubectl with MKE, first download a client certificate bundle. This topic describes the MKE web UI, how to download and configure the client bundle, and how to configure kubectl with MKE.

Access the MKE web UI¶

MKE allows you to control your cluster visually using the web UI. Role-based access control (RBAC) gives administrators and non-administrators access to the following web UI features:

Administrators:
- Manage cluster configurations.
- View and edit all cluster images, networks, volumes, and containers.
- Manage the permissions of users, teams, and organizations.
- Grant node-specific task scheduling permissions to users.
Non-administrators:
- View and edit all cluster images, networks, volumes, and containers. Requires administrator to grant access.

To access the MKE web UI:

Open a browser and navigate to https://<ip-address> (substituting <ip-address> with the IP address of the machine that ran docker run).
Enter the user name and password that you set up when installing the MKE image.

Note

To set up two-factor authentication for logging in to the MKE web UI, see Use two-factor authentication.

Download and configure the client bundle¶

Download and configure the MKE client certificate bundle to use MKE with Docker CLI and kubectl. The bundle includes:

A private and public key pair for authorizing your requests using MKE
Utility scripts for configuring Docker CLI and kubectl with your MKE deployment

Note

MKE issues different certificates for each user type:

User certificate bundles: Allow running docker commands only through MKE manager nodes.
Administrator certificate bundles: Allow running docker commands through all node types.

Download the client bundle¶

This section explains how to download the client certificate bundle using either the MKE web UI or the MKE API.

To download the client certificate bundle using the MKE web UI:

Navigate to My Profile.
Click Client Bundles > New Client Bundle.

To download the client certificate bundle using the MKE API on Linux:

Create an environment variable with the user security token:

AUTHTOKEN=$(curl -sk -d \
'{"username":"<username>","password":"<password>"}' \
https://<mke-ip>/auth/login | jq -r .auth_token)

Download the client certificate bundle:

curl -k -H "Authorization: Bearer $AUTHTOKEN" \
https://<mke-ip>/api/clientbundle -o bundle.zip
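
The same two API calls can be made from a script. The following Python sketch uses only the standard library and mirrors the curl commands above: it posts the credentials to /auth/login, reads auth_token from the response, and requests /api/clientbundle with a Bearer token. The function names are hypothetical. Like curl -k, the download helper skips certificate verification, which is acceptable only for testing.

```python
import json
import ssl
import urllib.request


def login_request(host, username, password):
    """Build the POST request for /auth/login (urllib uses POST when data is set)."""
    body = json.dumps({"username": username, "password": password}).encode()
    return urllib.request.Request(f"https://{host}/auth/login", data=body)


def bundle_request(host, token):
    """Build the GET request for /api/clientbundle with the Bearer token."""
    return urllib.request.Request(
        f"https://{host}/api/clientbundle",
        headers={"Authorization": f"Bearer {token}"},
    )


def download_bundle(host, username, password, path="bundle.zip"):
    """Log in, then save the client bundle to `path`."""
    # Equivalent of `curl -k`: do not verify the server certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    request = login_request(host, username, password)
    with urllib.request.urlopen(request, context=context) as response:
        token = json.load(response)["auth_token"]

    request = bundle_request(host, token)
    with urllib.request.urlopen(request, context=context) as response:
        with open(path, "wb") as bundle:
            bundle.write(response.read())


# Example call (requires a reachable MKE cluster):
# download_bundle("<mke-ip>", "<username>", "<password>")
```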

To download the client certificate bundle using the MKE API on Windows Server 2016:

Open an elevated PowerShell prompt.

Create an environment variable with the user security token:

$AUTHTOKEN=((Invoke-WebRequest `
-Body '{"username":"<username>","password":"<password>"}' `
-Uri https://`<mke-ip`>/auth/login `
-Method POST).Content)|ConvertFrom-Json|select auth_token `
-ExpandProperty auth_token

Download the client certificate bundle:

[io.file]::WriteAllBytes("ucp-bundle.zip", `
((Invoke-WebRequest -Uri https://`<mke-ip`>/api/clientbundle `
-Headers @{"Authorization"="Bearer $AUTHTOKEN"}).Content))

Configure the client bundle¶

This section explains how to configure the client certificate bundle to authenticate your requests with MKE using the Docker CLI and kubectl.

To configure the client certificate bundle:

Extract the client bundle .zip file into a directory, and use the appropriate utility script for your system:
- For Linux:
```
cd client-bundle && eval "$(<env.sh)"
```
- For Windows (from an elevated PowerShell prompt):
```
cd client-bundle && env.cmd
```
The utility scripts do the following:
- Update DOCKER_HOST to make the client tools communicate with your MKE deployment.
- Update DOCKER_CERT_PATH to use the certificates included in the client bundle.
- Configure kubectl with the kubectl config command.
  
  Note
  
  The kubeconfig file is named kube.yaml and is located in the unzipped client bundle directory.
Verify that your client tools communicate with MKE:
```
docker version --format '{{.Server.Version}}'
kubectl config current-context
```
The expected Docker CLI server version starts with ucp/, and the expected kubectl context name starts with ucp_.
Optional. Change your context directly using the client certificate bundle .zip files. In the directory where you downloaded the user bundle, add the new context:
```
cd client-bundle && docker context \
import myucp ucp-docker-bundle.zip
```

Note

If you use the client certificate bundle with buildkit, make sure that builds are not accidentally scheduled on manager nodes. For more information, refer to Manage services node deployment.
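
The verification step in the procedure above can also be scripted, for example in a CI job that must fail early when the client bundle is not loaded. The following Python sketch checks the two prefixes that the procedure names. The helper names and the sample values are hypothetical, and command_output assumes that docker and kubectl are on the PATH.

```python
import subprocess


def command_output(argv):
    """Run a command and return its trimmed standard output."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def points_at_mke(server_version, kube_context):
    """Check both prefixes that the verification step expects."""
    return server_version.startswith("ucp/") and kube_context.startswith("ucp_")


# Hypothetical sample values; in practice, obtain them with:
#   command_output(["docker", "version", "--format", "{{.Server.Version}}"])
#   command_output(["kubectl", "config", "current-context"])
print(points_at_mke("ucp/3.6.16", "ucp_mke.example.com:6443_admin"))
```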

Configure kubectl with MKE¶

MKE installations include Kubernetes. Users can deploy, manage, and monitor Kubernetes using either the MKE web UI or kubectl.

To install and use kubectl:

Identify which version of Kubernetes you are running by using the MKE web UI, the MKE API version endpoint, or the Docker CLI docker version command with the client bundle.

Caution

Kubernetes requires that kubectl and Kubernetes be within one minor version of each other.
Refer to Kubernetes: Install Tools to download and install the appropriate kubectl binary.
Download the client bundle.
Refer to Configure the client bundle to configure kubectl with MKE using the certificates and keys contained in the client bundle.
Optional. Install Helm, the Kubernetes package manager, and Tiller, the Helm server.

Caution

Helm requires MKE 3.1.x or higher.

To use Helm and Tiller with MKE, grant the default service account within the kube-system namespace the necessary roles:
```
kubectl create rolebinding default-view --clusterrole=view \
--serviceaccount=kube-system:default --namespace=kube-system

kubectl create clusterrolebinding add-on-cluster-admin \
--clusterrole=cluster-admin --serviceaccount=kube-system:default
```
Note

Helm recommends that you specify a Role and RoleBinding to limit the scope of Tiller to a particular namespace. Refer to the official Helm documentation for more information.
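
The version caution in the procedure above, that kubectl and Kubernetes be within one minor version of each other, can be expressed as a small check. The following Python sketch assumes version strings of the form v1.24.6 (the leading v is optional) and the same major version on both sides. The function names are hypothetical.

```python
def minor_version(version):
    """Extract the minor number from a version such as 'v1.24.6' or '1.24'."""
    parts = version.lstrip("v").split(".")
    return int(parts[1])


def within_one_minor(kubectl_version, server_version):
    """True when the two minor versions differ by at most one."""
    difference = minor_version(kubectl_version) - minor_version(server_version)
    return abs(difference) <= 1


print(within_one_minor("v1.24.6", "v1.25.0"))
print(within_one_minor("v1.22.0", "v1.24.6"))
```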

Administer an MKE cluster¶

Add labels to cluster nodes¶

With MKE, you can add labels to your nodes. Labels are metadata that describe the node, such as:

node role (development, QA, production)
node region (US, EU, APAC)
disk type (HDD, SSD)

Once you apply a label to a node, you can specify constraints when deploying a service to ensure that the service only runs on nodes that meet particular criteria.
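
To illustrate how a label constraint narrows the set of eligible nodes, the following Python sketch models the matching logic. It is a simplified model for illustration, not the Swarm scheduler itself: it handles only node.labels constraints with the == and != operators, and the node names and labels are hypothetical.

```python
PREFIX = "node.labels."


def matches(node_labels, constraint):
    """Evaluate a 'node.labels.<key> == <value>' or '!=' constraint."""
    for operator in ("==", "!="):
        if operator in constraint:
            left, right = (part.strip() for part in constraint.split(operator, 1))
            if not left.startswith(PREFIX):
                break
            equal = node_labels.get(left[len(PREFIX):]) == right
            return equal if operator == "==" else not equal
    raise ValueError(f"unsupported constraint: {constraint}")


# Hypothetical inventory: node name mapped to its labels.
nodes = {
    "node-1": {"disk": "ssd"},
    "node-2": {"disk": "hdd"},
    "node-3": {},
}

eligible = [name for name, labels in nodes.items()
            if matches(labels, "node.labels.disk == ssd")]
print(eligible)
```

Only nodes that carry the disk label with the value ssd remain eligible; unlabeled nodes are excluded in the same way as nodes with a different value.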

Hint

Use resource sets (MKE collections or Kubernetes namespaces) to organize access to your cluster, rather than creating labels for authorization and permissions to resources.

Apply labels to a node¶

The following example procedure applies the ssd label to a node.

Log in to the MKE web UI with administrator credentials.
Click Shared Resources in the navigation menu to expand the selections.
Click Nodes. The details pane will display the full list of nodes.
Click the node on the list that you want to attach labels to. The details pane will transition, presenting the Overview information for the selected node.
Click the settings icon in the upper-right corner to open the Edit Node page.
Navigate to the Labels section and click Add Label.
Add a label, entering disk into the Key field and ssd into the Value field.
Click Save to dismiss the Edit Node page and return to the node Overview.

Hint

You can use the CLI to apply a label to a node:

docker node update --label-add <key>=<value> <node-id>

Deploy a service with constraints¶

The following example procedure deploys a service with a constraint that ensures that the service only runs on nodes with SSD storage (node.labels.disk == ssd).

To deploy an application stack with service constraints:

Log in to the MKE web UI with administrator credentials.
Verify that the target node orchestrator is set to Swarm.
Click Shared Resources in the left-side navigation panel to expand the selections.
Click Stacks. The details pane will display the full list of stacks.
Click the Create Stack button to open the Create Application page.
Under 1. Configure Application, enter “wordpress” into the Name field.
Under ORCHESTRATOR NODE, select Swarm Services.

Under 2. Add Application File, paste the following stack file in the docker-compose.yml editor:

version: "3.1"

services:
  db:
    image: mysql:5.7
    deploy:
      placement:
        constraints:
          - node.labels.disk == ssd
      restart_policy:
        condition: on-failure
    networks:
      - wordpress-net
    environment:
      MYSQL_ROOT_PASSWORD: wordpress
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.disk == ssd
      restart_policy:
        condition: on-failure
        max_attempts: 3
    networks:
      - wordpress-net
    ports:
      - "8000:80"
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_PASSWORD: wordpress

networks:
  wordpress-net:

Click Create to deploy the stack.
Click Done once the stack deployment completes to return to the stacks list which now features your newly created stack.

To verify that the service tasks are deployed to the labeled node:

In the left-side navigation panel, navigate to Shared Resources > Nodes. The details pane will display the full list of nodes.
Click the node with the disk label.
In the details pane, click the Metrics tab to verify that WordPress containers are scheduled on the node.
In the left-side navigation panel, navigate to Shared Resources > Nodes.
Click any node that does not have the disk label.
In the details pane, click the Metrics tab to verify that there are no WordPress containers scheduled on the node.
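The same check can be sketched from the CLI. The function below is a hypothetical helper, not an MKE command: it reads "task node" pairs, as printed by docker stack ps with a custom format, and reports any task that runs on a node outside the list of labeled nodes that you pass in. The node names in the example are placeholders.

```shell
#!/bin/sh
# Reads "task-name node-name" pairs on stdin and prints every task that runs
# on a node other than the expected ones. Returns 0 only when all tasks are
# placed as expected.
# $1 = space-separated list of node names that carry the disk=ssd label
misplaced_tasks() {
    expected=" $1 "
    status=0
    while read -r task node; do
        [ -n "$task" ] || continue
        case "$expected" in
            *" $node "*) ;;
            *) echo "$task is on unlabeled node $node"; status=1 ;;
        esac
    done
    return "$status"
}

# Example (requires a sourced client bundle):
#   docker stack ps wordpress --filter desired-state=running \
#       --format '{{.Name}} {{.Node}}' | misplaced_tasks "node-a node-b"
```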

Add Swarm placement constraints¶

If a node is set to use Kubernetes as its orchestrator while simultaneously running Swarm services, you must deploy placement constraints to prevent those services from being scheduled on the node.

The necessary service constraints will be automatically adopted by any new MKE-created Swarm services, as well as by older Swarm services that you have updated. MKE does not automatically add placement constraints, however, to Swarm services that were created using older versions of MKE, as to do so would restart the service tasks.

To add placement constraints to older Swarm services:

Download and configure the client bundle.

Identify the Swarm services that do not have placement constraints:

services=$(docker service ls -q)
for service in $services; do
    # Report services whose constraints lack the Swarm orchestrator label.
    if docker service inspect "$service" --format '{{.Spec.TaskTemplate.Placement.Constraints}}' | grep -q -v -F 'node.labels.com.docker.ucp.orchestrator.swarm==true'; then
        name=$(docker service inspect "$service" --format '{{.Spec.Name}}')
        # Skip the MKE agent services.
        if [ "$name" = "ucp-agent" ] || [ "$name" = "ucp-agent-win" ] || [ "$name" = "ucp-agent-s390x" ]; then
            continue
        fi
        echo "Service $name (ID: $service) is missing the node.labels.com.docker.ucp.orchestrator.swarm==true placement constraint"
    fi
done

Add placement constraints to the Swarm services you identified:

Note

All service tasks will restart, thus causing some amount of service downtime.

services=$(docker service ls -q)
for service in $services; do
    # Match the constraint exactly as it is added below (note the double equals sign).
    if docker service inspect "$service" --format '{{.Spec.TaskTemplate.Placement.Constraints}}' | grep -q -v -F 'node.labels.com.docker.ucp.orchestrator.swarm==true'; then
        name=$(docker service inspect "$service" --format '{{.Spec.Name}}')
        # Skip the MKE agent services.
        if [ "$name" = "ucp-agent" ] || [ "$name" = "ucp-agent-win" ] || [ "$name" = "ucp-agent-s390x" ]; then
            continue
        fi
        echo "Updating service $name (ID: $service)"
        docker service update --detach=true --constraint-add node.labels.com.docker.ucp.orchestrator.swarm==true "$service"
    fi
done
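The test that both loops rely on can be factored into a small function, which makes it easy to try against a single service before a bulk update. This is an illustrative sketch; has_swarm_constraint is a hypothetical name, and it uses grep -F so that the dots in the label name are matched literally rather than as regular expression wildcards.

```shell
#!/bin/sh
# Succeeds if the constraint list contains the Swarm orchestrator constraint.
# $1 = output of:
#   docker service inspect <service> --format '{{.Spec.TaskTemplate.Placement.Constraints}}'
has_swarm_constraint() {
    printf '%s\n' "$1" |
        grep -qF 'node.labels.com.docker.ucp.orchestrator.swarm==true'
}

# Example (requires a sourced client bundle):
#   constraints=$(docker service inspect <service> \
#       --format '{{.Spec.TaskTemplate.Placement.Constraints}}')
#   has_swarm_constraint "$constraints" || echo "constraint is missing"
```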

Add or remove a service constraint using the MKE web UI¶

You can declare the deployment constraints in your docker-compose.yml file or when you create a stack. Also, you can apply constraints when you create a service.

To add or remove a service constraint:

Verify whether a service has deployment constraints:
1. Navigate to the Services page and select that service.
2. In the details pane, click Constraints to list the constraint labels.
Edit the constraints on the service:
1. Click Configure and select Details to open the Update Service page.
2. Click Scheduling to view the constraints.
3. Add or remove deployment constraints.

Add SANs to cluster certificates¶

A SAN (Subject Alternative Name) is a structured means for associating various values (such as domain names, IP addresses, email addresses, URIs, and so on) with a security certificate.

MKE always runs with HTTPS enabled. As such, whenever you connect to MKE, you must ensure that the MKE certificates recognize the host name in use. For example, if MKE is behind a load balancer that forwards traffic to your MKE instance, your requests will not be for the MKE host name or IP address but for the host name of the load balancer. Thus, MKE will reject the requests, unless you include the address of the load balancer as a SAN in the MKE certificates.

Note

To use your own TLS certificates, confirm first that these certificates have the correct SAN values.
To use the self-signed certificate that MKE offers out-of-the-box, you can use the --san argument to set up the SANs during MKE deployment.

To add new SANs using the MKE web UI:

Log in to the MKE web UI using administrator credentials.
Navigate to the Nodes page.
Click on a manager node to display the details pane for that node.
Click Configure and select Details.
In the SANs section, click Add SAN and enter one or more SANs for the cluster.
Click Save.
Repeat for every existing manager node in the cluster.

Note

Thereafter, the SANs are automatically applied to any new manager nodes that join the cluster.

To add new SANs using the MKE CLI:

Get the current set of SANs for the given manager node:

docker node inspect --format '{{ index .Spec.Labels "com.docker.ucp.SANs" }}' <node-id>

Example of system response:

default-cs,127.0.0.1,172.17.0.1

Append the desired SAN to the list (for example, default-cs,127.0.0.1,172.17.0.1,example.com) and run:
```
docker node update --label-add com.docker.ucp.SANs=<SANs-list> <node-id>
```
Note

<SANs-list> is the comma-separated list of SANs with your new SAN appended at the end.
Repeat the command sequence for each manager node.
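Because the label holds the full comma-separated list, it is easy to overwrite existing SANs or add a duplicate by hand. The following sketch builds the new list for you; append_san is a hypothetical helper name, and example.com stands in for your own SAN.

```shell
#!/bin/sh
# Print the SAN list with the new SAN appended, unless it is already present.
# $1 = current comma-separated SAN list, $2 = SAN to add
append_san() {
    case ",$1," in
        *",$2,"*)
            # Already in the list: return it unchanged.
            printf '%s\n' "$1"
            ;;
        *)
            if [ -n "$1" ]; then
                printf '%s,%s\n' "$1" "$2"
            else
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

# Example (requires a sourced client bundle):
#   current=$(docker node inspect \
#       --format '{{ index .Spec.Labels "com.docker.ucp.SANs" }}' <node-id>)
#   docker node update \
#       --label-add com.docker.ucp.SANs="$(append_san "$current" example.com)" <node-id>
```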

Collect MKE cluster metrics with Prometheus¶

Prometheus is an open-source systems monitoring and alerting toolkit to which you can configure MKE as a target.

Prometheus runs as a Kubernetes DaemonSet that, by default, is scheduled on every manager node. A key benefit of this is that you can set the DaemonSet to not schedule on any nodes, which effectively disables Prometheus if you do not use the MKE web interface.

Along with events and logs, metrics are data sources that provide a view into your cluster, presenting numerical data values that have a time-series component. There are several sources from which you can derive metrics, each providing different meanings for a business and its applications.

As the metrics data is stored locally on disk for each Prometheus server, it does not replicate on new managers or if you schedule Prometheus to run on a new node. The metrics are kept no longer than 24 hours.

MKE metrics types¶

MKE provides a base set of metrics that gets you into production without having to rely on external or third-party tools. Mirantis strongly encourages, though, the use of additional monitoring to provide more comprehensive visibility into your specific MKE environment.

Metrics types¶
Metric type	Description
Business	High-level aggregate metrics that typically combine technical, financial, and organizational data to create IT infrastructure information for business leaders. Examples of business metrics include: company or division-level application downtime; aggregation resource utilization; application resource demand growth.
Application	Metrics on APM tools domains (such as AppDynamics and DynaTrace) that supply information on the state or performance of the application itself. Service state; container platform; host infrastructure.
Service	Metrics on the state of services that are running on the container platform. Such metrics have very low cardinality, meaning the values are typically from a small fixed set of possibilities (commonly binary). Application health; convergence of Kubernetes deployments and Swarm services; cluster load by number of services, containers, or pods.

Note

Web UI disk usage (including free space) reflects only the MKE managed portion of the file system: `/var/lib/docker`. To monitor the total space available on each filesystem of an MKE worker or manager, deploy a third-party monitoring solution to oversee the operating system.

Metrics labels¶

The metrics that MKE exposes in Prometheus have standardized labels, depending on the target resource.

Container labels¶
Label name	Value
`collection`	The collection ID of the collection the container is in, if any.
`container`	The ID of the container.
`image`	The name of the container image.
`manager`	Set to `true` if the container node is an MKE manager.
`name`	The container name.
`podName`	The pod name, if the container is part of a Kubernetes pod.
`podNamespace`	The pod namespace, if the container is part of a Kubernetes pod namespace.
`podContainerName`	The container name in the pod spec, if the container is part of a Kubernetes pod.
`service`	The service ID, if the container is part of a Swarm service.
`stack`	The stack name, if the container is part of a Docker Compose stack.

Container networking labels¶
Label name	Value
`collection`	The collection ID of the collection the container is in, if any.
`container`	The ID of the container.
`image`	The name of the container image.
`manager`	Set to `true` if the container node is an MKE manager.
`name`	The container name.
`network`	The ID of the network.
`podName`	The pod name, if the container is part of a Kubernetes pod.
`podNamespace`	The pod namespace, if the container is part of a Kubernetes pod namespace.
`podContainerName`	The container name in the pod spec, if the container is part of a Kubernetes pod.
`service`	The service ID, if the container is part of a Swarm service.
`stack`	The stack name, if the container is part of a Docker Compose stack.

Note

The container networking labels are the same as the Container labels, with the addition of network.

Node labels¶
Label name	Value
`manager`	Set to `true` if the node is an MKE manager.
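These labels can be used as matchers when you query the metrics. The sketch below only assembles a PromQL selector string from label=value pairs; promql_selector is a hypothetical helper, and the metric and label values in the example are illustrative.

```shell
#!/bin/sh
# Build a PromQL selector such as metric{label="value",other="value"}.
# $1 = metric name, remaining arguments = label=value pairs
promql_selector() {
    metric=$1
    shift
    matchers=""
    for pair in "$@"; do
        # Split each pair at the first "=" and quote the value.
        matchers="${matchers:+$matchers,}${pair%%=*}=\"${pair#*=}\""
    done
    printf '%s{%s}\n' "$metric" "$matchers"
}

# Example:
#   promql_selector ucp_engine_container_memory_usage_bytes manager=true stack=wordpress
```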

Core MKE metrics¶

MKE exports metrics on every node and also exports additional metrics from every controller.

Node-sourced MKE metrics¶

The metrics that MKE exports from nodes are specific to those nodes (for example, the total memory on that node).

The tables below offer detail on the node-sourced metrics that MKE exposes in Prometheus with the ucp_ prefix.

ucp_engine_container_cpu_percent¶

Units	Percentage
Description	Percentage of CPU time in use by the container
Labels	Container

ucp_engine_container_cpu_total_time_nanoseconds¶

Units	Nanoseconds
Description	Total CPU time used by the container
Labels	Container

ucp_engine_container_disk_size_rootfs¶

Units	Bytes
Description	Total container disk size
Labels	Container

ucp_engine_container_health¶

Units	0.0 or 1.0
Description	The container health, according to its healthcheck. The 0 value indicates that the container is not reporting as healthy, which is likely because it either does not have a healthcheck defined or because healthcheck results have not yet been returned.
Labels	Container

ucp_engine_container_memory_max_usage_bytes¶

Units	Bytes
Description	Maximum memory in use by the container in bytes
Labels	Container

ucp_engine_container_memory_usage_bytes¶

Units	Bytes
Description	Current memory in use by the container in bytes
Labels	Container

ucp_engine_container_memory_usage_percent¶

Units	Percentage
Description	Percentage of total node memory currently in use by the container
Labels	Container

ucp_engine_container_network_rx_bytes_total¶

Units	Bytes
Description	Number of bytes received by the container over the network in the last sample
Labels	Container networking

ucp_engine_container_network_rx_dropped_packets_total¶

Units	Number of packets
Description	Number of packets bound for the container over the network that were dropped in the last sample
Labels	Container networking

ucp_engine_container_network_rx_errors_total¶

Units	Number of errors
Description	Number of received network errors for the container over the network in the last sample
Labels	Container networking

ucp_engine_container_network_rx_packets_total¶

Units	Number of packets
Description	Number of packets received by the container over the network in the last sample
Labels	Container networking

ucp_engine_container_network_tx_bytes_total¶

Units	Bytes
Description	Number of bytes sent by the container over the network in the last sample
Labels	Container networking

ucp_engine_container_network_tx_dropped_packets_total¶

Units	Number of packets
Description	Number of packets sent from the container over the network that were dropped in the last sample
Labels	Container networking

ucp_engine_container_network_tx_errors_total¶

Units	Number of errors
Description	Number of sent network errors for the container on the network in the last sample
Labels	Container networking

ucp_engine_container_network_tx_packets_total¶

Units	Number of packets
Description	Number of sent packets for the container over the network in the last sample
Labels	Container networking

ucp_engine_container_unhealth¶

Units	0.0 or 1.0
Description	Indicates whether the container is unhealthy, according to its healthcheck. The 0 value indicates that the container is not reporting as unhealthy, which is likely because it either does not have a healthcheck defined or because healthcheck results have not yet been returned.
Labels	Container

ucp_engine_containers¶

Units	Number of containers
Description	Total number of containers on the node
Labels	Node

ucp_engine_cpu_total_time_nanoseconds¶

Units	Nanoseconds
Description	System CPU time used by the container
Labels	Container

ucp_engine_disk_free_bytes¶

Units	Bytes
Description	Free disk space on the Docker root directory on the node, in bytes. This metric is not available to Windows nodes
Labels	Node

ucp_engine_disk_total_bytes¶

Units	Bytes
Description	Total disk space on the Docker root directory on this node in bytes. Note that the `ucp_engine_disk_free_bytes` metric is not available for Windows nodes
Labels	Node

ucp_engine_images¶

Units	Number of images
Description	Total number of images on the node
Labels	Node

ucp_engine_memory_total_bytes¶

Units	Bytes
Description	Total amount of memory on the node
Labels	Node

ucp_engine_networks¶

Units	Number of networks
Description	Total number of networks on the node
Labels	Node

ucp_engine_num_cpu_cores¶

Units	Number of cores
Description	Number of CPU cores on the node
Labels	Node

ucp_engine_volumes¶

Units	Number of volumes
Description	Total number of volumes on the node
Labels	Node

Controller-sourced MKE metrics¶

The metrics that MKE exports from controllers are cluster-scoped (for example, the total number of Swarm services).

The tables below offer detail on the controller-sourced metrics that MKE exposes in Prometheus with the ucp_ prefix.

ucp_controller_services¶

Units	Number of services
Description	Total number of Swarm services
Labels	Not applicable

ucp_engine_node_health¶

Units	0.0 or 1.0
Description	Health status of the node, as determined by MKE
Labels	nodeName: node name, nodeAddr: node IP address

ucp_engine_pod_container_ready¶

Units	0.0 or 1.0
Description	Readiness of the container in a Kubernetes pod, as determined by its readiness probe
Labels	Pod

ucp_engine_pod_ready¶

Units	0.0 or 1.0
Description	Readiness of the Kubernetes pod, as determined by the readiness probes of its containers
Labels	Pod

See also

Kubernetes Pods

MKE cAdvisor metrics¶

Once you have enabled cAdvisor and generated an auth token, you can issue the following command to access the cAdvisor metrics:

curl -sk -H \
"Authorization: Bearer $AUTHTOKEN" \
"$<mke_url>/metricsservice/query?query=$<mke_specific_metric>\[$<time_duration>\]"

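As an illustration of how the placeholders fit together, the following sketch assembles the query URL from its three parts. metrics_query_url is a hypothetical helper, and the host name, metric, and duration in the example are assumptions chosen for demonstration.

```shell
#!/bin/sh
# Build the metrics query URL for a range selector such as metric[5m].
# $1 = MKE base URL, $2 = metric name, $3 = time duration
metrics_query_url() {
    # The brackets stay backslash-escaped, as in the command above, so that
    # curl does not interpret them as a URL glob. A trailing slash on the
    # base URL is dropped.
    printf '%s/metricsservice/query?query=%s\\[%s\\]\n' "${1%/}" "$2" "$3"
}

# Example:
#   curl -sk -H "Authorization: Bearer $AUTHTOKEN" \
#       "$(metrics_query_url https://mke.example.com container_memory_usage_bytes 5m)"
```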
The Prometheus container metrics exposed by cAdvisor are presented below:

cadvisor_version_info¶

Units	N/A
Description	A metric with a constant `1` value that is labeled by kernel version, OS version, Docker version, cAdvisor version and cAdvisor revision.
Labels	`cadvisorRevision`, `cadvisorVersion`, `instance`, `job`, `kernelVersion`, `osVersion`

container_blkio_device_usage_total¶

Units	bytes
Description	The Block I/O (blkio) device bytes usage.
Labels	`container`, `device`, `id`, `image`, `instance`, `job`, `major`, `minor`, `name`, `namespace`, `operation`, `pod`

container_cpu_system_seconds_total¶

Value	seconds
Description	Cumulative system CPU time consumed.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_cpu_usage_seconds_total¶

Value	seconds
Description	Cumulative CPU time consumed.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_cpu_user_seconds_total¶

Value	seconds
Description	Cumulative user CPU time consumed.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_fs_reads_bytes_total¶

Units	bytes
Description	Cumulative count of bytes read.
Labels	`container`, `device`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_fs_reads_total¶

Value	integer
Description	Cumulative count of reads completed.
Labels	`container`, `device`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_fs_writes_bytes_total¶

Units	bytes
Description	Cumulative count of bytes written.
Labels	`container`, `device`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_fs_writes_total¶

Value	integer
Description	Cumulative count of writes completed.
Labels	`container`, `device`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_last_seen¶

Units	timestamp
Description	Last time a container was seen by the exporter.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_cache¶

Units	bytes
Description	Total page cache memory.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_failcnt¶

Value	integer
Description	Number of memory usage hits limits.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_failures_total¶

Value	integer
Description	Cumulative count of memory allocation failures.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_mapped_file¶

Units	bytes
Description	Size of memory mapped files.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_max_usage_bytes¶

Units	bytes
Description	Maximum memory usage recorded.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_rss¶

Units	bytes
Description	Size of RSS.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_swap¶

Units	bytes
Description	Container swap usage.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_usage_bytes¶

Units	bytes
Description	Current memory usage, including all memory regardless of when it was accessed.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_memory_working_set_bytes¶

Units	bytes
Description	Current working set.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_receive_bytes_total¶

Units	bytes
Description	Cumulative count of bytes received.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_receive_errors_total¶

Value	integer
Description	Cumulative count of errors encountered while receiving.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_receive_packets_dropped_total¶

Value	integer
Description	Cumulative count of packets dropped while receiving.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_receive_packets_total¶

Value	integer
Description	Cumulative count of packets received.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_transmit_bytes_total¶

Units	bytes
Description	Cumulative count of bytes transmitted.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_transmit_errors_total¶

Value	integer
Description	Cumulative count of errors encountered while transmitting.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_transmit_packets_dropped_total¶

Value	integer
Description	Cumulative count of packets dropped while transmitting.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_network_transmit_packets_total¶

Value	integer
Description	Cumulative count of packets transmitted.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_scrape_error¶

Units	N/A
Description	`1` if an error occurred while container metrics were being obtained, otherwise `0`.
Labels	`instance`, `job`

container_spec_cpu_period¶

Units	N/A
Description	CPU period of the container.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_spec_cpu_shares¶

Units	N/A
Description	CPU share of the container.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_spec_memory_limit_bytes¶

Units	bytes
Description	Memory limit for the container.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_spec_memory_reservation_limit_bytes¶

Units	bytes
Description	Memory reservation limit for the container.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_spec_memory_swap_limit_bytes¶

Units	bytes
Description	Memory swap limit for the container.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

container_start_time_seconds¶

Value	seconds
Description	Start time of the container since unix epoch.
Labels	`container`, `id`, `image`, `instance`, `job`, `name`, `namespace`, `pod`

machine_cpu_cores¶

Value	integer
Description	Number of logical CPU cores.
Labels	`boot_id`, `instance`, `job`, `machine_id`, `system_uuid`

machine_cpu_physical_cores¶

Value	integer
Description	Number of physical CPU cores.
Labels	`boot_id`, `instance`, `job`, `machine_id`, `system_uuid`

machine_cpu_sockets¶

Value	integer
Description	Number of CPU sockets.
Labels	`boot_id`, `instance`, `job`, `machine_id`, `system_uuid`

machine_memory_bytes¶

Units	bytes
Description	Amount of memory installed on the machine.
Labels	`boot_id`, `instance`, `job`, `machine_id`, `system_uuid`

machine_nvm_avg_power_budget_watts¶

Units	watts
Description	NVM power budget.
Labels	`boot_id`, `instance`, `job`, `machine_id`, `system_uuid`

machine_nvm_capacity¶

Units	bytes
Description	NVM capacity value, labeled by NVM mode (memory mode or app direct mode).
Labels	`boot_id`, `instance`, `job`, `machine_id`, `system_uuid`

machine_scrape_error¶

Value	integer
Description	`1` if an error occurred while machine metrics were being obtained, otherwise `0`.
Labels	`instance`, `job`

Deploy Prometheus on worker nodes¶

MKE deploys Prometheus by default on the manager nodes to provide a built-in metrics backend. For cluster sizes over 100 nodes, or if you need to scrape metrics from Prometheus instances, Mirantis recommends that you deploy Prometheus on dedicated worker nodes in the cluster.

To deploy Prometheus on worker nodes:

Source an admin bundle.

Verify that ucp-metrics pods are running on all managers:

$ kubectl -n kube-system get pods -l k8s-app=ucp-metrics -o wide

NAME               READY  STATUS   RESTARTS  AGE  IP            NODE
ucp-metrics-hvkr7  3/3    Running  0         4h   192.168.80.66 3a724a-0

Add a Kubernetes node label to one or more workers. For example, add a label with the key ucp-metrics and the value "" to the node named 3a724a-1:
```
$ kubectl label node 3a724a-1 ucp-metrics=

node "3a724a-1" labeled
```
SELinux Prometheus Deployment

If you use SELinux, label your ucp-node-certs directories properly on the worker nodes before you move the ucp-metrics workload to them. To run ucp-metrics on a worker node, update the ucp-node-certs label by running:

sudo chcon -R system_u:object_r:container_file_t:s0 /var/lib/docker/volumes/ucp-node-certs/_data

Patch the ucp-metrics DaemonSet’s nodeSelector with the same key and value in use for the node label. This example shows the key ucp-metrics and the value "".

$ kubectl -n kube-system patch daemonset ucp-metrics --type json -p \
'[{"op": "replace", "path": "/spec/template/spec/nodeSelector", "value": {"ucp-metrics": ""}}]'

daemonset "ucp-metrics" patched

Confirm that ucp-metrics pods are running only on the labeled workers.

$ kubectl -n kube-system get pods -l k8s-app=ucp-metrics -o wide

NAME               READY  STATUS       RESTARTS  AGE IP           NODE
ucp-metrics-88lzx  3/3    Running      0         12s 192.168.83.1 3a724a-1
ucp-metrics-hvkr7  3/3    Terminating  0         4h 192.168.80.66 3a724a-0
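The node label and the DaemonSet nodeSelector must use the same key and value. One way to keep them in step is to generate the JSON patch from the same two variables that you use for the label. The sketch below assumes a hypothetical helper name, node_selector_patch, and does not escape special characters in the key or value.

```shell
#!/bin/sh
# Print a JSON patch that replaces the nodeSelector with a single key/value.
# $1 = label key, $2 = label value (may be empty)
node_selector_patch() {
    printf '[{"op": "replace", "path": "/spec/template/spec/nodeSelector", "value": {"%s": "%s"}}]\n' "$1" "$2"
}

# Example (requires a sourced admin bundle):
#   key=ucp-metrics
#   value=""
#   kubectl label node 3a724a-1 "$key=$value"
#   kubectl -n kube-system patch daemonset ucp-metrics --type json \
#       -p "$(node_selector_patch "$key" "$value")"
```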

Configure external Prometheus to scrape metrics from MKE¶

To configure your external Prometheus server to scrape metrics from Prometheus in MKE:

Source an admin bundle.

Create a Kubernetes secret that contains your bundle TLS material.

(cd $DOCKER_CERT_PATH && kubectl create secret generic prometheus --from-file=ca.pem --from-file=cert.pem --from-file=key.pem)
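The secret is only useful if all three files are present in the bundle directory. As an optional safeguard, a check along the following lines can run before you create the secret; check_bundle is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds only if ca.pem, cert.pem, and key.pem all exist in the directory.
# $1 = client bundle directory, for example "$DOCKER_CERT_PATH"
check_bundle() {
    missing=0
    for f in ca.pem cert.pem key.pem; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   check_bundle "$DOCKER_CERT_PATH" || echo "Source the admin bundle first"
```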

Create a Prometheus deployment and ClusterIP service using YAML.

Important

To run Prometheus external to MKE, change the domain for the inventory container in the Prometheus deployment from ucp-controller.kube-system.svc.cluster.local to an external domain, to access MKE from the Prometheus node.

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
data:
  prometheus.yaml: |
    global:
      scrape_interval: 10s
    scrape_configs:
    - job_name: 'ucp'
      tls_config:
        ca_file: /bundle/ca.pem
        cert_file: /bundle/cert.pem
        key_file: /bundle/key.pem
        server_name: proxy.local
      scheme: https
      file_sd_configs:
      - files:
        - /inventory/inventory.json
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: inventory
        image: alpine
        command: ["sh", "-c"]
        args:
        - apk add --no-cache curl &&
          while :; do
            curl -Ss --cacert /bundle/ca.pem --cert /bundle/cert.pem --key /bundle/key.pem --output /inventory/inventory.json https://ucp-controller.kube-system.svc.cluster.local/metricsdiscovery;
            sleep 15;
          done
        volumeMounts:
        - name: bundle
          mountPath: /bundle
        - name: inventory
          mountPath: /inventory
      - name: prometheus
        image: prom/prometheus
        command: ["/bin/prometheus"]
        args:
        - --config.file=/config/prometheus.yaml
        - --storage.tsdb.path=/prometheus
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --web.console.templates=/etc/prometheus/consoles
        volumeMounts:
        - name: bundle
          mountPath: /bundle
        - name: config
          mountPath: /config
        - name: inventory
          mountPath: /inventory
      volumes:
      - name: bundle
        secret:
          secretName: prometheus
      - name: config
        configMap:
          name: prometheus
      - name: inventory
        emptyDir:
          medium: Memory
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  ports:
  - port: 9090
    targetPort: 9090
  selector:
    app: prometheus
  sessionAffinity: ClientIP
EOF

Determine the service ClusterIP:

$ kubectl get service prometheus

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
prometheus   ClusterIP   10.96.254.107   <none>        9090/TCP   1h

Forward port 9090 on the local host to the ClusterIP. The tunnel you create does not need to be kept alive as its only purpose is to expose the Prometheus UI.
```
ssh -L 9090:10.96.254.107:9090 ANY_NODE
```
Visit http://127.0.0.1:9090 to explore the MKE metrics that Prometheus is collecting.
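With the tunnel open, you can also query Prometheus from the command line through its HTTP API. The sketch below is illustrative: prom_query, run, PROM_URL, and DRY_RUN are hypothetical names, and the metric in the example is one of the node-sourced metrics listed earlier. With DRY_RUN=1 the curl command is printed instead of executed.

```shell
#!/bin/sh
# Base URL of the Prometheus UI exposed through the SSH tunnel.
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Send an instant query; curl URL-encodes the expression for us.
# $1 = PromQL expression
prom_query() {
    run curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Example:
#   prom_query ucp_engine_containers
```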

Set up Grafana with MKE Prometheus¶

Important

The information offered herein on how to set up a Grafana instance connected to MKE Prometheus is derived from the official Deploy Grafana on Kubernetes documentation and modified to work with MKE. As it deploys Grafana with default credentials, Mirantis strongly recommends that you adjust the configuration detail to meet your specific needs prior to deploying Grafana with MKE in a production environment.

Source an MKE admin bundle.
Create the monitoring namespace in which you will deploy Grafana:
```
kubectl create namespace monitoring
```

Obtain the MKE cluster ID:

CLUSTER_ID=$(docker info --format '{{json .Swarm.Cluster.ID}}')

Apply the following YAML file to deploy Grafana in the monitoring namespace and to automatically configure MKE Prometheus as a data source:

kubectl apply -f - <<EOF
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        runAsUser: 0
      containers:
        - name: grafana
          image: grafana/grafana:9.1.0-ubuntu
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /etc/grafana/
              name: grafana-config-volume
            - mountPath: /etc/ssl
              name: ucp-node-certs
      volumes:
        - name: grafana-config-volume
          configMap:
            name: grafana-config
            items:
              - key: grafana.ini
                path: grafana.ini
              - key: dashboard.json
                path: dashboard.json
              - key: datasource.yml
                path: provisioning/datasources/datasource.yml
        - name: ucp-node-certs
          hostPath:
            path: /var/lib/docker/volumes/ucp-node-certs/_data
      nodeSelector:
        node-role.kubernetes.io/master: ""
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-config
  namespace: monitoring
  labels:
    grafana_datasource: '1'
data:
  grafana.ini: |
  dashboard.json: |
  datasource.yml: |-
    apiVersion: 1
    datasources:
    - name: mke-prometheus
      type: prometheus
      access: proxy
      orgId: 1
      url: https://ucp-metrics.kube-system.svc.cluster.local:443
      jsonData:
        tlsAuth: true
        tlsAuthWithCACert: false
        serverName: $CLUSTER_ID
      secureJsonData:
        tlsClientCert: "\$__file{/etc/ssl/cert.pem}"
        tlsClientKey: "\$__file{/etc/ssl/key.pem}"
---
EOF

Use port forwarding to access the Grafana UI. Be aware that this may require that you install socat on your manager nodes.
```
kubectl port-forward service/grafana 3000:3000 -n monitoring
```

You can now navigate to the Grafana UI at http://localhost:3000/, which has the MKE Prometheus data source installed. Log in initially using admin for both the user name and password, and change your credentials after you log in successfully.

Configure native Kubernetes role-based access control¶

MKE uses native Kubernetes RBAC, which is active by default for Kubernetes clusters. The YAML files of many ecosystem applications and integrations use Kubernetes RBAC to access service accounts. Also, organizations looking to run MKE both on-premises and in hosted cloud services want to run Kubernetes applications in both environments without having to manually change RBAC in their YAML files.

Note

Kubernetes and Swarm roles have separate views. Using the MKE web UI, you can view all the roles for a particular cluster:

Click Access Control in the navigation menu at the left.
Click Roles.
Select the Kubernetes tab or the Swarm tab to view the specific roles for each.

Create a Kubernetes role¶

You can create Kubernetes roles either through the CLI, using the Kubernetes kubectl tool, or through the MKE web UI.

To create a Kubernetes role using the MKE web UI:

Log in to the MKE web UI.
In the navigation menu at the left, click Access Control to display the available options.
Click Roles.
At the top of the details pane, click the Kubernetes tab.
Click Create to open the Create Kubernetes Object page.
Click Namespace to select a namespace for the role from one of the available options.
Provide the YAML file for the role. To do this, either enter it in the Object YAML editor, or upload an existing .yml file using the Click to upload a .yml file selection link at the right.
Click Create to complete role creation.
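
As an illustration of the kind of object you might supply in the Object YAML editor, the following minimal sketch builds a namespaced Role that grants read-only access to pods and prints it as JSON, which is also valid YAML. The role name pod-reader, the default namespace, and the build_role helper are hypothetical and are not taken from MKE.

```python
import json


def build_role(name, namespace, resources, verbs):
    """Return a Kubernetes Role manifest as a plain dict."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                # "" selects the core API group, where pods live.
                "apiGroups": [""],
                "resources": list(resources),
                "verbs": list(verbs),
            }
        ],
    }


# Hypothetical example: read-only access to pods in the default namespace.
role = build_role("pod-reader", "default", ["pods"], ["get", "list", "watch"])
print(json.dumps(role, indent=2))
```

You can paste the printed output into the editor, or save it to a file and upload it.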

Create a Kubernetes role grant¶

Kubernetes provides two types of role grants:

ClusterRoleBinding (applies to all namespaces)
RoleBinding (applies to a specific namespace)
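
The difference between the two grant types can be sketched as manifests. The following example is an assumption-laden illustration rather than MKE tooling: the build_binding helper, the binding names, and the user alice are hypothetical, and the built-in view ClusterRole is used as the referenced role in both cases.

```python
import json


def build_binding(name, role_name, user, namespace=None):
    """Return a RoleBinding if a namespace is given, else a ClusterRoleBinding."""
    cluster_wide = namespace is None
    manifest = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding" if cluster_wide else "RoleBinding",
        "metadata": {"name": name},
        "subjects": [
            {"kind": "User", "name": user, "apiGroup": "rbac.authorization.k8s.io"}
        ],
        "roleRef": {
            # Both binding kinds may reference a ClusterRole.
            "kind": "ClusterRole",
            "name": role_name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    if not cluster_wide:
        # Only a RoleBinding is scoped to a namespace.
        manifest["metadata"]["namespace"] = namespace
    return manifest


namespaced = build_binding("alice-view", "view", "alice", namespace="default")
cluster = build_binding("alice-view-all", "view", "alice")
print(json.dumps(namespaced, indent=2))
print(json.dumps(cluster, indent=2))
```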

To create a grant for a Kubernetes role in the MKE web UI:

Log in to the MKE web UI.
In the navigation menu at the left, click Access Control to display the available options.
Click the Grants option.
At the top of the details pane, click the Kubernetes tab. All existing grants to Kubernetes roles are present in the details pane.
Click Create Role Binding to open the Create Role Binding page.
Select the subject type at the top of the 1. Subject section (Users, Organizations, or Service Account).
Create a role binding for the selected subject type:
- Users: Select a type from the User drop-down list.
- Organizations: Select a type from the Organization drop-down list. Optionally, you can also select a team using the Team(optional) drop-down list, if any have been established.
- Service Account: Select a NAMESPACE from the Namespace drop-down list, then a type from the Service Account drop-down list.
Click Next to activate the 2. Resource Set section.
Select a resource set for the subject.

By default, the default namespace is indicated. To use a different namespace, click the Select Namespace button associated with the desired namespace.

For ClusterRoleBinding, slide the Apply Role Binding to all namespace (Cluster Role Binding) selector to the right.
Click Next to activate the 3. Role section.
Select the role type.
- Role
- Cluster Role
Note

Cluster Role type is the only role type available if you enabled Apply Role Binding to all namespace (Cluster Role Binding) in the 2. Resource Set section.
Select the role from the drop-down list.
Click Create to complete grant creation.

MKE audit logging¶

Audit logs are a chronological record of security-relevant activities by individual users, administrators, or software components that have had an effect on an MKE system. They focus on external user/agent actions and security, rather than attempting to understand state or events of the system itself.

Audit logs capture all HTTP actions (GET, PUT, POST, PATCH, DELETE) invoked on MKE API, Swarm API, and Kubernetes API endpoints, with the exception of those on the ignored list. The captured entries are sent to Mirantis Container Runtime via stdout.

The benefits that audit logs provide include:

Historical troubleshooting: You can use audit logs to determine a sequence of past events that can help explain why an issue occurred.
Security analysis and auditing: A full record of all user interactions with the container infrastructure can provide your security team with the visibility necessary to root out questionable or unauthorized access attempts.
Chargeback: Use the audit log records of resource use to generate chargeback information.
Alerting: With a watch on an event stream or a notification the event creates, you can build alerting features on top of event tools that generate alerts for ops teams (PagerDuty, OpsGenie, Slack, or custom solutions).

Logging levels¶

MKE provides three levels of audit logging to administrators:

None	Audit logging is disabled.
Metadata	Includes:
- Method and API endpoint for the request
- MKE user who made the request
- Response status (success or failure)
- Timestamp of the call
- Object ID of any created or updated resource (for create or update API calls). Names of created or updated resources are not included.
- License key
- Remote address
Request	Includes all fields from the Metadata level, as well as the request payload.

Once you enable MKE audit logging, the audit logs are collected in the container logs of the ucp-controller container on each MKE manager node.

Note

Be sure to configure a logging driver with log rotation set, as audit logging can generate a large amount of data.

Enable MKE audit logging¶

Note

Enabling auditing in MKE does not automatically enable auditing of Kubernetes objects. To do this, you must set the kube_api_server_auditing parameter in the MKE configuration file to true.

Once you have set the kube_api_server_auditing parameter to true, the following default auditing values are configured on the Kubernetes API server:

--audit-log-maxage: 30
--audit-log-maxbackup: 10
--audit-log-maxsize: 10

For information on how to enable and configure the Kubernetes API server audit values, refer to the cluster_config table detail in the MKE configuration file.

You can enable MKE audit logging using the MKE web user interface, the MKE API, or the MKE configuration file.

Enable MKE audit logging using the web UI
Enable MKE audit logging using the API
Enable MKE audit logging using the configuration file

Enable MKE audit logging using the web UI¶

Log in to the MKE web user interface.
Click admin to open the navigation menu at the left.
Click Admin Settings.
Click Logs & Audit Logs to open the Logs & Audit Logs details pane.
In the Configure Audit Log Level section, select the relevant logging level.
Click Save.

Enable MKE audit logging using the API¶

Download the MKE client bundle from the command line, as described in Download the client bundle.

Retrieve the JSON file for current audit log configuration:

export DOCKER_CERT_PATH=~/ucp-bundle-dir/
curl --cert ${DOCKER_CERT_PATH}/cert.pem --key ${DOCKER_CERT_PATH}/key.pem --cacert ${DOCKER_CERT_PATH}/ca.pem -k -X GET https://ucp-domain/api/ucp/config/logging > auditlog.json

In auditlog.json, set the auditLevel field to metadata or request:

{
    "logLevel": "INFO",
    "auditLevel": "metadata",
    "supportDumpIncludeAuditLogs": false
}

Send the JSON request for the audit logging configuration with the same API path, but using the PUT method:

curl --cert ${DOCKER_CERT_PATH}/cert.pem --key ${DOCKER_CERT_PATH}/key.pem \
  --cacert ${DOCKER_CERT_PATH}/ca.pem -k -H "Content-Type: application/json" \
  -X PUT --data "$(cat auditlog.json)" \
  https://ucp-domain/api/ucp/config/logging
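
If you prefer to script the edit to auditlog.json rather than change it by hand, a minimal sketch along the following lines may help. The set_audit_level helper is hypothetical, and the sample input (including the empty auditLevel value) is an assumed stand-in for whatever your cluster returns. Only the two levels named in the step above are accepted.

```python
import json

# Levels named in the step above; any other value is rejected by this sketch.
ALLOWED_LEVELS = {"metadata", "request"}


def set_audit_level(config_text, level):
    """Return the logging configuration JSON with auditLevel set to `level`."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported audit level: {level!r}")
    config = json.loads(config_text)
    config["auditLevel"] = level
    return json.dumps(config, indent=4)


# Assumed sample of a retrieved configuration, not real cluster output.
current = '{"logLevel": "INFO", "auditLevel": "", "supportDumpIncludeAuditLogs": false}'
updated = set_audit_level(current, "request")
print(updated)
```

All other fields are passed through unchanged, so the result can be sent back with the PUT request as is.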

Enable MKE audit logging using the configuration file¶

You can enable MKE audit logging using the MKE configuration file before or after MKE installation.

The section of the MKE configuration file that controls MKE audit logging is [audit_log_configuration]:

[audit_log_configuration]
  level = "metadata"
  support_dump_include_audit_logs = false

The level setting supports the following values:

""
"metadata"
"request"

Caution

The support_dump_include_audit_logs flag specifies whether user identification information from the ucp-controller container logs is included in the support bundle. To prevent this information from being sent with the support bundle, verify that support_dump_include_audit_logs is set to false. When disabled, the support bundle collection tool filters out any lines from the ucp-controller container logs that contain the substring auditID.
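
The filtering behavior described above can be approximated as follows. This is only a sketch of the stated rule (drop any line that contains the substring auditID), not the implementation of the support bundle collection tool. The drop_audit_lines helper and the sample log lines are hypothetical.

```python
def drop_audit_lines(lines):
    """Yield only the log lines that do not contain the substring auditID."""
    for line in lines:
        if "auditID" not in line:
            yield line


# Hypothetical ucp-controller log lines for illustration.
sample = [
    '{"level":"info","msg":"controller started"}',
    '{"audit":{"auditID":"f8ce4684","verb":"GET"},"msg":"audit"}',
    '{"level":"warn","msg":"slow request"}',
]
kept = list(drop_audit_lines(sample))
print(len(kept))  # 2
```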

Access audit logs using the Docker CLI¶

The audit logs are exposed through the ucp-controller logs. You can access these logs locally through the Docker CLI.

Note

You can also access MKE audit logs using an external container logging solution, such as ELK.

To access audit logs using the Docker CLI:

Source an MKE client bundle.

Run docker logs to obtain audit logs.

The following example uses the --tail option to show only the last log entry.

$ docker logs ucp-controller --tail 1

{"audit":{"auditID":"f8ce4684-cb55-4c88-652c-d2ebd2e9365e","kind":"docker-swarm","level":"metadata","metadata":{"creationTimestamp":null},"requestReceivedTimestamp":"2019-01-30T17:21:45.316157Z","requestURI":"/metricsservice/query?query=(%20(sum%20by%20(instance)%20(ucp_engine_container_memory_usage_bytes%7Bmanager%3D%22true%22%7D))%20%2F%20(sum%20by%20(instance)%20(ucp_engine_memory_total_bytes%7Bmanager%3D%22true%22%7D))%20)%20*%20100\u0026time=2019-01-30T17%3A21%3A45.286Z","sourceIPs":["172.31.45.250:48516"],"stage":"RequestReceived","stageTimestamp":null,"timestamp":null,"user":{"extra":{"licenseKey":["FHy6u1SSg_U_Fbo24yYUmtbH-ixRlwrpEQpdO_ntmkoz"],"username":["admin"]},"uid":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a","username":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a"},"verb":"GET"},"level":"info","msg":"audit","time":"2019-01-30T17:21:45Z"}
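
Because each audit entry is a single JSON object, it is straightforward to pull out the fields that matter for troubleshooting. The following sketch assumes the field layout of the entry above. The summarize_audit_entry helper is hypothetical, and the sample line is a shortened copy of that entry with the requestURI abbreviated for readability.

```python
import json


def summarize_audit_entry(line):
    """Extract who did what from one ucp-controller audit log line."""
    audit = json.loads(line)["audit"]
    user = audit["user"]
    # The readable name sits in user.extra.username; fall back to the UID-style username.
    names = user.get("extra", {}).get("username") or [user.get("username")]
    return {
        "auditID": audit["auditID"],
        "user": names[0],
        "verb": audit["verb"],
        "requestURI": audit["requestURI"],
        "sourceIPs": audit.get("sourceIPs", []),
    }


# Shortened copy of the entry above (requestURI abbreviated).
line = (
    '{"audit":{"auditID":"f8ce4684-cb55-4c88-652c-d2ebd2e9365e","kind":"docker-swarm",'
    '"level":"metadata","requestURI":"/metricsservice/query","sourceIPs":["172.31.45.250:48516"],'
    '"user":{"extra":{"username":["admin"]},"username":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a"},'
    '"verb":"GET"},"level":"info","msg":"audit","time":"2019-01-30T17:21:45Z"}'
)
summary = summarize_audit_entry(line)
print(summary["user"], summary["verb"], summary["requestURI"])
```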

Sample audit log for a Kubernetes cluster:

{"audit": {
      "metadata": {...},
      "level": "Metadata",
      "timestamp": "2018-08-07T22:10:35Z",
      "auditID": "7559d301-fa6b-4ad6-901c-b587fab75277",
      "stage": "RequestReceived",
      "requestURI": "/api/v1/namespaces/default/pods",
      "verb": "list",
      "user": {"username": "alice",...},
      "sourceIPs": ["127.0.0.1"],
      ...,
      "requestReceivedTimestamp": "2018-08-07T22:10:35.428850Z"}}

Sample audit log for a Swarm cluster:

{"audit": {
      "metadata": {...},
      "level": "Metadata",
      "timestamp": "2018-08-07T22:10:35Z",
      "auditID": "7559d301-94e7-4ad6-901c-b587fab31512",
      "stage": "RequestReceived",
      "requestURI": "/v1.30/configs/create",
      "verb": "post",
      "user": {"username": "alice",...},
      "sourceIPs": ["127.0.0.1"],
      ...,
      "requestReceivedTimestamp": "2018-08-07T22:10:35.428850Z"}}

API endpoints logging constraints¶

A number of MKE API endpoints are either ignored by audit logging or have their information redacted from the audit logs.

API endpoints ignored¶

The following API endpoints are ignored, as they are not considered security events and can create a large number of log entries:

/_ping
/ca
/auth
/trustedregistryca
/kubeauth
/metrics
/info
/version*
/debug
/openid_keys
/apidocs
/kubernetesdocs
/manage

API endpoints information redacted¶

For security purposes, information for the following API endpoints is redacted from the audit logs:

/secrets/create (POST)
/secrets/{id}/update (POST)
/swarm/join (POST)
/swarm/update (POST)
/auth/login (POST)
Kubernetes secrets create/update endpoints

Enable MKE telemetry¶

You can set MKE to automatically record and transmit data to Mirantis through an encrypted channel for monitoring and analysis purposes. The data collected provides the Mirantis Customer Success Organization with information that helps us to better understand the operational use of MKE by our customers. It also provides key feedback in the form of product usage statistics, which enable our product teams to enhance Mirantis products and services.

Specifically, with MKE you can send hourly usage reports, as well as information on API and UI usage.

Caution

To send the telemetry, verify that dockerd and the MKE application container can resolve api.segment.io and create a TCP (HTTPS) connection on port 443.

To enable telemetry in MKE:

Log in to the MKE web UI as an administrator.
At the top of the navigation menu at the left, click the user name drop-down to display the available options.
Click Admin Settings to display the available options.
Click Usage to open the Usage Reporting screen.
Toggle the Enable API and UI tracking slider to the right.
(Optional) Enter a unique label to identify the cluster in the usage reporting.
Click Save.

Enable and integrate SAML authentication¶

Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between parties. It is commonly supported by enterprise authentication systems. SAML-based single sign-on (SSO) gives you access to MKE through a SAML 2.0-compliant identity provider.

MKE supports the Okta and ADFS identity providers.

The SAML integration process is as follows.

Configure the Identity Provider (IdP).
Enable SAML and configure MKE as the Service Provider under Admin Settings > Authentication and Authorization.
Create (Edit) Teams to link with the Group memberships. This updates team membership information when a user signs in with SAML.

Note

If LDAP integration is enabled, refer to Use LDAP in conjunction with SAML for information on using SAML in parallel with LDAP.

Configure SAML integration on identity provider¶

Identity providers require certain values to successfully integrate with MKE. As these values vary depending on the identity provider, consult your identity provider documentation for instructions on how to best provide the needed information.

Okta integration values¶

Okta integration requires the following values:

Value	Description
URL for single sign-on (SSO)	URL for MKE, qualified with `/enzi/v0/saml/acs`. For example, `https://111.111.111.111/enzi/v0/saml/acs`.
Service provider audience URI	URL for MKE, qualified with `/enzi/v0/saml/metadata`. For example, `https://111.111.111.111/enzi/v0/saml/metadata`.
NameID format	Select Unspecified.
Application user name	Email. For example, a custom `${f:substringBefore(user.email, "@")}` specifies the user name portion of the email address.
Attribute Statements	Name: `fullname` Value: `user.displayName`
Group Attribute Statement	Name: `member-of` Filter: (user defined) for associating group membership. The group name is returned with the assertion. Name: `is-admin` Filter: (user defined) for identifying whether the user is an admin.

Okta configuration

When two or more group names are expected to return with the assertion, use the regex filter. For example, use the value apple|orange to return groups apple and orange.
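
To see how an alternation pattern such as apple|orange selects group names, consider the following sketch. It uses Python's re module and assumes that the filter must match the whole group name. Okta's own regex engine and matching rules may differ, so treat this only as a way to reason about the pattern. The matching_groups helper and the sample group names are hypothetical.

```python
import re


def matching_groups(pattern, groups):
    """Return the group names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in groups if compiled.fullmatch(name)]


selected = matching_groups("apple|orange", ["apple", "banana", "orange", "pineapple"])
print(selected)  # ['apple', 'orange']
```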

ADFS integration values¶

To enable ADFS integration:

Add a relying party trust.
Obtain the service provider metadata URI.

The service provider metadata URI value is the URL for MKE, qualified with /enzi/v0/saml/metadata. For example, https://111.111.111.111/enzi/v0/saml/metadata.
Add claim rules.
1. Convert values from AD to SAML
  - Display-name : Common Name
  - E-Mail-Addresses : E-Mail Address
  - SAM-Account-Name : Name ID
2. Create a full name for MKE (custom rule):
```
c:[Type == "http://schemas.xmlsoap.org/claims/CommonName"]
  => issue(Type = "fullname", Issuer = c.Issuer, OriginalIssuer = c.OriginalIssuer, Value = c.Value, ValueType = c.ValueType);
```
3. Transform account name to Name ID:
  - Incoming type: Name ID
  - Incoming format: Unspecified
  - Outgoing claim type: Name ID
  - Outgoing format: Transient ID
4. Pass admin value to allow admin access based on AD group. Send group membership as claim:
  - Users group: your admin group
  - Outgoing claim type: is-admin
  - Outgoing claim value: 1
5. Configure group membership for more complex organizations, with multiple groups able to manage access.
  - Send LDAP attributes as claims
  - Attribute store: Active Directory
    - Add two rows with the following information:
      - LDAP attribute = email address; outgoing claim type: email address
      - LDAP attribute = Display-Name; outgoing claim type: common name
  - Mapping:
    - Token-Groups - Unqualified Names : member-of

Note

Once you enable SAML, Service Provider metadata is available at https://<SP Host>/enzi/v0/saml/metadata. The metadata link is also labeled as entityID.

Only POST binding is supported for the Assertion Consumer Service, which is located at https://<SP Host>/enzi/v0/saml/acs.

Configure SAML integration on MKE¶

SAML configuration requires that you know the metadata URL for your chosen identity provider, as well as the URL for the MKE host that contains the IP address or domain of your MKE installation.

To configure SAML integration on MKE:

Log in to the MKE web UI.
In the navigation menu at the left, click the user name drop-down to display the available options.
Click Admin Settings to display the available options.
Click Authentication & Authorization.
In the Identity Provider section in the details pane, move the slider next to SAML to enable the SAML settings.
In the SAML IdP Server subsection, enter the URL for the identity provider metadata in the IdP Metadata URL field.
Note

If the metadata URL is publicly certified, you can continue with the default settings:
- Skip TLS Verification unchecked
- Root Certificates Bundle blank
Mirantis recommends TLS verification in production environments. If the metadata URL cannot be certified by the default certificate authority store, you must provide the certificates from the identity provider in the Root Certificates Bundle field.
In the SAML Service Provider subsection, in the MKE Host field, enter the URL that includes the IP address or domain of your MKE installation.

The port number is optional. The current IP address or domain displays by default.
(Optional) Customize the text of the sign-in button by entering the text for the button in the Customize Sign In Button Text field. By default, the button text is Sign in with SAML.
Copy the SERVICE PROVIDER METADATA URL, the ASSERTION CONSUMER SERVICE (ACS) URL, and the SINGLE LOGOUT (SLO) URL to paste into the identity provider workflow.
Click Save.

Note

To configure a service provider, enter the Identity Provider’s metadata URL to obtain its metadata. To access the URL, you may need to provide the CA certificate that can verify the remote server.
To link group membership with users, use the Edit or Create team dialog to associate the SAML group assertion with the MKE team. This synchronizes user team membership when the user logs in.

SAML security considerations¶

From the MKE web UI you can download a client bundle with which you can access MKE using the CLI and the API.

A client bundle is a group of certificates that enable command-line access and API access to the software. It lets you authorize a remote Docker engine to access specific user accounts that are managed in MKE, absorbing all associated RBAC controls in the process. Once you obtain the client bundle, you can execute Docker Swarm commands from your remote machine to take effect on the remote cluster.

Previously-authorized client bundle users can still access MKE, regardless of the newly configured SAML access controls.

Mirantis recommends that you take the following steps to ensure that access from the client bundle is in sync with the identity provider, and thus to prevent any previously-authorized users from accessing MKE through their existing client bundle:

Remove the user account from MKE that grants the client bundle access.
If group membership in the identity provider changes, replicate the change in MKE.
Continue using LDAP to sync group membership.

To download the client bundle:

Log in to the MKE web UI.
In the navigation menu at the left, click the user name drop-down to display the available options.
Click your account name to display the available options.
Click My Profile.
Click the New Client Bundle drop-down in the details pane and select Generate Client Bundle.
(Optional) Enter a name for the bundle into the Label field.
Click Confirm to initiate the bundle download.

Enable Helm with MKE¶

To use Helm with MKE, you must define the necessary roles in the kube-system default service account.

Note

For comprehensive information on the use of Helm, refer to the Helm user documentation.

To enable Helm with MKE, enter the following kubectl commands in sequence:

kubectl create rolebinding default-view --clusterrole=view \
  --serviceaccount=kube-system:default --namespace=kube-system

kubectl create clusterrolebinding add-on-cluster-admin \
  --clusterrole=cluster-admin --serviceaccount=kube-system:default

Integrate SCIM¶

System for Cross-domain Identity Management (SCIM) is a standard for automating the exchange of user identity information between identity domains or IT systems. It offers an LDAP alternative for provisioning and managing users and groups in MKE, as well as for syncing users and groups with an upstream identity provider. Using the SCIM schema and API, you can use single sign-on (SSO) services across various tools.

Mirantis certifies the use of Okta 3.2.0; however, MKE offers the discovery endpoints necessary to provide any system or application with the product SCIM configuration.

Configure SCIM for MKE¶

The Mirantis SCIM implementation uses SCIM version 2.0.

MKE SCIM integration typically involves the following steps:

Enable SCIM.
Configure SCIM for authentication and access.
Specify user attributes.

Enable SCIM¶

Log in to the MKE web UI.
Click Admin Settings > Authentication & Authorization.
In the Identity Provider Integration section in the details pane, move the slider next to SCIM to enable the SCIM settings.

Configure SCIM authentication and access¶

In the SCIM configuration subsection, either enter the API token in the API Token field or click Generate to have MKE generate a UUID.

The base URL for all SCIM API calls is https://<Host IP>/enzi/v0/scim/v2/. All SCIM methods are accessible as API endpoints under this base URL.

Bearer Auth is the API authentication method. When configured, you access SCIM API endpoints through the Bearer <token> HTTP Authorization request header.

Note

SCIM API endpoints are not accessible by any other user (or their token), including the MKE administrator and MKE admin Bearer token.
The only SCIM authentication method MKE supports is an HTTP Authorization request header that contains a Bearer token.
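
The base URL and the Bearer header can be combined as in the following sketch, which builds a request object without sending it. The scim_request helper, the host name mke.example.com, and the token value are placeholders chosen for illustration.

```python
import urllib.request


def scim_request(host, path, token, method="GET"):
    """Build (but do not send) a request for a SCIM endpoint under the base URL."""
    url = f"https://{host}/enzi/v0/scim/v2/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/scim+json",
        },
    )


# Placeholder host and token; substitute the values for your cluster.
req = scim_request("mke.example.com", "/Users", "h480djs93hd8")
print(req.get_method(), req.full_url)
```

To send the request you would pass it to urllib.request.urlopen, supplying an SSL context that trusts the MKE CA certificate.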

Specify user attributes¶

The following table maps the user attribute fields in use by Mirantis to SCIM and SAML attributes.

MKE	SAML	SCIM
Account name	`nameID` in response	`userName`
Account full name	Attribute value in `fullname` assertion	User’s `name.formatted`
Team group link name	Attribute value in `member-of` assertion	Group’s `displayName`
Team name	N/A	When creating a team, use the group’s `displayName` + `_SCIM`

Supported SCIM API endpoints¶

MKE supports SCIM API endpoints across three operational areas: User, Group, and Service Provider Configuration.

User operations¶

The SCIM API endpoints that serve in user operations provide the means to:

Retrieve user information
Create a new user
Update user information

For user GET and POST operations:

Filtering is only supported using the userName attribute and eq operator. For example, filter=userName Eq "john".
Attribute name and attribute operator are case insensitive. For example, the following two expressions have the same logical value:
- filter=userName Eq "john"
- filter=Username eq "john"
Pagination is fully supported.
Sorting is not supported.
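
The filtering rules above can be modeled with a small parser. This is a sketch of the stated constraints (only userName with the eq operator, both matched without regard to case), not the parser MKE uses. The parse_user_filter helper is hypothetical.

```python
import re

# Only `userName eq "<value>"` is supported; name and operator ignore case.
_FILTER = re.compile(r'^\s*userName\s+eq\s+"([^"]*)"\s*$', re.IGNORECASE)


def parse_user_filter(expression):
    """Return the userName value of a supported filter, or None if unsupported."""
    match = _FILTER.match(expression)
    return match.group(1) if match else None


print(parse_user_filter('userName Eq "john"'))  # john
print(parse_user_filter('Username eq "john"'))  # john
print(parse_user_filter('email eq "j@x.org"'))  # None
```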

GET /Users¶

Returns a list of SCIM users (by default, 200 users per page).

Use the startIndex and count query parameters to paginate long lists of users. For example, to retrieve the first 20 users, set startIndex to 1 and count to 20, as in the following request:

GET {Host IP}/enzi/v0/scim/v2/Users?startIndex=1&count=20
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8

The response to the previous query returns paging metadata that is similar to the following example:

{
  "totalResults":100,
  "itemsPerPage":20,
  "startIndex":1,
  "schemas":["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
  "Resources":[{
     ...
  }]
}
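
Given the totalResults value from the first response, a client can work out the remaining startIndex and count pairs it needs to request. The following sketch shows that arithmetic. The page_requests helper is hypothetical, and the numbers match the example response above.

```python
def page_requests(total_results, count):
    """Yield (startIndex, count) pairs that cover total_results items."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # SCIM startIndex is 1-based.
    for start_index in range(1, total_results + 1, count):
        yield start_index, min(count, total_results - start_index + 1)


pages = list(page_requests(100, 20))
print(pages)  # [(1, 20), (21, 20), (41, 20), (61, 20), (81, 20)]
```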

GET /Users/{id}¶

Retrieves a single user resource.

The value of the {id} should be the user’s ID. You can also use the userName attribute to filter the results.

GET {Host IP}/enzi/v0/scim/v2/Users/{user ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8

POST /Users¶

Creates a user.

The operation must include the userName attribute and at least one email address.

POST {Host IP}/enzi/v0/scim/v2/Users
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
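
A request body that satisfies the two requirements above might be assembled as follows. The layout follows the SCIM 2.0 core User schema. Which optional attributes MKE accepts is not stated here, so the sketch sticks to userName and emails. The new_user_body helper and the sample user are hypothetical.

```python
import json


def new_user_body(user_name, emails):
    """Build a SCIM 2.0 User resource with the fields POST /Users requires."""
    if not user_name:
        raise ValueError("userName is required")
    if not emails:
        raise ValueError("at least one email address is required")
    return {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": user_name,
        # Mark the first address as primary; the rest are secondary.
        "emails": [
            {"value": address, "primary": index == 0}
            for index, address in enumerate(emails)
        ],
    }


body = new_user_body("john", ["john@example.com"])
print(json.dumps(body, indent=2))
```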

PATCH /Users/{id}¶

Updates a user’s active status.

Reactivate inactive users by specifying "active": true. To deactivate active users, specify "active": false. The value of the {id} should be the user’s ID.

PATCH {Host IP}/enzi/v0/scim/v2/Users/{user ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
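
The body of such a PATCH request is a SCIM PatchOp message. The following sketch builds one in the shape defined by the SCIM 2.0 protocol. Whether MKE expects exactly this operation layout is an assumption, and the active_patch_body helper is hypothetical.

```python
import json


def active_patch_body(active):
    """Build a SCIM 2.0 PatchOp message that sets a user's active flag."""
    return {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "replace", "value": {"active": bool(active)}}],
    }


deactivate = active_patch_body(False)
print(json.dumps(deactivate))
```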

PUT /Users/{id}¶

Updates existing user information.

All attribute values are overwritten, including attributes for which empty values or no values have been provided. If a previously set attribute value is left blank during a PUT operation, the value is updated with a blank value in accordance with the attribute data type and storage provider. The value of the {id} should be the user’s ID.

Group operations¶

The SCIM API endpoints that serve in group operations provide the means to:

Create a new user group
Retrieve group information
Update user group membership (add/replace/remove users)

For group GET and POST operations:

Pagination is fully supported.
Sorting is not supported.

GET /Groups/{id}¶

Retrieves information for a single group.

GET {Host IP}/enzi/v0/scim/v2/Groups/{Group ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8

GET /Groups¶

Returns a paginated list of groups (by default, ten groups per page).

Use the startIndex and count query parameters to paginate long lists of groups.

GET {Host IP}/enzi/v0/scim/v2/Groups?startIndex=4&count=500
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8

POST /Groups¶

Creates a new group.

Add users to the group during group creation by supplying user ID values in the members array.

PATCH /Groups/{id}¶

Updates an existing group resource, allowing the addition or removal of individual (or groups of) users from the group with a single operation. Add is the default operation.

To remove members from a group, set the operation attribute of a member object to delete.

PUT /Groups/{id}¶

Updates an existing group resource, overwriting all values for a group even if an attribute is empty or is not provided.

PUT replaces all members of a group with members that are provided by way of the members attribute. If a previously set attribute is left blank during a PUT operation, the new value is set to blank in accordance with the data type of the attribute and the storage provider.

Service Provider configuration operations¶

The SCIM API endpoints that serve in Service provider configuration operations provide the means to:

Retrieve service provider resource type metadata
Retrieve schema for service provider and SCIM resources
Retrieve schema for service provider configuration

SCIM defines three endpoints to facilitate discovery of the SCIM service provider features and schema that you can retrieve using HTTP GET:

GET /ResourceTypes¶

Discovers the resource types available on a SCIM service provider (for example, Users and Groups).

Each resource type defines the endpoints, the core schema URI that defines the resource, and any supported schema extensions.

GET /Schemas¶

Retrieves information about all resource schemas supported by a SCIM service provider.

GET /ServiceProviderConfig¶

Returns a JSON structure that describes the SCIM specification features that are available on a service provider using a schemas attribute of urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig.

Integrate with an LDAP directory¶

MKE integrates with LDAP directory services, thus allowing you to manage users and groups from your organization directory and to automatically propagate the information to MKE and MSR.

Once you enable LDAP, MKE uses a remote directory server to create users automatically, and all logins are forwarded thereafter to the directory server.

When you switch from built-in authentication to LDAP authentication, all manually created users whose usernames fail to match any LDAP search results remain available.

When you enable LDAP authentication, you can configure MKE to create user accounts only when users log in for the first time.

Note

If SAML integration is enabled, refer to Use LDAP in conjunction with SAML for information on using SAML in parallel with LDAP.

MKE integration with LDAP¶

To control the integration of MKE with LDAP, you create user searches. For these user searches, you use the MKE web UI to specify multiple search configurations and specify multiple LDAP servers with which to integrate. Searches start with the Base DN, the Distinguished Name of the node in the LDAP directory tree in which the search looks for users.

MKE to LDAP synchronization workflow

The following occurs when MKE synchronizes with LDAP:

MKE creates a set of search results by iterating over each of the user search configurations, in an order that you specify.
MKE chooses an LDAP server from the list of domain servers by considering the Base DN from the user search configuration and selecting the domain server with the longest domain suffix match.

Note

If no domain server has a domain suffix that matches the Base DN from the search configuration, MKE uses the default domain server.
MKE creates a list of users from the search and creates MKE accounts for each one.

Note

If you select the Just-In-Time User Provisioning option, user accounts are created only when users first log in.

Example workflow:

Consider an example with three LDAP domain servers and three user search configurations.

The example LDAP domain servers:

LDAP domain server name	URL
`default`	ldaps://ldap.example.com
`dc=subsidiary1,dc=com`	ldaps://ldap.subsidiary1.com
`dc=subsidiary2,dc=subsidiary1,dc=com`	ldaps://ldap.subsidiary2.com

The example user search configurations:

User search configurations	Description
`baseDN=ou=people,dc=subsidiary1,dc=com`	For this search configuration, `dc=subsidiary1,dc=com` is the only server with a domain that is a suffix, so MKE uses the server `ldaps://ldap.subsidiary1.com` for the search request.
`baseDN=ou=product,dc=subsidiary2,dc=subsidiary1,dc=com`	For this search configuration, two of the domain servers have a domain that is a suffix of this `Base DN`. As `dc=subsidiary2,dc=subsidiary1,dc=com` is the longer of the two, however, MKE uses the server `ldaps://ldap.subsidiary2.com` for the search request.
`baseDN=ou=eng,dc=example,dc=com`	For this search configuration, no server has a domain that is a suffix of this `Base DN`, so MKE uses the default server, `ldaps://ldap.example.com`, for the search request.
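
The server selection described above can be sketched in a few lines of Python. This is an illustration only, not MKE code: the function names are hypothetical, and the DN parsing is deliberately naive (it assumes no escaped commas in the DN).

```python
def split_dn(dn: str) -> list:
    """Split a DN into lowercase components (naive: ignores escaped commas)."""
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def select_server(base_dn: str, domain_servers: dict, default_url: str) -> str:
    """Return the URL of the server whose domain is the longest suffix of base_dn."""
    base = split_dn(base_dn)
    best_url, best_len = default_url, 0
    for domain, url in domain_servers.items():
        parts = split_dn(domain)
        # A domain matches when its components equal the tail of the Base DN.
        if best_len < len(parts) <= len(base) and base[-len(parts):] == parts:
            best_url, best_len = url, len(parts)
    return best_url


# The domain servers from the example table; the default server is passed separately.
servers = {
    "dc=subsidiary1,dc=com": "ldaps://ldap.subsidiary1.com",
    "dc=subsidiary2,dc=subsidiary1,dc=com": "ldaps://ldap.subsidiary2.com",
}
default = "ldaps://ldap.example.com"

for base_dn in (
    "ou=people,dc=subsidiary1,dc=com",
    "ou=product,dc=subsidiary2,dc=subsidiary1,dc=com",
    "ou=eng,dc=example,dc=com",
):
    print(base_dn, "->", select_server(base_dn, servers, default))
```

Comparing whole DN components, rather than raw strings, avoids treating `dc=com` as a suffix of a DN that merely ends in the same characters.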

Whenever user search results contain username collisions between the domains, MKE uses only the first search result, and thus the ordering of the user search configurations can be important. For example, if both the first and third user search configurations result in a record with the username jane.doe, the record from the first configuration takes precedence and the record from the third is ignored. As such, it is important to implement a username attribute that is unique for your users across all domains. As a best practice, choose something that is specific to the subsidiary, such as the email address for each user.
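
The first-result-wins rule can be illustrated with a short Python sketch. The record layout and the `source` field are assumptions made for the example and do not reflect how MKE stores users.

```python
def merge_search_results(results_in_order: list) -> dict:
    """Keep the first record seen for each username; later duplicates are ignored."""
    merged = {}
    for config_name, records in results_in_order:
        for record in records:
            username = record["username"]
            if username not in merged:
                merged[username] = {**record, "source": config_name}
    return merged


# Hypothetical results from three user search configurations, in configured order.
results = [
    ("search-1", [{"username": "jane.doe", "dn": "uid=jane.doe,ou=people,dc=subsidiary1,dc=com"}]),
    ("search-2", [{"username": "john.roe", "dn": "uid=john.roe,ou=product,dc=subsidiary2,dc=subsidiary1,dc=com"}]),
    ("search-3", [{"username": "jane.doe", "dn": "uid=jane.doe,ou=eng,dc=example,dc=com"}]),
]

users = merge_search_results(results)
print(users["jane.doe"]["source"])
```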

Configure the LDAP integration¶

Note

MKE saves a minimum amount of user data required to operate, including any user name and full name attributes that you specify in the configuration, as well as the Distinguished Name (DN) of each synced user. MKE does not store any other data from the directory server.

Use the MKE web UI to configure MKE to create and authenticate users using an LDAP directory.

Access the LDAP controls¶

To configure LDAP integration you must first gain access to the controls for the service protocol.

Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display the available options.
Navigate to Admin Settings > Authentication & Authorization.
In the Identity Provider section in the details pane, move the slider next to LDAP to enable the LDAP settings.

Set up an LDAP server¶

To configure an LDAP server, perform the following steps:

To set up a new LDAP server, configure the settings in the LDAP Server subsection:

Control	Description
LDAP Server URL	The URL for the LDAP server.
Reader DN	The DN of the LDAP account that is used to search entries in the LDAP server. As a best practice, this should be an LDAP read-only user.
Reader Password	The password of the account used to search entries in the LDAP server.
Skip TLS verification	Sets whether to skip verification of the LDAP server certificate when TLS is in use. When verification is skipped, the connection is still encrypted but is vulnerable to man-in-the-middle attacks.
Use Start TLS	Defines whether to authenticate or encrypt the connection after the connection is made to the LDAP server over TCP. MKE ignores this setting when the LDAP Server URL begins with `ldaps://`.
No Simple Pagination (RFC 2696)	Indicates that your LDAP server does not support pagination.
Just-In-Time User Provisioning	Sets whether to create user accounts only when users log in for the first time. Mirantis recommends using the default `true` value.

Note

Available as of MKE 3.6.4. The disableReferralChasing setting, which is currently only available by way of the MKE API, allows you to disable the default behavior that occurs when a referral URL is received as a result of an LDAP search request. Refer to LDAP Configuration through API for more information.

Click Save to add your LDAP server.

Add additional LDAP domains¶

To integrate MKE with additional LDAP domains:

In the LDAP Additional Domains subsection, click Add LDAP Domain +. A set of input tools for configuring the additional domain displays.

Configure the settings for the new LDAP domain:

Control	Description
LDAP Domain	Text field in which to enter the root domain component of this server. A longest-suffix match of the `Base DN` for LDAP searches is used to select which LDAP server to use for search requests. If no matching domain is found, the default LDAP server configuration is put to use.
LDAP Server URL	Text field in which to enter the URL for the LDAP server.
Reader DN	Text field in which to enter the DN of the LDAP account that is used to search entries in the LDAP server. As a best practice, this should be an LDAP read-only user.
Reader Password	The password of the account used to search entries in the LDAP server.
Skip TLS verification	Sets whether to skip verification of the LDAP server certificate when TLS is in use. When verification is skipped, the connection is still encrypted but is vulnerable to man-in-the-middle attacks.
Use Start TLS	Sets whether to authenticate or encrypt the connection after the connection is made to the LDAP server over TCP. MKE ignores this setting when the LDAP Server URL begins with `ldaps://`.
No Simple Pagination (RFC 2696)	Select if your LDAP server does not support pagination.

Click Confirm to add the new LDAP domain.
Repeat the procedure to add any additional LDAP domains.

Add LDAP user search configurations¶

To add LDAP user search configurations to your LDAP integration:

In the LDAP User Search Configurations subsection, click Add LDAP User Search Configuration +. A set of input tools for configuring the LDAP user search configurations displays.

Field	Description
Base DN	Text field in which to enter the DN of the node in the directory tree, where the search should begin seeking out users.
Username Attribute	Text field in which to enter the LDAP attribute that serves as username on MKE. Only user entries with a valid username will be created. A valid username must not be longer than 100 characters and must not contain any unprintable characters, whitespace characters, or any of the following characters: `/` `\` `[` `]` `:` `;` `|` `=` `,` `+` `*` `?` `<` `>` `'` `"`.
Full Name Attribute	Text field in which to enter the LDAP attribute that serves as the user’s full name, for display purposes. If the field is left empty, MKE does not create new users with a full name value.
Filter	Text field in which to enter an LDAP search filter to use to find users. If the field is left empty, all directory entries in the search scope with valid username attributes are created as users.
Search subtree instead of just one level	Sets whether to search through the full LDAP tree starting at the `Base DN`, rather than on a single level of the LDAP tree only.
Match Group Members	Sets whether to filter users further, by selecting those who are also members of a specific group on the directory server. The feature is helpful when the LDAP server does not support `memberOf` search filters.
Iterate through group members	Sets whether, when the Match Group Members option is enabled to sync users, the sync is done by iterating over the target group’s membership and making a separate LDAP query for each member, rather than through the use of a broad user search filter. This option can increase efficiency in situations where the number of members of the target group is significantly smaller than the number of users that would match the above search filter, or if your directory server does not support simple pagination of search results.
Group DN	Text field in which to enter the DN of the LDAP group from which to select users, when the Match Group Members option is enabled.
Group Member Attribute	Text field in which to enter the name of the LDAP group entry attribute that corresponds to the DN of each of the group members.

Click Confirm to add the new LDAP user search configurations.
Repeat the procedure to add any additional user search configurations. More than one such configuration can be useful in cases where users may be found in multiple distinct subtrees of your organization directory. Any user entry that matches at least one of the search configurations will be synced as a user.
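
Before syncing, it can be useful to check directory values against the username rules given for the Username Attribute field. The following Python sketch encodes those rules as stated. The function name is hypothetical, and the rejection of empty values is an assumption, as the rules above do not mention that case.

```python
# Characters that the Username Attribute description lists as not permitted.
FORBIDDEN_CHARS = set('/\\[]:;|=,+*?<>\'"')
MAX_USERNAME_LENGTH = 100


def is_valid_username(name: str) -> bool:
    """Apply the documented rules: length, printability, whitespace, forbidden characters."""
    if not name or len(name) > MAX_USERNAME_LENGTH:  # empty check is an assumption
        return False
    for ch in name:
        if ch.isspace() or not ch.isprintable() or ch in FORBIDDEN_CHARS:
            return False
    return True


for candidate in ("jane.doe", "jane doe", "jane,doe", "jane.doe@example.com"):
    print(repr(candidate), is_valid_username(candidate))
```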

Test LDAP login¶

Prior to saving your configuration changes, you can use the dedicated LDAP Test login tool to test the integration using the login credentials of an LDAP user.

Input the credentials for the test user into the provided Username and Password fields:

Field	Description
Username	An LDAP user name for testing authentication to MKE. The value corresponds to the Username Attribute that is specified in the Add LDAP user search configurations section.
Password	The password used to authenticate (BIND) to the directory server.

Click Test. A search is made against the directory using the provided search Base DN, scope, and filter. Once the user entry is found in the directory, a BIND request is made using the input user DN and the given password value.

Set LDAP synchronization¶

Following LDAP integration, MKE synchronizes users at the top of the hour, based on an interval that is defined in hours.

To set LDAP synchronization, configure the following settings in the LDAP Sync Configuration section:

Field	Description
Sync interval	The interval, in hours, to synchronize users between MKE and the LDAP server. When the synchronization job runs, new users found in the LDAP server are created in MKE with the default permission level. MKE users that do not exist in the LDAP server become inactive.
Enable sync of admin users	This option specifies that system admins should be synced directly with members of a group in your organization’s LDAP directory. The admins will be synced to match the membership of the group. The configured recovery admin user will also remain a system admin.

Manually synchronize LDAP¶

In addition to configuring MKE LDAP synchronization, you can also perform a hot synchronization by clicking the Sync Now button in the LDAP Sync Jobs subsection. Here you can also view the logs for each sync job by clicking the View Logs link associated with a particular job.

Revoke user access¶

Whenever a user is removed from LDAP, the effect on their MKE account is determined by the Just-In-Time User Provisioning setting:

false: Users deleted from LDAP become inactive in MKE after the next LDAP synchronization runs.
true: A user deleted from LDAP cannot authenticate. Their MKE accounts remain active, however, and thus they can use their client bundles to run commands. To prevent this, deactivate the user’s MKE user account.

Synchronize teams with LDAP¶

MKE enables the syncing of teams within Organizations with LDAP, using either a search query or by matching a group that is established in your LDAP directory.

Log in to the MKE web UI as an administrator.
Navigate to Access Control > Orgs & Teams to display the Organizations that exist within your MKE instance.
Locate the name of the Organization that contains the MKE team that you want to sync to LDAP and click it to display all of the MKE teams for that Organization.
Hover your cursor over the MKE team that you want to sync with LDAP to reveal its vertical ellipsis, at the far right.
Click the vertical ellipsis and select Edit to call the Details screen for the team.
Toggle ENABLE SYNC TEAM MEMBERS to Yes to reveal the LDAP sync controls.
Toggle LDAP MATCH METHOD to set the LDAP match method you want to use to make the sync, Match Search Results (default) or Match Group Members.
- For Match Search Results:
  1. Enter a Base DN into the Search Base DN field, as it is established in LDAP.
  2. Enter a search filter based on one or more attributes into the Search filter field.
  3. Optional. Check Search subtree instead of just one level to enable search down through any sub-groups that exist within the group you entered into the Search Base DN field.
- For Match Group Members:
  1. Enter the group Distinguished Name (DN) into the Group DN field.
  2. Enter a member attribute into the Group Member field.
Toggle IMMEDIATELY SYNC TEAM MEMBERS as appropriate.
Toggle ALLOW NON-LDAP MEMBERS as appropriate.
Click Save.

LDAP Configuration through API¶

LDAP-specific GET and PUT API endpoints are available in the configuration resource. Swarm mode must be enabled to use the following endpoints:

GET /api/ucp/config/auth/ldap - Returns information on your current system LDAP configuration.
PUT /api/ucp/config/auth/ldap - Updates your LDAP configuration.

Configure an OpenID Connect identity provider¶

OpenID Connect (OIDC) allows you to authenticate MKE users with a trusted external identity provider.

Note

Kubernetes users who want client bundles to use OIDC must Download and configure the client bundle and replace the authorization section therein with the parameters presented in the Kubernetes OIDC Authenticator documentation.

For identity providers that require a client redirect URI, use https://<MKE_HOST>/login. For identity providers that do not permit the use of an IP address for the host, use https://<mke-cluster-domain>/login.

The requested scopes for all identity providers are "openid email". Claims are read solely from the ID token that your identity provider returns. MKE does not use the UserInfo URL to obtain user information. The default username claim is sub. To use a different username claim, you must specify that value with the usernameClaim setting in the MKE configuration file.

The following example details the MKE configuration file settings for using an external identity provider.

For the signInCriteria array, term is set to hosted domain ("hd") and value is set to the domain from which the user is permitted to sign in.
For the adminRoleCriteria array, matchType is set to "contains", in case any administrators are assigned to multiple roles that include admin.

[auth.external_identity_provider]
  wellKnownConfigUrl = "https://example.com/.well-known/openid-configuration"
  clientId = "4dcdace6-4eb4-461d-892f-01aed344ac80"
  clientSecret = "ed89aeddcdb4461ace640"
  usernameClaim = "email"
  caBundle = "----BEGIN CERTIFICATE----\nMIIF...UfTd\n----END CERTIFICATE----\n"

  [[auth.external_identity_provider.signInCriteria]]
    term = "hd"
    value = "myorg.com"
    matchType = "must"

  [[auth.external_identity_provider.adminRoleCriteria]]
    term = "roles"
    value = "admin"
    matchType = "contains"

Note

Using an external identity provider to sign in to the MKE web UI creates a new user session, and thus users who sign in this way will not be signed out when their ID token expires. Instead, the session lifetime is set using the auth.sessions parameters in the MKE configuration file.

Refer to the MKE configuration file auth.external_identity_provider (optional) for the complete reference documentation.

Use LDAP in conjunction with SAML¶

In MKE, you can configure LDAP to work together with SAML, though you may need to overcome certain issues to do so.

To enable LDAP and SAML to be used in tandem:

Enable and integrate SAML authentication.
Log in to the MKE web UI.
In the left-side navigation panel, navigate to <user name> > Admin Settings > Authentication & Authorization.
Scroll down to the Identity Provider Integration section and verify that SAML is toggled to Enabled.
Select the Also allow LDAP users checkbox.
Integrate with an LDAP directory.

To sync teams with both LDAP and SAML users:

Log in to the MKE web UI.
Verify that LDAP and SAML teams are both enabled for syncing.
In the left-side navigation panel, navigate to Access Control > Orgs & Teams.
Select the required organization and then select the required team.
Click the gear icon in the upper right corner.
On the Details tab, select ENABLE SYNC TEAM MEMBERS.
Select ALLOW NON-LDAP MEMBERS.

To determine a user’s authentication protocol:

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to Access Control > Users and select the target user.

If an LDAP DN attribute is present next to Full Name and Admin, the user is managed by LDAP. If, however, the LDAP DN attribute is not present, the user is not managed by LDAP.

Overlapping user names¶

Unexpected behavior can result from having the same user name in both SAML and LDAP.

If just-in-time (JIT) provisioning is enabled in LDAP, MKE only allows login attempts from the identity provider that first attempts to log in. MKE then blocks all login attempts from the second identity provider.

If JIT provisioning is disabled in LDAP, the LDAP synchronization, which occurs at regular intervals, always overrides the ability of the SAML user account to log in.

To allow overlapping user names:

There may at times be a user with the same name in both LDAP and SAML whom you want to be able to sign in using either protocol.

Define a custom SAML attribute with a name of dn and a value that is equivalent to the user account distinguished name (DN) with the LDAP provider. Refer to Define a custom SAML attribute in the Okta documentation for more information.

Note

MKE considers such users to be LDAP users. As such, should their LDAP DN change, the custom SAML attribute must be updated to match.
Log in to the MKE web UI.
From the left-side navigation panel, navigate to <user name> > Admin Settings > Authentication & Authorization and scroll down to the LDAP section.
Under SAML integration, select Allow LDAP users to sign in using SAML.

Manage services node deployment¶

You can configure MKE to allow users to deploy and run services in worker nodes only, to ensure that all cluster management functionality remains performant and to enhance cluster security.

Important

If for whatever reason a user deploys a malicious service that can affect the node on which it is running, that service will not be able to strike any other nodes in the cluster or have any impact on cluster management functionality.

Restrict services deployment to Swarm worker nodes¶

To keep manager nodes performant, it is necessary at times to restrict service deployment to Swarm worker nodes.

To restrict services deployment to Swarm worker nodes:

Log in to the MKE web UI with administrator credentials.
Click the user name at the top of the navigation menu.
Navigate to Admin Settings > Orchestration.
Under Container Scheduling, toggle all of the sliders to the left to restrict the deployment only to worker nodes.

Note

Creating a grant with the Scheduler role against the / collection takes precedence over any other grants with Node Schedule on subcollections.

Restrict services deployment to Kubernetes worker nodes¶

By default, MKE clusters use Kubernetes taints and tolerations to prevent user workloads from deploying to MKE manager or MSR nodes.

Note

Workloads deployed by an administrator in the kube-system namespace do not follow scheduling constraints. If an administrator deploys a workload in the kube-system namespace, a toleration is applied to bypass the taint, and the workload is scheduled on all node types.

To view the taints, run the following command:

$ kubectl get nodes <mkemanager> -o json | jq -r '.spec.taints | .[]'

Example of system response:

{
  "effect": "NoSchedule",
  "key": "com.docker.ucp.manager"
}

Allow services deployment on Kubernetes MKE manager or MSR nodes¶

You can circumvent the protections put in place by Kubernetes taints and tolerations. For details, refer to Restrict services deployment to Kubernetes worker nodes.

Schedule services deployment on manager and MSR nodes¶

Log in to the MKE web UI with administrator credentials.
Click the user name at the top of the navigation menu.
Navigate to Admin Settings > Orchestration.
Select from the following options:
- Under Container Scheduling, toggle to the right the slider for Allow administrators to deploy containers on MKE managers or nodes running MSR.
- Under Container Scheduling, toggle to the right the slider for Allow all authenticated users, including service accounts, to schedule on all nodes, including MKE managers and MSR nodes.

Following any scheduling action, MKE applies a toleration to new workloads, to allow the Pods to be scheduled on all node types. For existing workloads, however, it is necessary to manually add the toleration to the Pod specification.

Add a toleration to the Pod specification for existing workloads¶

Add the following toleration to the Pod specification, either through the MKE web UI or using the kubectl edit <object> <workload> command:
```
tolerations:
- key: "com.docker.ucp.manager"
  operator: "Exists"
```

Run the following command to confirm the successful application of the toleration:

kubectl get <object> <workload> -o json | jq -r '.spec.template.spec.tolerations | .[]'

Example of system response:

{
  "key": "com.docker.ucp.manager",
  "operator": "Exists"
}
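
To see why this toleration permits scheduling on manager nodes, the standard Kubernetes matching rules between a toleration and a taint can be written out as a small Python sketch. The function is a simplified illustration, not the scheduler implementation, and it uses the taint and toleration shown in the responses above as sample data.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified Kubernetes rule: does this toleration match this taint?"""
    # An empty toleration effect matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    # An empty toleration key (valid only with Exists) matches every taint key.
    key = toleration.get("key", "")
    if key and key != taint.get("key"):
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the Kubernetes default
    if operator == "Exists":
        return True
    if not key:
        return False
    return operator == "Equal" and toleration.get("value", "") == taint.get("value", "")


manager_taint = {"effect": "NoSchedule", "key": "com.docker.ucp.manager"}
toleration = {"key": "com.docker.ucp.manager", "operator": "Exists"}

print(tolerates(toleration, manager_taint))
```

Because the toleration names no effect, it matches the manager taint regardless of the effect that the taint carries.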

Caution

A NoSchedule taint is present on MKE manager and MSR nodes, and if you disable scheduling on managers and/or workers a toleration for that taint will not be applied to the deployments. As such, you should not schedule on these nodes, except when the Kubernetes workload is deployed in the kube-system namespace.

Run only the images you trust¶

With MKE you can force applications to use only Docker images that are signed by MKE users you trust. Every time a user attempts to deploy an application to the cluster, MKE verifies that the application is using a trusted Docker image. If a trusted Docker image is not in use, MKE halts the deployment.

By signing and verifying the Docker images, you ensure that the images in use in your cluster are trusted and have not been altered, either in the image registry or on their way from the image registry to your MKE cluster.

Example workflow

A developer makes changes to a service and pushes their changes to a version control system.
A CI system creates a build, runs tests, and pushes an image to the Mirantis Secure Registry (MSR) with the new changes.
The quality engineering team pulls the image, runs more tests, and signs and pushes the image if the image is verified.
IT operations deploys the service, but only if the image in use is signed by the QA team. Otherwise, MKE will not deploy.

To configure MKE to only allow running services that use Docker trusted images:

Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display the available options.
Click Admin Settings > Docker Content Trust to reveal the Content Trust Settings page.
Enable Run only signed images.

Important

At this point, MKE allows the deployment of any signed image, regardless of signer.
(Optional) Make it necessary for the image to be signed by a particular team or group of teams:
1. Click Add Team+ to reveal the two-part tool.
2. From the drop-down at the left, select an organization.
3. From the drop-down at the right, select a team belonging to the organization you selected.
4. Repeat the procedure to configure additional teams.
  
  Note
  
  If you specify multiple teams, the image must be signed by a member of each team, or someone who is a member of all of the teams.
Click Save.

MKE immediately begins enforcing the image trust policy. Existing services continue to run and you can restart them as necessary. From this point, however, MKE only allows the deployment of new services that use a trusted image.
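
The multiple-team rule can be expressed compactly. The following Python sketch is a conceptual model only: the function name and data shapes are hypothetical, and it says nothing about how MKE actually verifies signatures.

```python
def image_is_trusted(signers: set, required_teams: dict) -> bool:
    """True when the image is signed and every required team includes at least one signer."""
    if not signers:
        return False  # unsigned images are never deployed when the policy is enabled
    # With no required teams, any signed image passes.
    return all(signers & members for members in required_teams.values())


# Hypothetical team membership, keyed by team name.
teams = {"qa": {"alice", "bob"}, "security": {"bob", "carol"}}

print(image_is_trusted({"alice"}, teams))           # signed by QA only
print(image_is_trusted({"alice", "carol"}, teams))  # one signer from each team
print(image_is_trusted({"bob"}, teams))             # one signer who is in both teams
```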

Set user session properties¶

MKE enables the setting of various user session properties, such as session timeout and the permitted number of concurrent sessions.

To configure MKE login session properties:

Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display the available options.
Click Admin Settings > Authentication & Authorization to reveal the MKE login session controls.

The following table offers information on the MKE login session controls:

Field	Description
Lifetime Minutes	The duration of a login session, in minutes, starting from the moment MKE generates the session. MKE invalidates the active session once this period expires, and the user must re-authenticate to establish a new session. Default: 60. Minimum: 10.
Renewal Threshold Minutes	The period, in minutes, before session expiration during which use of the session causes MKE to extend it. MKE extends the session by the amount specified in Lifetime Minutes. The threshold value cannot be greater than that set in Lifetime Minutes. To specify that sessions not be extended, set the threshold value to 0. Be aware, though, that this may cause MKE web UI users to be unexpectedly logged out. Default: 20. Maximum: 5 minutes less than Lifetime Minutes.
Per User Limit	The maximum number of sessions a user can have running simultaneously. If the creation of a new session exceeds this limit, MKE deletes the least recently used session. Specifically, every time you use a session token, the server marks it with the current time (lastUsed metadata). When the creation of a new session exceeds the per-user limit, the session with the oldest lastUsed time is deleted, which is not necessarily the oldest session. To disable the Per User Limit setting, set the value to 0. Default: 10. Minimum: 1. Maximum: no limit.
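
The Per User Limit eviction rule is easiest to follow with a small model. In the Python sketch below, a dictionary maps each session ID to its lastUsed time; the function name and the data layout are assumptions made for illustration.

```python
def add_session(sessions: dict, new_id: str, now: float, per_user_limit: int) -> list:
    """Add a session and return the IDs evicted to stay within the limit (0 disables it)."""
    sessions[new_id] = now  # session ID -> lastUsed timestamp
    evicted = []
    while per_user_limit > 0 and len(sessions) > per_user_limit:
        least_recent = min(sessions, key=sessions.get)
        del sessions[least_recent]
        evicted.append(least_recent)
    return evicted


# Session "a" was created before "b" but has been used more recently.
sessions = {"a": 300.0, "b": 200.0}
print(add_session(sessions, "c", now=400.0, per_user_limit=2))
```

In this example the older session `a` survives, because eviction follows the lastUsed time rather than the creation time.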

Configure an MKE cluster¶

Important

The MKE configuration file documentation is up-to-date for the latest MKE release. As such, if you are running an earlier version of MKE, you may encounter detail for configuration options and parameters that are not applicable to the version of MKE you are currently running.

Refer to the MKE Release Notes for specific version-by-version information on MKE configuration file additions and changes.

The configuring of an MKE cluster takes place through the application of a TOML file. You use this file, the MKE configuration file, to import and export MKE configurations, to both create new MKE instances and to modify existing ones.

Refer to example-config in the MKE CLI reference documentation to learn how to download an example MKE configuration file.

Use an MKE configuration file¶

Put the MKE configuration file to work for the following use cases:

Set the configuration file to run at the install time of new MKE clusters
Use the API to import the file back into the same cluster
Use the API to import the file into multiple clusters

To make use of an MKE configuration file, you edit the file using either the MKE web UI or the command line interface (CLI). Using the CLI, you can either export the existing configuration file for editing, or use the example-config command to view and edit an example TOML MKE configuration file.

docker container run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.6.16 \
  example-config

Modify an existing MKE configuration¶

Working as an MKE admin, use the config-toml API from within the directory of your client certificate bundle to export the current MKE settings to a TOML file.

As detailed herein, the command set exports the current configuration for the MKE hostname MKE_HOST to a file named mke-config.toml:

Define the following environment variables:

export MKE_USERNAME=<mke-username>
export MKE_PASSWORD=<mke-password>
export MKE_HOST=<mke-fqdn-or-ip-address>

Obtain and define an AUTHTOKEN environment variable:

AUTHTOKEN=$(curl --silent --insecure --data '{"username":"'$MKE_USERNAME'","password":"'$MKE_PASSWORD'"}' https://$MKE_HOST/auth/login | jq --raw-output .auth_token)

Download the current MKE configuration file.

curl --silent --insecure -X GET "https://$MKE_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > mke-config.toml

Edit the MKE configuration file, as needed. For comprehensive detail, refer to Configuration options.

Upload the newly edited MKE configuration file:

Note

You may need to reacquire the AUTHTOKEN, if significant time has passed since you first acquired it.

curl --silent --insecure -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file 'mke-config.toml' https://$MKE_HOST/api/ucp/config-toml
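
For automation, the same export and upload requests can be built in Python with the standard library. The sketch below mirrors the curl commands above and only constructs the requests, without sending them. The function names are hypothetical.

```python
import json
import urllib.request


def login_request(host: str, username: str, password: str) -> urllib.request.Request:
    """Build the POST to /auth/login; json.dumps escapes quotes in the credentials."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(f"https://{host}/auth/login", data=body, method="POST")


def extract_token(response_body: bytes) -> str:
    """Read the auth_token field, as jq does in the shell version."""
    return json.loads(response_body)["auth_token"]


def config_request(host: str, token: str, toml_bytes: bytes = None) -> urllib.request.Request:
    """Build a GET (download) or, when toml_bytes is given, a PUT (upload)."""
    return urllib.request.Request(
        f"https://{host}/api/ucp/config-toml",
        data=toml_bytes,
        headers={"accept": "application/toml", "Authorization": f"Bearer {token}"},
        method="GET" if toml_bytes is None else "PUT",
    )


download = config_request("mke.example.com", "example-token")
print(download.get_method(), download.full_url)
```

To send a request, pass it to `urllib.request.urlopen`. Note that the curl commands use `--insecure`; in Python, certificate handling is controlled through the `context` argument of `urlopen`, and supplying an SSL context that trusts the MKE CA is preferable to disabling verification.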

Apply an existing configuration at install time¶

To customize a new MKE instance using a configuration file, you must create the file prior to installation. Then, once the new configuration file is ready, you can configure MKE to import it during the installation process using Docker Swarm.

To import a configuration file at installation:

Create a Docker Swarm Config object named com.docker.mke.config, using the contents of your MKE configuration file as its TOML value.
When installing MKE on the cluster, specify the --existing-config flag to force the installer to use the new Docker Swarm Config object for its initial configuration.
Following the installation, delete the com.docker.mke.config object.

Configuration options¶

auth table¶

Parameter	Required	Description
`backend`		The name of the authorization backend to use, `managed` or `ldap`. Default: `managed`
`default_new_user_role`		The role assigned to new users for their private resource sets. Valid values: `admin`, `viewonly`, `scheduler`, `restrictedcontrol`, or `fullcontrol`. Default: `restrictedcontrol`

auth.sessions¶

Parameter	Required	Description
`lifetime_minutes`	no	The initial session lifetime, in minutes. Default: `60`
`renewal_threshold_minutes`	no	The length of time, in minutes, before session expiration during which use of the session extends it by the currently configured lifetime, counted from the time of use. A value of `0` disables session extension. Default: `20`
`per_user_limit`	no	The maximum number of sessions that a user can have simultaneously active. If creating a new session will put a user over this limit, the least recently used session is deleted. A value of `0` disables session limiting. Default: `10`
`store_token_per_session`	no	If set, the user token is stored in `sessionStorage` instead of `localStorage`. Setting this option logs the user out and requires that they log back in, as they are actively changing the manner in which their authentication is stored.
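
The interaction between `lifetime_minutes` and `renewal_threshold_minutes` can be modeled as follows. This Python sketch is one reading of the descriptions above, with times expressed in minutes from an arbitrary starting point; the function name is hypothetical.

```python
def renew_if_due(expires_at: float, now: float,
                 lifetime_minutes: float = 60,
                 renewal_threshold_minutes: float = 20) -> float:
    """Return the session expiry after a use at time `now` (all values in minutes)."""
    if now >= expires_at:
        raise ValueError("session has already expired")
    remaining = expires_at - now
    # Extend only when the use falls inside the renewal window; 0 disables renewal.
    if renewal_threshold_minutes > 0 and remaining <= renewal_threshold_minutes:
        return now + lifetime_minutes
    return expires_at


# A session created at minute 0 with the default settings expires at minute 60.
print(renew_if_due(expires_at=60, now=30))  # 30 minutes remain: no extension
print(renew_if_due(expires_at=60, now=45))  # 15 minutes remain: extended from time of use
```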

auth.external_identity_provider (optional)¶

Configures MKE with an external OpenID Connect (OIDC) identity provider.

Parameter	Required	Description
`wellKnownConfigUrl`	yes	Sets the OpenID discovery endpoint, ending in `.well-known/openid-configuration`, for your identity provider.
`clientID`	yes	Sets the client ID, which you obtain from your identity provider.
`clientSecret`	no (recommended)	Sets the client secret, which you obtain from your identity provider.
`usernameClaim`	no	Sets the unique JWT ID token claim that contains the user names from your identity provider. Default: `sub`
`caBundle`	no	Sets the PEM certificate bundle that MKE uses to authenticate the discovery, issuer, and JWKs endpoints.
`httpProxy`	no	Sets the HTTP proxy for your identity provider.
`httpsProxy`	no	Sets the HTTPS proxy for your identity provider.
`issuer`	no	Sets the ID token issuer. If left blank, the value is obtained automatically from the discovery endpoint.
`userServiceId`	no	Sets the MKE service ID with the JWK URI for the identity provider. If left blank, the service ID is generated automatically. Warning: Do not remove or replace an existing value.

auth.external_identity_provider.signInCriteria array (optional)¶

An array of claims that ID tokens require for use with MKE.

Parameter	Required	Description
`term`	yes	Sets the name of the claim.
`value`	yes	Sets the value for the claim in the form of a string.
`matchType`	yes	Sets how MKE evaluates the JWT claim. Valid values: `must` - the JWT claim value must be the same as the configuration value; `contains` - the JWT claim value must contain the configuration value.
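
The two matchType values can be illustrated with a short Python sketch. It rests on assumptions that the descriptions above do not confirm: that `contains` means substring matching for string claims and membership for list claims, and that every entry in the array must be satisfied. The function names are hypothetical.

```python
def claim_matches(claim_value, expected: str, match_type: str) -> bool:
    """Evaluate one criterion against one decoded ID token claim."""
    if match_type == "must":
        return claim_value == expected
    if match_type == "contains":
        # Assumption: substring test for strings, membership test for lists.
        return isinstance(claim_value, (str, list)) and expected in claim_value
    raise ValueError(f"unknown matchType: {match_type}")


def satisfies(claims: dict, criteria: list) -> bool:
    """Assumption: all criteria in the array must match."""
    return all(
        claim_matches(claims.get(c["term"]), c["value"], c["matchType"])
        for c in criteria
    )


sign_in_criteria = [{"term": "hd", "value": "myorg.com", "matchType": "must"}]
claims = {"sub": "1234", "hd": "myorg.com", "roles": ["admin", "developer"]}

print(satisfies(claims, sign_in_criteria))
```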

auth.external_identity_provider.adminRoleCriteria array (optional)¶

An array of claims that admin user ID tokens require for use with MKE. Creating a new account using a token that satisfies the criteria determined by this array automatically produces an administrator account.

Parameter	Required	Description
`term`	yes	Sets the name of the claim.
`value`	yes	Sets the value for the claim in the form of a string.
`matchType`	yes	Sets how the JWT claim is evaluated. Valid values: `must` - the JWT claim value must be the same as the configuration value; `contains` - the JWT claim value must contain the configuration value.

auth.account_lock (optional)¶

Parameter	Required	Description
`enabled`	no	Sets whether the MKE account lockout feature is enabled.
`failureTrigger`	no	Sets the number of failed log in attempts that can occur before an account is locked.
`durationSeconds`	no	Sets the desired lockout duration in seconds. A value of `0` indicates that the account will remain locked until it is unlocked by an administrator.
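
The following fragment is a sketch of how the three options could be combined, assuming they sit in a TOML table named after the heading above. The numeric values are illustrative only and are not documented defaults.

```toml
[auth.account_lock]
  enabled = true
  failureTrigger = 5       # illustrative: lock after five failed attempts
  durationSeconds = 600    # illustrative: lock for ten minutes
                           # 0 keeps the account locked until an administrator unlocks it
```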

hardening_configuration (optional)¶

The hardening_enabled option must be set to true to enable all other hardening_configuration options.

Parameter	Required	Description
`hardening_enabled`	no	Parent option that when set to `true` enables security hardening configuration options: `limit_kernel_capabilities`, `pid_limit`, `pid_limit_unspecified`, and `use_strong_tls_ciphers`. Default: `false`
`limit_kernel_capabilities`	no	The option can only be enabled when `hardening_enabled` is set to `true`. Limits kernel capabilities to the minimum required by each container. Components run using Docker default capabilities by default. When you enable `limit_kernel_capabilities` all capabilities are dropped, except those that are specifically in use by the component. Several components run as privileged, with capabilities that cannot be disabled. Default: `false`
`pid_limit`	no	The option can only be enabled when `hardening_enabled` is set to `true`. Sets the maximum number of PIDs MKE can allow for their respective orchestrators. The `pid_limit` option must be set to the default `0` when it is not in use. Default: `0`
`pid_limit_unspecified`	no	The option can only be enabled when `hardening_enabled` is set to `true`. When set to `false`, enables PID limiting, using the `pid_limit` option value for the associated orchestrator. Default: `true`
`use_strong_tls_ciphers`	no	The option can only be enabled when `hardening_enabled` is set to `true`. When set to `true`, in line with control 4.2.12 of the CIS Kubernetes Benchmark 1.7.0, the `use_strong_tls_ciphers` parameter limits the allowed ciphers for the `cipher_suites_for_kube_api_server`, `cipher_suites_for_kubelet`, and `cipher_suites_for_etcd_server` parameters in the `cluster_config` table to the following: `TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256`, `TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256`, `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305`, `TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384`, `TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305`, `TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384`, `TLS_RSA_WITH_AES_256_GCM_SHA384`, `TLS_RSA_WITH_AES_128_GCM_SHA256`. Default: `false`
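
To illustrate the dependency on the parent option, the sketch below enables hardening and then each child option. It assumes the options are written in a TOML table named `hardening_configuration`; the `pid_limit` value is illustrative only.

```toml
[hardening_configuration]
  hardening_enabled = true           # parent switch; the options below have no effect without it
  limit_kernel_capabilities = true
  pid_limit = 100000                 # illustrative value; leave at 0 when not in use
  pid_limit_unspecified = false      # false enables PID limiting with the pid_limit value
  use_strong_tls_ciphers = true
```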

registries array (optional)¶

An array of tables that specifies the MSR instances that are managed by the current MKE instance.

Parameter	Required	Description
`host_address`	yes	Sets the address for connecting to the MSR instance tied to the MKE cluster.
`service_id`	yes	Sets the MSR instance’s OpenID Connect Client ID, as registered with the Docker authentication provider.
`ca_bundle`	no	Specifies the root CA bundle for the MSR instance if you are using a custom certificate authority (CA). The value is a string with the contents of a `ca.pem` file.
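
A sketch of one `registries` entry follows, assuming the array is expressed as a TOML array of tables. The host name and the angle-bracket placeholders are hypothetical and must be replaced with the values for your MSR instance.

```toml
[[registries]]
  host_address = "msr.example.com"        # hypothetical MSR address
  service_id = "<msr-oidc-client-id>"     # placeholder for the MSR OpenID Connect Client ID
  ca_bundle = "<contents of ca.pem>"      # only needed with a custom certificate authority
```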

audit_log_configuration table (optional)¶

Configures audit logging options for MKE components.

Parameter

Required

Description

level

Specifies the audit logging level.

Valid values: empty (to disable audit logs), metadata, request.

Default: empty

support_dump_include_audit_logs

Sets support dumps to include audit logs in the logs of the ucp-controller container of each manager node.

Valid values: true, false.

Default: false
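
For example, assuming the options live in a TOML table named `audit_log_configuration`, metadata-level audit logging could be sketched as:

```toml
[audit_log_configuration]
  level = "metadata"                        # valid values: "" (disabled), "metadata", "request"
  support_dump_include_audit_logs = false
```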

scheduling_configuration table (optional)¶

Specifies scheduling options and the default orchestrator for new nodes.

Note

If you run a kubectl command, such as kubectl describe nodes, to view scheduling rules on Kubernetes nodes, the results do not reflect the MKE admin settings configuration. MKE uses taints to control container scheduling on nodes, which is unrelated to the kubectl Unschedulable boolean flag.

Parameter

Required

Description

enable_admin_ucp_scheduling

Determines whether administrators can schedule containers on manager nodes.

Valid values: true, false.

Default: false

You can also set the parameter using the MKE web UI:

Log in to the MKE web UI as an administrator.
Click the user name drop-down in the left-side navigation panel.
Click Admin Settings > Orchestration to view the Orchestration screen.
Scroll down to the Container Scheduling section and toggle on the Allow administrators to deploy containers on MKE managers or nodes running MSR slider.

default_node_orchestrator

Sets the type of orchestrator to use for new nodes that join the cluster.

Valid values: swarm, kubernetes.

Default: swarm
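
A minimal sketch, assuming a TOML table named `scheduling_configuration`, that keeps administrator workloads off manager nodes and assigns new nodes to Kubernetes:

```toml
[scheduling_configuration]
  enable_admin_ucp_scheduling = false       # default
  default_node_orchestrator = "kubernetes"  # valid values: "swarm", "kubernetes"
```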

tracking_configuration table (optional)¶

Specifies the analytics data that MKE collects.

Parameter	Required	Description
`disable_usageinfo`	no	Set to disable analytics of usage information. Valid values: `true`, `false`. Default: `false`
`disable_tracking`	no	Set to disable analytics of API call information. Valid values: `true`, `false`. Default: `false`
`cluster_label`	no	Set a label to be included with analytics.
`ops_care`	no	Set to enable OpsCare. Valid values: `true`, `false`. Default: `false`

trust_configuration table (optional)¶

Specifies whether MSR images require signing.

Parameter

Required

Description

require_content_trust

Set to require the signing of images by content trust.

Valid values: true, false.

Default: false

You can also set the parameter using the MKE web UI:

Log in to the MKE web UI as an administrator.
Click the user name drop-down in the left-side navigation panel.
Click Admin Settings > Docker Content Trust to open the Content Trust Settings screen.
Toggle on the Run only signed images slider.

require_signature_from

A string array that specifies which users or teams must sign images.

allow_repos

A string array that specifies the repositories that bypass the content trust check, for example, ["docker.io/mirantis/dtr-rethink", "docker.io/mirantis/dtr-registry", ....].
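
The three options might be combined as in the following sketch, which assumes a TOML table named `trust_configuration`. The signer entry is a placeholder, because the expected format of user and team names is not specified here.

```toml
[trust_configuration]
  require_content_trust = true
  require_signature_from = ["<user-or-team>"]   # placeholder signer
  allow_repos = [
    "docker.io/mirantis/dtr-rethink",
    "docker.io/mirantis/dtr-registry",
  ]
```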

log_configuration table (optional)¶

Configures the logging options for MKE components.

Parameter

Required

Description

protocol

The protocol to use for remote logging.

Valid values: tcp, udp.

Default: tcp

host

Specifies a remote syslog server to receive sent MKE controller logs. If omitted, controller logs are sent through the default Docker daemon logging driver from the ucp-controller container.

level

The logging level for MKE components.

Valid values (syslog priority levels): debug, info, notice, warning, err, crit, alert, emerg.
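
As an illustration, remote logging over TCP could be sketched as below, assuming a TOML table named `log_configuration`. The host value is hypothetical, and its `host:port` form is an assumption rather than a documented format.

```toml
[log_configuration]
  protocol = "tcp"                  # valid values: "tcp", "udp"
  host = "syslog.example.com:514"   # hypothetical remote syslog server
  level = "info"                    # one of the syslog priority levels listed above
```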

license_configuration table (optional)¶

Enables automatic renewal of the MKE license.

Parameter

Required

Description

auto_refresh

Set to enable attempted automatic license renewal when the license nears expiration. If disabled, you must manually upload a renewed license after expiration.

Valid values: true, false.

Default: true

custom headers (optional)¶

Include this section when you need to set custom API headers. You can repeat the section to specify multiple separate headers. If you include custom headers, you must specify both name and value.

[[custom_api_server_headers]]

Item	Description
`name`	Set to specify the name of the custom header, for example `name = "X-Custom-Header-Name"`.
`value`	Set to specify the value of the custom header, for example `value = "Custom Header Value"`.
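
Repeating the section for two headers could look like the sketch below. The second header name and value are hypothetical and serve only to show the repetition.

```toml
[[custom_api_server_headers]]
  name = "X-Custom-Header-Name"
  value = "Custom Header Value"

# Hypothetical second header, added by repeating the section.
[[custom_api_server_headers]]
  name = "X-Second-Header"
  value = "Another Value"
```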

user_workload_defaults (optional)¶

A map describing default values to set on Swarm services at creation time if those fields are not explicitly set in the service spec.

[user_workload_defaults]

[user_workload_defaults.swarm_defaults]

Parameter

Required

Description

[tasktemplate.restartpolicy.delay]

Delay between restart attempts. The value is input in the <number><value type> format. Valid value types include:

ns = nanoseconds
us = microseconds
ms = milliseconds
s = seconds
m = minutes
h = hours

Default: value = "5s"

[tasktemplate.restartpolicy.maxattempts]

Maximum number of restarts before giving up.

Default: value = "3"

cluster_config table (required)¶

Configures the cluster that the current MKE instance manages.

The dns, dns_opt, and dns_search settings configure the DNS settings for MKE components. These values, when assigned, override the settings in a container /etc/resolv.conf file.

Parameter	Required	Description
`controller_port`	yes	Sets the port that the `ucp-controller` monitors. Default: `443`
`kube_apiserver_port`	yes	Sets the port the Kubernetes API server monitors.
`kube_protect_kernel_defaults`	no	Protects kernel parameters from being overridden by kubelet. Default: `false`. Important When enabled, kubelet can fail to start if the following kernel parameters are not properly set on the nodes before you install MKE or before adding a new node to an existing cluster: vm.panic_on_oom=0 vm.overcommit_memory=1 kernel.panic=10 kernel.panic_on_oops=1 kernel.keys.root_maxkeys=1000000 kernel.keys.root_maxbytes=25000000 For more information, refer to Configure kernel parameters.
`kube_api_server_auditing`	no	Enables auditing to the log file in the kube-apiserver container. Important Prior to using `kube_api_server_auditing` you must first enable auditing in MKE. Refer to Enable MKE audit logging for detailed information. Before you enable the `kube_api_server_auditing` option, verify that it does not conflict with MKE options that are already set. For more information, refer to the official Kubernetes documentation Troubleshooting Clusters - Audit backends. Default: `false`.
`swarm_port`	yes	Sets the port that the `ucp-swarm-manager` monitors. Default: `2376`
`swarm_strategy`	no	Sets placement strategy for container scheduling. Be aware that this does not affect swarm-mode services. Valid values: `spread`, `binpack`, `random`.
`dns`	yes	Array of IP addresses that serve as nameservers.
`dns_opt`	yes	Array of options in use by DNS resolvers.
`dns_search`	yes	Array of domain names to search whenever a bare unqualified host name is used inside of a container.
`profiling_enabled`	no	Determines whether specialized debugging endpoints are enabled for profiling MKE performance. Valid values: `true`, `false`. Default: `false`
`authz_cache_timeout`	no	Sets the timeout in seconds for the RBAC information cache of MKE non-Kubernetes resource listing APIs. Setting changes take immediate effect and do not require a restart of the MKE controller. Default: `0` (cache is not enabled) Once you enable the cache, the result of non-Kubernetes resource listing APIs only reflects the latest RBAC changes for the user when the cached RBAC info times out.
`kv_timeout`	no	Sets the key-value store timeout setting, in milliseconds. Default: `5000`
`kv_snapshot_count`	Required	Sets the key-value store snapshot count. Default: `20000`
`external_service_lb`	no	Specifies an optional external load balancer for default links to services with exposed ports in the MKE web interface.
`cni_installer_url`	no	Specifies the URL of a Kubernetes YAML file to use to install a CNI plugin. Only applicable during initial installation. If left empty, the default CNI plugin is put to use.
`metrics_retention_time`	no	Sets the metrics retention time.
`metrics_scrape_interval`	no	Sets the interval for how frequently managers gather metrics from nodes in the cluster.
`metrics_disk_usage_interval`	no	Sets the interval for the gathering of storage metrics, an operation that can become expensive when large volumes are present.
`nvidia_device_plugin`	no	Enables the `nvidia-gpu-device-plugin`, which is disabled by default.
`rethinkdb_cache_size`	no	Sets the size of the cache for MKE RethinkDB servers. Default: 1GB Leaving the field empty or specifying `auto` instructs RethinkDB to automatically determine the cache size.
`exclude_server_identity_headers`	no	Determines whether the `X-Server-Ip` and `X-Server-Name` headers are disabled. Valid values: `true`, `false`. Default: `false`
`cloud_provider`	no	Sets the cloud provider for the Kubernetes cluster.
`pod_cidr`	yes	Sets the subnet pool from which the IP for the Pod should be allocated from the CNI IPAM plugin. Default: `192.168.0.0/16`
`ipip_mtu`	no	Sets the IPIP MTU size for the Calico IPIP tunnel interface.
`azure_ip_count`	yes	Sets the IP count for Azure allocator to allocate IPs per Azure virtual machine.
`service_cluster_ip_range`	yes	Sets the subnet pool from which the IP for Services should be allocated. Default: `10.96.0.0/16`
`nodeport_range`	yes	Sets the port range for Kubernetes services within which the type `NodePort` can be exposed. Default: `32768-35535`
`custom_kube_api_server_flags`	no	Sets the configuration options for the Kubernetes API server. Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.
`custom_kube_controller_manager_flags`	no	Sets the configuration options for the Kubernetes controller manager. Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.
`custom_kubelet_flags`	no	Sets the configuration options for `kubelet`. Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.
`custom_kube_scheduler_flags`	no	Sets the configuration options for the Kubernetes scheduler. Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.
`local_volume_collection_mapping`	no	Set to store data about collections for volumes in the MKE local KV store instead of on the volume labels. The parameter is used to enforce access control on volumes.
`manager_kube_reserved_resources`	no	Reserves resources for MKE and Kubernetes components that are running on manager nodes.
`worker_kube_reserved_resources`	no	Reserves resources for MKE and Kubernetes components that are running on worker nodes.
`kubelet_max_pods`	yes	Sets the number of Pods that can run on a node. Maximum: `250` Default: `110`
`kubelet_pods_per_core`	no	Sets the maximum number of Pods per core. `0` indicates that there is no limit on the number of Pods per core. The number cannot exceed the `kubelet_max_pods` setting. Recommended: `10` Default: `0`
`secure_overlay`	no	Enables IPSec network encryption in Kubernetes. Valid values: `true`, `false`. Default: `false`
`image_scan_aggregation_enabled`	no	Enables image scan result aggregation. The feature displays image vulnerabilities in shared resource/containers and shared resources/images pages. Valid values: `true`, `false`. Default: `false`
`swarm_polling_disabled`	no	Determines whether resource polling is disabled for both Swarm and Kubernetes resources, which is recommended for production instances. Valid values: `true`, `false`. Default: `false`
`oidc_client_id`	no	Sets the OIDC client ID, using the eNZi service ID that is in the OIDC authorization flow.
`hide_swarm_ui`	no	Determines whether the UI is hidden for all Swarm-only object types (has no effect on Admin Settings). Valid values: `true`, `false`. Default: `false` You can also set the parameter using the MKE web UI: Log in to the MKE web UI as an administrator. In the left-side navigation panel, click the user name drop-down. Click Admin Settings > Tuning to open the Tuning screen. Toggle on the Hide Swarm Navigation slider located under the Configure MKE UI heading.
`unmanaged_cni`	yes	Sets Calico as the CNI provider, managed by MKE. Note that Calico is the default CNI provider.
`calico_ebpf_enabled`	yes	Enables Calico eBPF mode.
`kube_default_drop_masq_bits`	yes	Sets the use of Kubernetes default values for iptables drop and masquerade bits.
`kube_proxy_mode`	yes	Sets the operational mode for `kube-proxy`. Valid values: `iptables`, `ipvs`, `disabled`. Default: `iptables`
`cipher_suites_for_kube_api_server`	no	Sets the value for the `kube-apiserver` `--tls-cipher-suites` parameter.
`cipher_suites_for_kubelet`	no	Sets the value for the `kubelet` `--tls-cipher-suites` parameter.
`cipher_suites_for_etcd_server`	no	Sets the value for the `etcd` server `--cipher-suites` parameter.
`image_prune_schedule`	no	Sets the cron expression used for the scheduling of image pruning. The parameter accepts either full crontab specifications or descriptors, but not both. Full crontab specifications include `<seconds> <minutes> <hours> <day of month> <month> <day of week>`. For example: `"0 0 0 * * *"`. Descriptors, which are textual in nature, have a preceding `@` symbol. For example: `"@midnight"` or `"@every 1h30m"`. For more information, refer to the cron documentation.
`cpu_usage_banner_threshold`	no	Sets the CPU usage threshold, above which the MKE web UI displays a warning banner. Default: `20`.
`cpu_usage_banner_scrape_interval`	no	Sets the MKE CPU usage measurement interval, which enables the function of the `cpu_usage_banner_threshold` option. Default: `"10m"`.
`etcd_storage_quota`	no	Sets the etcd storage size limit. Example values: `4GB`, `8GB`. Default value: `2GB`.
`nvidia_device_partitioner`	no	Enables the NVIDIA device partitioner. Default: `true`.
`kube_api_server_profiling_enabled`	no	Enables profiling for the Kubernetes API server. Default: `true`.
`kube_controller_manager_profiling_enabled`	no	Enables profiling for the Kubernetes controller manager. Default: `true`.
`kube_scheduler_profiling_enabled`	no	Enables profiling for the Kubernetes scheduler. Default: `true`.
`kube_scheduler_bind_to_all`	no	Enables kube scheduler to bind to all available network interfaces, rather than just localhost. Default: `false`.
`use_flex_volume_driver`	no	Extends support of FlexVolume drivers, which have been deprecated since the release of MKE 3.4.13. Default: `false`.
`pubkey_auth_cache_enabled`	no	Warning Implement `pubkey_auth_cache_enabled` only in cases in which there are certain performance issues in high-load clusters, and only under the guidance of Mirantis Support personnel. Enables public key authentication cache. Note `ucp-controller` must be restarted for setting changes to take effect. Default: `false`.
`shared_sans`	no	Subject alternative names for manager nodes.
`kube_manager_terminated_pod_gc_threshold`	no	Allows users to set the threshold for the terminated Pod garbage collector in Kube Controller Manager according to their cluster-specific requirement. Default: `12500`
`kube_api_server_request_timeout`	no	Timeout for Kube API server requests. Default: `1m`
`cadvisor_enabled`	no	Enables the `ucp-cadvisor` component, which runs a standalone cadvisor instance on each node to provide additional container level metrics with all expected labels. Default: `false`
`kube_api_server_audit_log_maxage`	no	Sets the maximum number of days for which to retain old audit log files in Kubernetes API server. Default: `30`.
`kube_api_server_audit_log_maxbackup`	no	Sets the maximum number of audit log files to retain in the Kubernetes API server. Default: `10`.
`kube_api_server_audit_log_maxsize`	no	Sets the maximum size the audit log file can attain, in megabytes, before it is rotated in Kubernetes API server. Default: `10`.
`KubeAPIServerCustomAuditPolicyYaml`	no	Specifies a Kubernetes audit logging policy. Refer to https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/ for more information.
`KubeAPIServerEnableCustomAuditPolicy`	no	Enables the use of a specified custom audit policy yaml file. Default: `false`.
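
The fragment below sketches a handful of the parameters above in a `cluster_config` TOML table, using the documented defaults where the table states one. It is not a complete table: the DNS entries are hypothetical, and parameters without a listed default, such as `kube_apiserver_port`, are omitted.

```toml
[cluster_config]
  controller_port = 443
  swarm_port = 2376
  dns = ["10.0.0.2"]                   # hypothetical nameserver
  dns_opt = []
  dns_search = ["example.internal"]    # hypothetical search domain
  pod_cidr = "192.168.0.0/16"
  service_cluster_ip_range = "10.96.0.0/16"
  nodeport_range = "32768-35535"
  kube_proxy_mode = "iptables"         # valid values: "iptables", "ipvs", "disabled"
  kubelet_max_pods = 110
  image_prune_schedule = "@midnight"   # descriptor form; alternatively a full crontab spec
```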

cluster_config.image_prune_whitelist (optional)¶

Configures the images that you do not want removed by MKE image pruning.

Note

Where possible, use the image ID to specify the image rather than the image name.

Parameter	Required	Description
`key`	yes	Sets the filter key. Valid values: `dangling`, `label`, `before`, `since`, and `reference`. For more information, refer to the Docker documentation on Filtering.
`value`	yes	Sets the filter value. For more information, refer to the Docker documentation on Filtering.

cluster_config.ingress_controller (optional)¶

Set the configuration for the NGINX Ingress Controller to manage traffic that originates outside of your cluster (ingress traffic).

Note

Prior versions of MKE use Istio Ingress to manage traffic that originates from outside of the cluster, which employs many of the same parameters as NGINX Ingress Controller.

Parameter	Required	Description
`enabled`	No	Enables HTTP ingress for Kubernetes. Valid values: `true`, `false`. Default: `false`
`ingress_num_replicas`	No	Sets the number of NGINX Ingress Controller deployment replicas. Default: `2`
`ingress_external_ips`	No	Sets the list of external IPs for Ingress service. Default: `[]` (empty)
`ingress_enable_lb`	No	Enables an external load balancer. Valid values: `true`, `false`. Default: `false`
`ingress_preserve_client_ip`	No	Enables preserving inbound traffic source IP. Valid values: `true`, `false`. Default: `false`
`ingress_exposed_ports`	No	Sets ports to expose. For each port, provide arrays that contain the following port information (defaults as displayed): name = `http2` port = `80` target_port = `0` node_port = `33000` name = `https` port = `443` target_port = `0` node_port = `33001` name = `tcp` port = `31400` target_port = `0` node_port = `33002`
`ingress_node_affinity`	No	Sets node affinity. key = `com.docker.ucp.manager` value = `""` target_port = `0` node_port = `0`
`ingress_node_toleration`	No	Sets node toleration. For each node, provide an array that contains the following information (defaults as displayed): key = `com.docker.ucp.manager` value = `""` operator = `Exists` effect = `NoSchedule`
`config_map`	No	Sets advanced options for the NGINX proxy. NGINX Ingress Controller uses `ConfigMap` to configure the NGINX proxy. For the complete list of available options, refer to the NGINX Ingress Controller documentation ConfigMap: configuration options. Examples: `map-hash-bucket-size = "128"` `ssl-protocols = "SSLv2"`
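
A sketch of an ingress controller configuration follows, using the defaults listed in the table. It assumes that the section is a TOML table under `cluster_config` and that each exposed port is one entry in an array of tables; both layout choices are assumptions, and only one port entry is shown.

```toml
[cluster_config.ingress_controller]
  enabled = true                      # assumed here to switch HTTP ingress on
  ingress_num_replicas = 2
  ingress_external_ips = []
  ingress_enable_lb = false
  ingress_preserve_client_ip = false

  # One entry per exposed port (assumed layout); values are the listed defaults.
  [[cluster_config.ingress_controller.ingress_exposed_ports]]
    name = "http2"
    port = 80
    target_port = 0
    node_port = 33000
```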

cluster_config.policy_enforcement (optional)¶

Enable and disable Pod Security Policies (PSPs).

Parameter

Required

Description

pod_security_policy

Enables the use of Pod Security Policies (PSPs).

Valid values: true, false.

Default: true.

cluster_config.policy_enforcement.gatekeeper (optional)¶

Enable and disable OPA Gatekeeper for policy enforcement.

Note

By design, when the OPA Gatekeeper is disabled using the configuration file, the Pods are deleted but the policies are not cleaned up. Thus, when the OPA Gatekeeper is re-enabled, the cluster can immediately adopt the existing policies.

The retention of the policies poses no risk, as they are just data on the API server and have no value outside of an OPA Gatekeeper deployment.

Parameter

Required

Description

enabled

Enables the Gatekeeper function.

Valid values: true, false.

Default: false.

excluded_namespaces

Excludes from the Gatekeeper admission webhook all of the resources that are contained in a list of namespaces. Specify as a comma-separated list.

For example: "kube-system", "gatekeeper-system"
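
For instance, enabling Gatekeeper while excluding the two namespaces named above might be sketched as follows. Writing the comma-separated list as a TOML string array is an assumption.

```toml
[cluster_config.policy_enforcement.gatekeeper]
  enabled = true
  excluded_namespaces = ["kube-system", "gatekeeper-system"]
```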

iSCSI (optional)¶

Configures iSCSI options for MKE.

Parameter

Required

Description

--storage-iscsi=true

Enables iSCSI-based Persistent Volumes in Kubernetes.

Valid values: true, false.

Default: false

--iscsiadm-path=<path>

Specifies the path of the iscsiadm binary on the host.

Default: /usr/sbin/iscsiadm

--iscsidb-path=<path>

Specifies the path of the iscsi database on the host.

Default: /etc/iscsi

pre_logon_message¶

Configures a pre-logon message.

Parameter	Required	Description
`pre_logon_message`	no	Sets a pre-logon message to alert users prior to login.

backup_schedule_config (optional)¶

Configures backup scheduling and notifications for MKE.

Parameter	Required	Description
`notification-delay`	yes	Sets the number of days that elapse before a user is notified that they have not performed a recent backup. Set to `-1` to disable notifications. Default: `7`
`enabled`	yes	Enables backup scheduling. Valid values: `true`, `false`. Default: `false`
`path`	yes	Sets the storage path for scheduled backups. Use `chmod o+w /<path>` to ensure that other users have write privileges.
`no_passphrase`	yes	Sets whether a passphrase is necessary to encrypt the TAR file. A value of `true` negates the use of a passphrase. A non-empty value in the `passphrase` parameter requires that `no_passphrase` be set to `false`. Default: `false`
`passphrase`	yes	Encrypts the TAR file with a passphrase for all scheduled backups. Must remain empty if `no_passphrase` is set to `true`. Do not share the configuration file if a passphrase is used, as the passphrase displays in plain text.
`cron_spec`	yes	Sets the cron expression in use for scheduling backups. The parameter accepts either full crontab specifications or descriptors, but not both. Full crontab specifications include `<seconds> <minutes> <hours> <day of month> <month> <day of week>`. For example: `"0 0 0 * * *"`. Descriptors, which are textual in nature, have a preceding `@` symbol. For example: `"@midnight"` or `"@every 1h30m"`. For more information, refer to the cron documentation.
`include_logs`	yes	Determines whether a log file is generated in addition to the backup. Refer to backup for more information.
`backup_limits`	yes	Sets the number of backups to store. Once this number is reached, older backups are deleted. Set to `-1` to disable backup rotation.
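
The interplay between `no_passphrase` and `passphrase` can be seen in the sketch below, which schedules an unencrypted nightly backup. It assumes a TOML table named `backup_schedule_config`; the path and the backup limit are illustrative values.

```toml
[backup_schedule_config]
  notification-delay = 7          # days before a missing-backup notification; -1 disables
  enabled = true
  path = "/var/backups/mke"       # hypothetical directory; must be writable as described above
  no_passphrase = true            # no encryption passphrase, so passphrase stays empty
  passphrase = ""
  cron_spec = "@midnight"         # descriptor form; a full crontab spec is the alternative
  include_logs = true
  backup_limits = 5               # illustrative: keep five backups; -1 disables rotation
```

To encrypt scheduled backups instead, set `no_passphrase = false` and supply a non-empty `passphrase`, keeping in mind that the passphrase is stored in plain text.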

windows_gmsa¶

Configures use of Windows GMSA credential specifications.

Parameter

Required

Description

windows_gmsa

Allows creation of GMSA credential specifications for the Kubernetes cluster, as well as automatic population of full credential specifications for any Pod on which the GMSA credential specification is referenced in the security context of that Pod.

The schema for the GMSA credential specification that MKE uses is publicly documented at https://github.com/kubernetes-sigs/windows-gmsa/blob/master/charts/gmsa/templates/credentialspec.yaml.

For information on how to enable GMSA and how to obtain different components of the GMSA specification for one or more GMSA accounts in your domain, refer to the official Windows documentation.

Scale an MKE cluster¶

By adding or removing nodes from the MKE cluster, you can horizontally scale MKE to fit your needs as your applications grow in size and use.

Scale using the MKE web UI¶

For detail on how to use the MKE web UI to scale your cluster, refer to Join Linux nodes or Join Windows worker nodes, depending on which operating system you use. In particular, these topics offer information on adding nodes to a cluster and configuring node availability.

Scale using the CLI¶

You can also use the command line to perform all scaling operations.

Scale operation	Command
Obtain the join token	Run the following command on a manager node to obtain the `join` token that is required for cluster scaling. Use either worker or manager for the `<node-type>`: docker swarm join-token <node-type>
Configure a custom listen address	Specify the address and port where the new node listens for inbound cluster management traffic: docker swarm join \ --token SWMTKN-1-2o5ra9t7022neymg4u15f3jjfh0qh3yof817nunoioxa9i7lsp-dkmt01ebwp2m0wce1u31h6lmj \ --listen-addr 234.234.234.234 \ 192.168.99.100:2377
Verify node addition	Once your node is added, run the following command on a manager node to verify its presence: docker node ls
Set node availability state	Use the --availability option to set node availability, indicating `active`, `pause`, or `drain`: docker node update --availability <availability-state> <node-hostname>
Remove the node	docker node rm <node-hostname>

Configure KMS plugin for MKE¶

Mirantis Kubernetes Engine (MKE) offers support for a Key Management Service (KMS) plugin that allows access to third-party secrets management solutions, such as Vault. MKE uses this plugin to facilitate access from Kubernetes clusters.

MKE will not health check, clean up, or otherwise manage the KMS plugin. Thus, you must deploy KMS before a machine becomes an MKE manager, or else it may be considered unhealthy.

Configuration¶

Use MKE to configure the KMS plugin. MKE maintains ownership of the Kubernetes EncryptionConfig file, where the KMS plugin is configured for Kubernetes. MKE does not check the file contents following deployment.

MKE adds new configuration options to the cluster configuration table. Configuration of these options takes place through the API and not the MKE web UI.

The following table presents the configuration options for the KMS plugin, all of which are optional.

Parameter	Type	Description
`kms_enabled`	bool	Sets MKE to configure a KMS plugin.
`kms_name`	string	Name of the KMS plugin resource (for example, `vault`).
`kms_endpoint`	string	Path of the KMS plugin socket. The path must refer to a UNIX socket on the host (for example, `/tmp/socketfile.sock`). MKE bind mounts this file to make it accessible to the API server.
`kms_cachesize`	int	Number of data encryption keys (DEKs) to cache in the clear.
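
Because these options belong to the cluster configuration table, a Vault-backed setup might be sketched as follows. The plugin name and socket path reuse the examples from the table; the cache size is an illustrative value only.

```toml
[cluster_config]
  kms_enabled = true
  kms_name = "vault"
  kms_endpoint = "/tmp/socketfile.sock"   # UNIX socket on the host
  kms_cachesize = 1000                    # illustrative number of cached DEKs
```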

Use a local node network in a swarm¶

Mirantis Kubernetes Engine (MKE) can use local network drivers to orchestrate your cluster. You can create a config network with a driver such as MAC VLAN, and use this network in the same way as any other named network in MKE. In addition, if it is set up as attachable you can attach containers.

Warning

Encrypting communication between containers on different nodes only works with overlay networks.

Create node-specific networks with MKE¶

To create a node-specific network for use with MKE, always do so through MKE, using either the MKE web UI or the CLI with an admin bundle. If you create such a network without MKE, it will not have the correct access label and it will not be available in MKE.

Create a MAC VLAN network¶

Log in to the MKE web UI as an administrator.
In the left-side navigation menu, click Swarm > Networks.
Click Create to call the Create Network screen.
Select macvlan from the Drivers dropdown.
Enter macvlan into the Name field.
Select the type of network to create, Network or Local Config.
- If you select Local Config, the SCOPE is automatically set to Local. You subsequently select the nodes for which to create the Local Config from those listed. MKE will prefix the network with the node name for each selected node to ensure consistent application of access labels, and you then select a Collection for the Local Configs to reside in. All Local Configs with the same name must be in the same collection, or MKE returns an error. If you do not select a Collection, the network is placed in your default collection, which is / in a new MKE installation.
- If you select Network, the SCOPE is automatically set to Swarm. Choose an existing Local Config from which to create the network. The network and its labels and collection placement are inherited from the related Local Configs.
Optional. Configure IPAM.
Click Create.

Use your own TLS certificates¶

To ensure all communications between clients and MKE are encrypted, all MKE services are exposed using HTTPS. By default, this is done using self-signed TLS certificates that are not trusted by client tools such as web browsers. Thus, when you try to access MKE, your browser warns that it does not trust MKE or that MKE has an invalid certificate.

You can configure MKE to use your own TLS certificates. As a result, your browser and other client tools will trust your MKE installation.

Mirantis recommends that you make this change outside of peak business hours. Your applications will continue to run normally, but existing MKE client certificates will become invalid, and thus users will have to download new certificates to access MKE from the CLI.

To configure MKE to use your own TLS certificates and keys:

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to <user name> > Admin Settings > Certificates.

Upload your certificates and keys based on the following table.

Note

All keys and certificates must be uploaded in PEM format.

Type	Description
Private key	The unencrypted private key for MKE. This key must correspond to the public key used in the server certificate. This key does not use a password. Click Upload Key to upload a PEM file.
Server certificate	The MKE public key certificate, which establishes a chain of trust up to the root CA certificate. It is followed by the certificates of any intermediate certificate authorities. Click Upload Certificate to upload a PEM file.
CA certificate	The public key certificate of the root certificate authority that issued the MKE server certificate. If you do not have a CA certificate, use the top-most intermediate certificate instead. Click Upload CA Certificate to upload a PEM file.
Client CA	This field may contain one or more Root CA certificates that the MKE controller uses to verify that client certificates are issued by a trusted entity. Click Upload CA Certificate to upload a PEM file. Click Download MKE Server CA Certificate to download the certificate as a PEM file. Note MKE is automatically configured to trust its internal CAs, which issue client certificates as part of generated client bundles. However, you may supply MKE with additional custom root CA certificates using this field to enable MKE to trust the client certificates issued by your corporate or trusted third-party certificate authorities. Note that your custom root certificates will be appended to MKE internal root CA certificates.

Click Save.
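
Before uploading, it can help to sanity-check the PEM files locally. The following is a minimal sketch and not part of MKE: the function names are hypothetical, and the check only inspects PEM block labels and headers. It reports whether a key file appears to be password-protected and how many certificate blocks a file contains. It does not validate the chain of trust.

```python
import re

# Matches the label of every "-----BEGIN <label>-----" line in a PEM file.
_PEM_BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)


def pem_block_types(pem_text):
    """Return the labels of all PEM blocks, in file order."""
    return _PEM_BEGIN.findall(pem_text)


def key_is_unencrypted(key_pem):
    """Heuristic check that a private key file carries no password."""
    labels = pem_block_types(key_pem)
    if not labels or not all(label.endswith("PRIVATE KEY") for label in labels):
        return False
    # PKCS#8 encrypted keys use the "ENCRYPTED PRIVATE KEY" label.
    if any(label.startswith("ENCRYPTED") for label in labels):
        return False
    # Legacy OpenSSL keys mark encryption with a header inside the block.
    return "Proc-Type: 4,ENCRYPTED" not in key_pem


def count_certificates(cert_pem):
    """Count the CERTIFICATE blocks, for example server plus intermediates."""
    return pem_block_types(cert_pem).count("CERTIFICATE")
```

A server certificate file that includes one intermediate certificate would be expected to yield a count of 2, and the private key file should pass `key_is_unencrypted`.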

After replacing the TLS certificates, your users will not be able to authenticate with their old client certificate bundles. Ask your users to access the MKE web UI and download new client certificate bundles.

Finally, Mirantis Secure Registry (MSR) deployments must be reconfigured to trust the new MKE TLS certificates. To do this, MSR 3.1.x users can refer to Add a custom TLS certificate, MSR 3.0.x users to Add a custom TLS certificate, and MSR 2.9.x users to Add a custom TLS certificate.

Manage and deploy private images¶

Mirantis offers its own image registry, Mirantis Secure Registry (MSR), which you can use to store and manage the images that you deploy to your cluster. This topic describes how to use MKE to push the official WordPress image to MSR and later deploy that image to your cluster.

To create an MSR image repository:

Log in to the MKE web UI.
From the left-side navigation panel, navigate to <user name> > Admin Settings > Mirantis Secure Registry.
In the Installed MSRs section, capture the MSR URL for your cluster.
In a new browser tab, navigate to the MSR URL captured in the previous step.
From the left-side navigation panel, click Repositories.
Click New repository.
In the namespace field under New Repository, select the required namespace. The default namespace is your user name.
In the name field under New Repository, enter the name wordpress.
To create the repository, click Save.

To push an image to MSR:

In this example, you will pull the official WordPress image from Docker Hub, tag it, and push it to MSR. Once pushed to MSR, only authorized users will be able to make changes to the image. Pushing to MSR requires CLI access to a licensed MSR installation.

Pull the public WordPress image from Docker Hub:
```
docker pull wordpress
```
Tag the image, using the IP address or DNS name of your MSR instance. For example:
```
docker tag wordpress:latest <msr-url>:<port>/<namespace>/wordpress:latest
```
Log in to an MKE manager node.

Push the tagged image to MSR:

docker image push <msr-url>:<port>/admin/wordpress:latest

Verify that the image is stored in your MSR repository:
1. Log in to the MSR web UI.
2. In the left-side navigation panel, click Repositories.
3. Click admin/wordpress to open the repo.
4. Click the Tags tab to view the stored images.
5. Verify that the latest tag is present.
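
The tag and push commands above must use exactly the same image reference, and its namespace must match the namespace of the repository that you created in MSR. The following minimal sketch shows how that reference is assembled. The helper names and the host `msr.example.com` are hypothetical and used for illustration only.

```python
def msr_image_ref(msr_host, namespace, repo, tag="latest", port=None):
    """Build the image name used by both `docker tag` and `docker image push`."""
    registry = f"{msr_host}:{port}" if port is not None else msr_host
    return f"{registry}/{namespace}/{repo}:{tag}"


def tag_and_push_commands(source_image, target_ref):
    """Return the two docker commands from the procedure as argument lists."""
    return [
        ["docker", "tag", source_image, target_ref],
        ["docker", "image", "push", target_ref],
    ]


# Hypothetical MSR address; substitute the URL captured from the MKE web UI.
ref = msr_image_ref("msr.example.com", "admin", "wordpress", port=444)
commands = tag_and_push_commands("wordpress:latest", ref)
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` on a host that has already authenticated against MSR.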

To deploy the private image to MKE:

Log in to the MKE web UI.
In the left-side navigation panel, click Kubernetes.
Click Create to open the Create Kubernetes Object page.
In the Namespace dropdown, select default.

In the Object YAML editor, paste the following Deployment object YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-deployment
spec:
  selector:
    matchLabels:
      app: wordpress
  replicas: 2
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: <msr-url>:<port>/admin/wordpress:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: wordpress-service
  labels:
    app: wordpress
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32768
  selector:
    app: wordpress

The Deployment object YAML specifies your MSR image in the Pod template spec: image: <msr-url>:<port>/admin/wordpress:latest. Also, the YAML file defines a NodePort service that exposes the WordPress application so that it is accessible from outside the cluster.

Click Create. Creating the new Kubernetes objects will open the Controllers page.
After a few seconds, verify that wordpress-deployment has a green status icon and is thus successfully deployed.
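
The manifest above depends on label matching: the Deployment selector and the Service selector must both match the labels in the Pod template, otherwise the Deployment is rejected or the Service has no endpoints. The following minimal sketch illustrates the equality-based matching rule using the values from the manifest. The function name is hypothetical.

```python
def selector_matches(selector, labels):
    """Return True if every key/value pair in selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Values copied from the WordPress manifest.
deployment_selector = {"app": "wordpress"}
pod_template_labels = {"app": "wordpress"}
service_selector = {"app": "wordpress"}

deployment_owns_pods = selector_matches(deployment_selector, pod_template_labels)
service_reaches_pods = selector_matches(service_selector, pod_template_labels)
```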

Set the node orchestrator¶

When you add a node to your cluster, by default its workloads are managed by Swarm. Changing the default orchestrator does not affect existing nodes in the cluster. You can also change the orchestrator type for individual nodes in the cluster.

Select the node orchestrator¶

The workloads on your cluster can be scheduled by Kubernetes, Swarm, or a combination of the two. If you choose to run a mixed cluster, be aware that different orchestrators are not aware of each other, and thus there is no coordination between them.

Mirantis recommends that you decide which orchestrator you will use when initially setting up your cluster. Once you start deploying workloads, avoid changing the orchestrator setting. If you do change the node orchestrator, your workloads will be evicted and you will need to deploy them again using the new orchestrator.

Caution

When you promote a worker node to be a manager, its orchestrator type automatically changes to Mixed. If you later demote that node to be a worker, its orchestrator type remains as Mixed.

Note

The default behavior for Mirantis Secure Registry (MSR) nodes is to run in the Mixed orchestration mode. If you change the MSR orchestrator type to Swarm or Kubernetes only, reconciliation will revert the node back to the Mixed mode.

Changing a node orchestrator¶

When you change the node orchestrator, existing workloads are evicted and they are not automatically migrated to the new orchestrator. You must manually migrate them to the new orchestrator. For example, if you deploy WordPress on Swarm, and you change the node orchestrator to Kubernetes, MKE does not migrate the workload, and WordPress continues running on Swarm. You must manually migrate your WordPress deployment to Kubernetes.

The following table summarizes the results of changing a node orchestrator.

Workload	Orchestrator-related change
Containers	Containers continue running on the node.
Docker service	The node is drained and tasks are rescheduled to another node.
Pods and other imperative resources	Imperative resources continue running on the node.
Deployments and other declarative resources	New declarative resources will not be scheduled on the node and existing ones will be rescheduled at a time that can vary based on resource details.

If a node is running containers and you change the node to Kubernetes, the containers will continue running and Kubernetes will not be aware of them. This is functionally the same as running the node in the Mixed mode.

Warning

The Mixed mode is not intended for production use and it may impact the existing workloads on the node.

This is because the two orchestrator types have different views of the node resources and they are not aware of the other orchestrator resources. One orchestrator can schedule a workload without knowing that the node resources are already committed to another workload that was scheduled by the other orchestrator. When this happens, the node can run out of memory or other resources.

Mirantis strongly recommends against using the Mixed mode in production environments.
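
The overcommit risk described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a representation of how Swarm or Kubernetes actually schedule: each scheduler tracks only the memory that it has placed itself, and the node size and request sizes are invented for the example.

```python
class NaiveScheduler:
    """Tracks only the memory that this scheduler itself has placed on a node."""

    def __init__(self, name, node_memory_mib):
        self.name = name
        self.node_memory_mib = node_memory_mib
        self.committed_mib = 0

    def try_schedule(self, request_mib):
        # The scheduler consults its own bookkeeping, not the node's real usage.
        if self.committed_mib + request_mib > self.node_memory_mib:
            return False
        self.committed_mib += request_mib
        return True


node_memory_mib = 8192
swarm = NaiveScheduler("swarm", node_memory_mib)
kubernetes = NaiveScheduler("kubernetes", node_memory_mib)

# Each request fits from the point of view of the scheduler that places it.
swarm_ok = swarm.try_schedule(6144)
kubernetes_ok = kubernetes.try_schedule(6144)

total_committed_mib = swarm.committed_mib + kubernetes.committed_mib
overcommitted = total_committed_mib > node_memory_mib
```

Both placements succeed, yet together they commit 12288 MiB on a node that has 8192 MiB.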

Change the node orchestrator¶

This topic describes how to set the default orchestrator and change the orchestrator for individual nodes.

Set the default orchestrator¶

To set the default orchestrator using the MKE web UI:

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to <user name> > Admin Settings > Orchestration.
Under Scheduler, select the required default orchestrator.
Click Save.

New workloads will now be scheduled by the specified orchestrator type. Existing nodes in the cluster are not affected.

Once a node is joined to the cluster, you can change the orchestrator that schedules its workloads.

To set the default orchestrator using the MKE configuration file:

Obtain the current MKE configuration file for your cluster.
Set default_node_orchestrator to "swarm" or "kubernetes".
Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

Change the node orchestrator¶

To change the node orchestrator using the MKE web UI:

Log in to the MKE web UI as an administrator.
From the left-side navigation panel, navigate to Shared Resources > Nodes.
Click the node that you want to assign to a different orchestrator.
In the upper right, click the Edit Node icon.
In the Details pane, in the Role section under ORCHESTRATOR TYPE, select either Swarm, Kubernetes, or Mixed.

Warning

Mirantis strongly recommends against using the Mixed mode in production environments.
Click Save to assign the node to the selected orchestrator.

To change the node orchestrator using the CLI:

Set the orchestrator on a node by setting one of the orchestrator labels, com.docker.ucp.orchestrator.swarm or com.docker.ucp.orchestrator.kubernetes, to true.

Change the node orchestrator. Select from the following options:

Schedule Swarm workloads on a node:

docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>

Schedule Kubernetes workloads on a node:

docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>

Schedule both Kubernetes and Swarm workloads on a node:

docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>

Warning

Mirantis strongly recommends against using the Mixed mode in production environments.

Change the orchestrator type for a node from Swarm to Kubernetes:

docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
docker node update --label-rm com.docker.ucp.orchestrator.swarm <node-id>

Change the orchestrator type for a node from Kubernetes to Swarm:

docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
docker node update --label-rm com.docker.ucp.orchestrator.kubernetes <node-id>

Note

You must first add the target orchestrator label and then remove the old orchestrator label. Doing this in the reverse order can fail to change the orchestrator.
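
If you script the change, the ordering rule can be encoded once so that it is never reversed by accident. The following is a minimal sketch that only builds the two commands shown above in the required order. The function name is hypothetical.

```python
LABEL_PREFIX = "com.docker.ucp.orchestrator."
ORCHESTRATORS = ("swarm", "kubernetes")


def switch_orchestrator_commands(node_id, target):
    """Return the docker commands that move a node to the target orchestrator.

    The target label is added first and the old label is removed second.
    """
    if target not in ORCHESTRATORS:
        raise ValueError(f"target must be one of {ORCHESTRATORS}")
    old = "kubernetes" if target == "swarm" else "swarm"
    return [
        ["docker", "node", "update", "--label-add",
         f"{LABEL_PREFIX}{target}=true", node_id],
        ["docker", "node", "update", "--label-rm",
         f"{LABEL_PREFIX}{old}", node_id],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from a shell session that is configured to reach the cluster.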

Verify the value of the orchestrator label by inspecting the node:

docker node inspect <node-id> | grep -i orchestrator

Example output:

"com.docker.ucp.orchestrator.kubernetes": "true"

Important

The com.docker.ucp.orchestrator label is not displayed in the Labels list of the MKE web UI, which appears in the Overview pane for each node.
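
Because the label is not visible in the web UI, you may prefer to read it programmatically. The following minimal sketch classifies a node from the JSON that docker node inspect prints, assuming the labels appear under Spec.Labels in the first element of the output array. The function name and the returned strings are hypothetical.

```python
import json

LABEL_PREFIX = "com.docker.ucp.orchestrator."


def orchestrator_type(inspect_json):
    """Classify a node as swarm, kubernetes, mixed, or unknown."""
    nodes = json.loads(inspect_json)
    # Labels can be missing or null on a node that has none set.
    labels = (nodes[0].get("Spec") or {}).get("Labels") or {}
    swarm = labels.get(LABEL_PREFIX + "swarm") == "true"
    kubernetes = labels.get(LABEL_PREFIX + "kubernetes") == "true"
    if swarm and kubernetes:
        return "mixed"
    if kubernetes:
        return "kubernetes"
    if swarm:
        return "swarm"
    return "unknown"
```

The input can be captured with `subprocess.run(["docker", "node", "inspect", node_id], capture_output=True, text=True).stdout`.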

View Kubernetes objects in a namespace¶

MKE administrators can filter the view of Kubernetes objects by the namespace that the objects are assigned to, specifying a single namespace or all available namespaces. This topic describes how to deploy services to two newly created namespaces and then view those services, filtered by namespace.

To create two namespaces:

Log in to the MKE web UI as an administrator.
From the left-side navigation panel, click Kubernetes.
Click Create to open the Create Kubernetes Object page.
Leave the Namespace drop-down blank.

In the Object YAML editor, paste the following YAML code:

apiVersion: v1
kind: Namespace
metadata:
  name: blue
---
apiVersion: v1
kind: Namespace
metadata:
  name: green

Click Create to create the blue and green namespaces.

To deploy services:

Create a NodePort service in the blue namespace:
1. From the left-side navigation panel, navigate to Kubernetes > Create.
2. In the Namespace drop-down, select blue.
3. In the Object YAML editor, paste the following YAML code:
```
apiVersion: v1
kind: Service
metadata:
  name: app-service-blue
  labels:
    app: app-blue
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32768
  selector:
    app: app-blue
```
4. Click Create to deploy the service in the blue namespace.
Create a NodePort service in the green namespace:
1. From the left-side navigation panel, navigate to Kubernetes > Create.
2. In the Namespace drop-down, select green.
3. In the Object YAML editor, paste the following YAML code:
```
apiVersion: v1
kind: Service
metadata:
  name: app-service-green
  labels:
    app: app-green
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32769
  selector:
    app: app-green
```
4. Click Create to deploy the service in the green namespace.

To view the newly created services:

In the left-side navigation panel, click Namespaces.
In the upper-right corner, click the Set context for all namespaces toggle. The indicator in the left-side navigation panel under Namespaces changes to All Namespaces.
Click Services to view your services.

Filter the view by namespace:

In the left-side navigation panel, click Namespaces.
Hover over the blue namespace and click Set Context. The indicator in the left-side navigation panel under Namespaces changes to blue.
Click Services to view the app-service-blue service. Note that the app-service-green service does not display.

Perform the foregoing steps on the green namespace to view only the services deployed in the green namespace.

Join Nodes¶

Set up high availability¶

MKE is designed to facilitate high availability (HA). You can join multiple manager nodes to the cluster, so that if one manager node fails, another one can automatically take its place without impacting the cluster.

Including multiple manager nodes in your cluster allows you to handle manager node failures and load-balance user requests across all manager nodes.

The following table exhibits the relationship between the number of manager nodes used and the number of faults that your cluster can tolerate:

Manager nodes	Failures tolerated
1	0
3	1
5	2
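
The values in the table follow from majority quorum: the cluster remains operational as long as more than half of the manager nodes are available. The following minimal sketch reproduces the table rows. The function names are hypothetical.

```python
def quorum(managers):
    """Smallest number of managers that forms a majority."""
    if managers < 1:
        raise ValueError("a cluster needs at least one manager node")
    return managers // 2 + 1


def failures_tolerated(managers):
    """Number of managers that can fail while a majority remains."""
    return managers - quorum(managers)


table = {n: failures_tolerated(n) for n in (1, 3, 5)}
```

Under the same rule, an even number of manager nodes adds no tolerance over the next lower odd number: four managers tolerate one failure, the same as three.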

For deployment into production environments, follow these best practices:

For HA with minimal network overhead, Mirantis recommends using three manager nodes and a maximum of five. Adding more manager nodes than this can lead to performance degradation, as configuration changes must be replicated across all manager nodes.
You should bring failed manager nodes back online as soon as possible, as each failed manager node decreases the number of failures that your cluster can tolerate.
You should distribute your manager nodes across different availability zones. This way your cluster can continue working even if an entire availability zone goes down.

Join Linux nodes¶

MKE allows you to add or remove nodes from your cluster as your needs change over time.

Because MKE leverages the clustering functionality provided by Mirantis Container Runtime (MCR), you use the docker swarm join command to add more nodes to your cluster. When you join a new node, MCR services start running on the node automatically.

You can add both Linux manager and worker nodes to your cluster.

Join a node to the cluster¶

Important

Prior to adding a node that was previously a part of the same MKE cluster or a different one, you must run the following command to remove any stale MKE volumes:

docker volume rm `docker volume list --filter name=ucp* -q`

Next, run the following command to verify the removal of the stale volumes:

docker volume list --filter name=ucp*

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes.
Click Add Node.
Select Linux for the node type.
Select either Manager or Worker, as required.
Optional. Select Use a custom listen address to specify the address and port where the new node listens for inbound cluster management traffic.
Optional. Select Use a custom advertise address to specify the IP address that is advertised to all members of the cluster for API access.
Copy the displayed command, which looks similar to the following:
```
docker swarm join --token <token> <mke-node-ip>
```
Use SSH to log in to the host that you want to join to the cluster.
Run the docker swarm join command captured previously.

The node will display in the Shared Resources > Nodes page.

Pause or drain a node¶

Note

You can pause or drain only those nodes that run Swarm workloads.

You can configure the availability of a node so that it is in one of the following three states:

Active: The node can receive and execute tasks.
Paused: The node continues running existing tasks, but does not receive new tasks.
Drained: Existing tasks are stopped, and replacement tasks are launched on active nodes. The node does not receive new tasks.

To pause or drain a node:

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.
In the Details pane, click Configure and select Details to open the Edit Node page.
In the upper right, select the Edit Node icon.
In the Availability section, click Active, Pause, or Drain.
Click Save.

Promote or demote a node¶

You can promote worker nodes to managers to make MKE fault tolerant. You can also demote a manager node into a worker node.

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.
In the upper right, select the Edit Node icon.
In the Role section, click Manager or Worker.
Click Save and wait until the operation completes.
Navigate to Shared Resources > Nodes and verify the new node role.

Note

If you are load balancing user requests to MKE across multiple manager nodes, you must remove these nodes from the load-balancing pool when demoting them to workers.

Remove a node from the cluster¶

To remove an inaccessible worker node or one that is down:

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.
In the upper right, select the vertical ellipsis and click Remove.
When prompted, click Confirm.

To remove an inactive worker node:

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.
Drain the node, to ensure that the workload is scheduled to another node.
Click the vertical ellipsis in the upper right and select Force Remove.
When prompted, click Confirm.

To remove a manager node:

Verify that all nodes in the cluster are healthy.

Warning

Do not remove the manager node unless all nodes in the cluster are healthy.
Demote the manager to a worker node.
Remove the newly-demoted worker from the cluster, as described in the preceding steps.

Join Windows worker nodes¶

MKE allows you to add or remove nodes from your cluster as your needs change over time.

MKE supports running worker nodes on Windows Server. You must run all manager nodes on Linux.

Windows nodes limitations¶

The following features are not yet supported using Windows Server:

Category	Feature
Networking	Encrypted networks are not supported. If you have upgraded from a previous version of MKE, you will need to recreate an unencrypted version of the `ucp-hrm` network.
Secrets	When using secrets with Windows services, Windows stores temporary secret files on your disk. You can use BitLocker on the volume containing the Docker root directory to encrypt the secret data at rest. When creating a service that uses Windows containers, the options to specify UID, GID, and mode are not supported for secrets. Secrets are only accessible by administrators and users with system access within the container.
Mounts	On Windows, Docker cannot listen on a Unix socket. Use TCP or a named pipe instead.

Configure the Docker daemon for Windows nodes¶

Note

If the cluster is deployed in a site that is offline, sideload MKE images onto the Windows Server nodes. For more information, refer to Install MKE offline.

On a manager node, list the images that are required on Windows nodes:

docker container run --rm -v /var/run/docker.sock:/var/run/docker.sock mirantis/ucp:3.6.16 images --list --enable-windows

Example output:

mirantis/ucp-agent-win:3.6.16
mirantis/ucp-dsinfo-win:3.6.16

Pull the required images. For example:

docker image pull mirantis/ucp-agent-win:3.6.16
docker image pull mirantis/ucp-dsinfo-win:3.6.16

Join Windows nodes to the cluster¶

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to Shared Resources > Nodes.
Click Add Node.
Select Windows for the node type.
Optional. Select Use a custom listen address to specify the address and port where the new node listens for inbound cluster management traffic.
Optional. Select Use a custom advertise address to specify the IP address that is advertised to all members of the cluster for API access.
Copy the displayed command, which looks similar to the following:
```
docker swarm join --token <token> <mke-worker-ip>
```
Alternatively, you can use the command line to obtain the join token. Using your MKE client bundle, run:
```
docker swarm join-token worker
```
Run the docker swarm join command captured in the previous step on each instance of Windows Server that will be a worker node.

Use a load balancer¶

After joining multiple manager nodes for high availability (HA), you can configure your own load balancer to balance user requests across all manager nodes.

Use of a load balancer allows users to access MKE using a centralized domain name. The load balancer can detect when a manager node fails and stop forwarding requests to that node, so that users are unaffected by the failure.

Configure load balancing on MKE¶

Because MKE uses TLS, do the following when configuring your load balancer:
- Load-balance TCP traffic on ports 443 and 6443.
- Do not terminate HTTPS connections.
- On each manager node, use the /_ping endpoint to verify whether the node is healthy and whether or not it should remain in the load balancing pool.
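
To see what the health check in the last item involves, the following minimal sketch probes /_ping on each manager node and keeps only those that respond. It assumes that an HTTP 200 response means the node is healthy, and it disables certificate verification because it connects to manager IP addresses directly. The function names are hypothetical.

```python
import http.client
import ssl
import urllib.request


def ping_url(manager_host, port=443):
    """Build the health check URL for one manager node."""
    return f"https://{manager_host}:{port}/_ping"


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the request fails."""
    # Verification is turned off for this sketch only; a real check
    # should trust the CA that issued the MKE server certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
            return response.status
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error responses.
        return None


def healthy_managers(hosts, fetch=fetch_status):
    """Return the hosts that should stay in the load-balancing pool."""
    return [host for host in hosts if fetch(ping_url(host)) == 200]
```

The `fetch` parameter exists so that the selection logic can be exercised without network access, for example with a stub that returns fixed status codes.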

Use the following examples to configure your load balancer for MKE:

NGINX

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
   worker_connections  1024;
}

stream {
   upstream ucp_443 {
      server <UCP_MANAGER_1_IP>:443 max_fails=2 fail_timeout=30s;
      server <UCP_MANAGER_2_IP>:443 max_fails=2 fail_timeout=30s;
      server <UCP_MANAGER_N_IP>:443 max_fails=2 fail_timeout=30s;
   }
   server {
      listen 443;
      proxy_pass ucp_443;
   }
   upstream ucp_6443 {
      server <UCP_MANAGER_1_IP>:6443 max_fails=2 fail_timeout=30s;
      server <UCP_MANAGER_2_IP>:6443 max_fails=2 fail_timeout=30s;
      server <UCP_MANAGER_N_IP>:6443 max_fails=2 fail_timeout=30s;
   }
   server {
      listen 6443;
      proxy_pass ucp_6443;
   }
}

HAProxy

global
   log /dev/log    local0
   log /dev/log    local1 notice

defaults
   mode    tcp
   option  dontlognull
   timeout connect     5s
   timeout client      50s
   timeout server      50s
   timeout tunnel      1h
   timeout client-fin  50s

### frontends
# Optional HAProxy Stats Page accessible at http://<host-ip>:8181/haproxy?stats
frontend ucp_stats
   mode http
   bind 0.0.0.0:8181
   default_backend ucp_stats
frontend ucp_443
   mode tcp
   bind 0.0.0.0:443
   default_backend ucp_upstream_servers_443
frontend ucp_6443
   mode tcp
   bind 0.0.0.0:6443
   default_backend ucp_upstream_servers_6443

### backends
backend ucp_stats
   mode http
   option httplog
   stats enable
   stats admin if TRUE
   stats refresh 5m
backend ucp_upstream_servers_443
   mode tcp
   option httpchk GET /_ping HTTP/1.1\r\nHost:\ <UCP_FQDN>
   server node01 <UCP_MANAGER_1_IP>:443 weight 100 check check-ssl verify none
   server node02 <UCP_MANAGER_2_IP>:443 weight 100 check check-ssl verify none
   server node03 <UCP_MANAGER_N_IP>:443 weight 100 check check-ssl verify none
backend ucp_upstream_servers_6443
   mode tcp
   option httpchk GET /_ping HTTP/1.1\r\nHost:\ <UCP_FQDN>
   server node01 <UCP_MANAGER_1_IP>:6443 weight 100 check check-ssl verify none
   server node02 <UCP_MANAGER_2_IP>:6443 weight 100 check check-ssl verify none
   server node03 <UCP_MANAGER_N_IP>:6443 weight 100 check check-ssl verify none

AWS LB

{
   "Subnets": [
      "subnet-XXXXXXXX",
      "subnet-YYYYYYYY",
      "subnet-ZZZZZZZZ"
   ],
   "CanonicalHostedZoneNameID": "XXXXXXXXXXX",
   "CanonicalHostedZoneName": "XXXXXXXXX.us-west-XXX.elb.amazonaws.com",
   "ListenerDescriptions": [
      {
         "Listener": {
            "InstancePort": 443,
            "LoadBalancerPort": 443,
            "Protocol": "TCP",
            "InstanceProtocol": "TCP"
         },
         "PolicyNames": []
      },
      {
         "Listener": {
            "InstancePort": 6443,
            "LoadBalancerPort": 6443,
            "Protocol": "TCP",
            "InstanceProtocol": "TCP"
         },
         "PolicyNames": []
      }
   ],
   "HealthCheck": {
      "HealthyThreshold": 2,
      "Interval": 10,
      "Target": "HTTPS:443/_ping",
      "Timeout": 2,
      "UnhealthyThreshold": 4
   },
   "VPCId": "vpc-XXXXXX",
   "BackendServerDescriptions": [],
   "Instances": [
      {
         "InstanceId": "i-XXXXXXXXX"
      },
      {
         "InstanceId": "i-XXXXXXXXX"
      },
      {
         "InstanceId": "i-XXXXXXXXX"
      }
   ],
   "DNSName": "XXXXXXXXXXXX.us-west-2.elb.amazonaws.com",
   "SecurityGroups": [
      "sg-XXXXXXXXX"
   ],
   "Policies": {
      "LBCookieStickinessPolicies": [],
      "AppCookieStickinessPolicies": [],
      "OtherPolicies": []
   },
   "LoadBalancerName": "ELB-UCP",
   "CreatedTime": "2017-02-13T21:40:15.400Z",
   "AvailabilityZones": [
      "us-west-2c",
      "us-west-2a",
      "us-west-2b"
   ],
   "Scheme": "internet-facing",
   "SourceSecurityGroup": {
      "OwnerAlias": "XXXXXXXXXXXX",
      "GroupName": "XXXXXXXXXXXX"
   }
}

Create either the nginx.conf or haproxy.cfg file, as required.

For instructions on deploying with AWS LB, refer to Getting Started with Network Load Balancers in the AWS documentation.

Deploy the load balancer:

NGINX

docker run --detach \
--name ucp-lb \
--restart=unless-stopped \
--publish 443:443 \
--publish 6443:6443 \
--volume ${PWD}/nginx.conf:/etc/nginx/nginx.conf:ro \
nginx:stable-alpine

HAProxy

docker run --detach \
--name ucp-lb \
--publish 443:443 \
--publish 6443:6443 \
--publish 8181:8181 \
--restart=unless-stopped \
--volume ${PWD}/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro \
haproxy:1.7-alpine haproxy -d -f /usr/local/etc/haproxy/haproxy.cfg

Load balancing MKE and MSR together¶

By default, both MKE and Mirantis Secure Registry (MSR) use port 443. If you plan to deploy MKE and MSR together, your load balancer must distinguish traffic between the two by IP address or port number.

If you want MKE and MSR both to use port 443, then you must either use separate load balancers for each or use two virtual IPs. Otherwise, you must configure your load balancer to expose MKE or MSR on a port other than 443.

Use two-factor authentication¶

Two-factor authentication (2FA) adds an extra layer of security when logging in to the MKE web UI. Once enabled, 2FA requires the user to submit an additional authentication code generated on a separate mobile device along with their user name and password at login.

Configure 2FA¶

MKE 2FA requires the use of a time-based one-time password (TOTP) application installed on a mobile device to generate a time-based authentication code for each login to the MKE web UI. Examples of such applications include 1Password, Authy, and LastPass Authenticator.

To configure 2FA:

Install a TOTP application to your mobile device.
In the MKE web UI, navigate to My Profile > Security.
Toggle the Two-factor authentication control to enabled.
Open the TOTP application and scan the offered QR code. The device will display a six-digit code.
Enter the six-digit code in the offered field and click Register. The TOTP application will save your MKE account.

Important

A set of recovery codes displays in the MKE web UI when two-factor authentication is enabled. Save these codes in a safe location, as they can be used to access the MKE web UI if for any reason the configured mobile device becomes unavailable. Refer to Recover 2FA for details.

Access MKE using 2FA¶

Once 2FA is enabled, you will need to provide an authentication code each time you log in to the MKE web UI. Typically, the TOTP application installed on your mobile device generates the code and refreshes it every 30 seconds.
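
To illustrate why the code changes every 30 seconds, the following minimal sketch computes a time-based one-time password as defined in RFC 6238. It assumes the common defaults of HMAC-SHA-1, a 30-second time step, and six digits. Whether MKE uses exactly these parameters is an assumption, and the function name is hypothetical.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step_seconds=30, digits=6):
    """Compute an RFC 6238 code for the time step that contains unix_time."""
    # All timestamps inside the same step map to the same counter value.
    counter = int(unix_time) // step_seconds
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation from RFC 4226: pick 4 bytes at a digest-derived offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because the counter is derived from the clock, the device that generates the code and the server that verifies it must agree on the current time to within the step size.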

Access the MKE web UI with 2FA enabled:

In the MKE web UI, click Sign in. The Sign in page will display.
Enter a valid user name and password.
Access the MKE code in the TOTP application on your mobile device.
Enter the current code in the 2FA Code field in the MKE web UI.

Note

Multiple authentication failures may indicate a lack of synchronization between the mobile device clock and the mobile provider.

Disable 2FA¶

Mirantis strongly recommends using 2FA to secure MKE accounts. If you need to temporarily disable 2FA, re-enable it as soon as possible.

To disable 2FA:

In the MKE web UI, navigate to My Profile > Security.
Toggle the Two-factor authentication control to disabled.

Recover 2FA¶

If the mobile device with authentication codes is unavailable, you can re-access MKE using any of the recovery codes that display in the MKE web UI when 2FA is first enabled.

To recover 2FA:

Enter one of the recovery codes when prompted for the two-factor authentication code upon login to the MKE web UI.
Navigate to My Profile > Security.
Disable 2FA and then re-enable it.
Open the TOTP application and scan the offered QR code. The device will display a six-digit code.
Enter the six-digit code in the offered field and click Register. The TOTP application will save your MKE account.

If there are no recovery codes to draw from, ask your system administrator to disable 2FA in order to regain access to the MKE web UI. Once done, repeat the Configure 2FA procedure to reinstate 2FA protection.

MKE administrators are not able to re-enable 2FA for users.

Account lockout¶

You can configure MKE so that a user account is temporarily blocked from logging in following a series of unsuccessful login attempts. The account lockout feature only prevents login attempts that are made using basic authorization or LDAP. Login attempts using either SAML or OIDC do not trigger the account lockout feature. Admin accounts are never locked.

Account lockouts expire after a set amount of time, after which the affected user can log in as normal. Subsequent login attempts on a locked account do not extend the lockout period. Login attempts against a locked account always cause a standard incorrect credentials error, providing no indication to the user that the account is locked. Only MKE admins can see account lockout status.

Configure account lockout functionality¶

Obtain the current MKE configuration file for your cluster.
Set the following parameters in the auth.account_lock section of the MKE configuration file:
- Set the value of enabled to true.
- Set the value of failureTriggers to the number of failed login attempts that can be made before an account is locked.
- Set the value of durationSeconds to the desired lockout duration, in seconds. A value of 0 indicates that the account will remain locked until it is unlocked by an administrator.
Upload the new MKE configuration file.
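
The edit in the second step can be scripted. The following is a minimal sketch, not an official tool: it assumes that the downloaded file contains an [auth.account_lock] table with each of the three keys on its own line, and the helper name set_account_lock and the values 5 and 600 are purely illustrative.

```shell
#!/bin/sh
# Sketch: rewrite the account lockout keys inside the [auth.account_lock]
# table of a downloaded MKE configuration file.
# Assumption: each key sits on its own line under an [auth.account_lock] header.
set_account_lock() (
  file=$1 triggers=$2 seconds=$3
  awk -v t="$triggers" -v s="$seconds" '
    # Track whether the current table header is [auth.account_lock].
    /^[ \t]*\[/ { in_lock = ($0 ~ /^[ \t]*\[auth\.account_lock\][ \t]*$/) }
    in_lock && /^[ \t]*enabled[ \t]*=/         { sub(/=.*/, "= true") }
    in_lock && /^[ \t]*failureTriggers[ \t]*=/ { sub(/=.*/, "= " t) }
    in_lock && /^[ \t]*durationSeconds[ \t]*=/ { sub(/=.*/, "= " s) }
    { print }
  ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
)

# Usage, reusing the token and config-toml calls shown in the
# "To configure OpsCare using the CLI" procedure of this guide:
#   curl --silent --insecure -H "Authorization: Bearer $AUTHTOKEN" \
#     "https://$MKE_HOST/api/ucp/config-toml" > ucp-config.toml
#   set_account_lock ucp-config.toml 5 600   # 5 failures, 600-second lockout
#   curl --silent --insecure -X PUT -H "Authorization: Bearer $AUTHTOKEN" \
#     --upload-file ucp-config.toml "https://$MKE_HOST/api/ucp/config-toml"
```

Keys with the same names in other tables are left untouched, because the substitution applies only while the parser is inside the [auth.account_lock] table.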

Note

You can verify the lockout status of your organization accounts by issuing a GET request to the /accounts endpoint.

Unlock an account¶

A locked account remains locked until the configured lockout duration has elapsed. To restore access sooner, you must either have an administrator unlock the account or globally disable the account lockout feature.

To unlock a locked account:

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to Access Control > Users and select the user who is locked out of their account.
Click the gear icon in the upper right corner.
Navigate to the Security tab.

Note

An expired account lock only resets once a new login attempt is made. Until then, the account will present as locked to administrators.
Click the Unlock account button.

To globally disable the account lockout feature:

Obtain the current MKE configuration file for your cluster.
In the auth.account_lock section of the MKE configuration file, set the value of enabled to false.
Upload the new MKE configuration file.

Configure and use OpsCare¶

Any time there is an issue with your cluster, OpsCare routes notifications from your MKE deployment to Mirantis support engineers, who then either resolve the problem directly or arrange to troubleshoot the matter with you.

For more information, refer to Mirantis OpsCare Plus, OpsCare & LabCare.

Configure OpsCare¶

To configure OpsCare you must first obtain a Salesforce username, password, and environment ID from your Mirantis Customer Success Manager. You then store these credentials as Swarm secrets using the following naming convention:

User name: sfdc_opscare_api_username
Password: sfdc_opscare_api_password
Environment ID: sfdc_environment_id

Note

Every cluster that uses OpsCare must have its own unique sfdc_environment_id.
OpsCare requires that MKE has access to mirantis.my.salesforce.com on port 443.
Any custom certificates in use must contain all of the manager node private IP addresses.
The provided Salesforce credentials are not associated with the Mirantis support portal login, but are for OpsCare alerting only.

To configure OpsCare using the CLI:

Download and configure the client bundle.

Create secrets for your Salesforce login credentials:

printf "<username-obtained-from-csm>" | docker secret create sfdc_opscare_api_username -
printf "<password-obtained-from-csm>" | docker secret create sfdc_opscare_api_password -
printf "<environment-id-obtained-from-csm>" | docker secret create sfdc_environment_id -

Enable OpsCare:

MKE_USERNAME=<mke-username>
MKE_PASSWORD=<mke-password>
MKE_HOST=<mke-host>

AUTHTOKEN=$(curl --silent --insecure --data "{\"username\":\"$MKE_USERNAME\",\"password\":\"$MKE_PASSWORD\"}" https://$MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --silent --insecure -X GET "https://$MKE_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > ucp-config.toml
sed -i 's/ops_care = false/ops_care = true/' ucp-config.toml
curl --silent --insecure -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file './ucp-config.toml' https://$MKE_HOST/api/ucp/config-toml
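
Before enabling OpsCare, it can help to confirm that all three secrets exist. The following is a minimal sketch under the naming convention given above; the helper name missing_opscare_secrets is hypothetical, and it reads secret names from standard input so that it can be fed from docker secret ls.

```shell
#!/bin/sh
# Sketch: report which of the three OpsCare secrets are missing.
# Reads one secret name per line on stdin; returns non-zero if any is absent.
missing_opscare_secrets() (
  names=$(cat)
  status=0
  for required in sfdc_opscare_api_username sfdc_opscare_api_password sfdc_environment_id; do
    # -x matches the whole line, so partial name matches do not count.
    if ! printf '%s\n' "$names" | grep -qx "$required"; then
      echo "missing secret: $required"
      status=1
    fi
  done
  exit "$status"
)

# Usage with the client bundle loaded:
#   docker secret ls --format '{{.Name}}' | missing_opscare_secrets
```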

To configure OpsCare using the MKE web UI:

Log in to the MKE web UI.
Using the left-side navigation panel, navigate to <username> > Admin Settings > Usage.
In the Salesforce Username field, enter your Salesforce user name.
Next, enter your Salesforce password and Salesforce environment ID.
Click Create Secrets.
Under OpsCare Settings, toggle the Enable OpsCare slider to the right.
Click Save.

Manage Salesforce alerts¶

OpsCare uses a predefined group of MKE alerts to notify your Customer Success Manager of problems with your deployment. This alert group is identical to the one used in any MKE cluster that is provisioned by Mirantis Container Cloud. A single watchdog alert serves to verify the proper function of the OpsCare alert pipeline as a whole.

To verify that the OpsCare alerts are functioning properly:

Log in to Salesforce.
Navigate to Cases and verify that the watchdog alert is present. It presents as Watchdog alert. It is always firing.

Disable OpsCare¶

You must disable OpsCare before you can delete the three secrets in use.

To disable OpsCare:

Log in to the MKE web UI.
Using the left-side navigation panel, navigate to <username> > Admin Settings > Usage.
Toggle the Enable OpsCare slider to the left.

Alternatively, you can disable OpsCare by changing the ops_care entry in the MKE configuration file to false.
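
The configuration file route can be sketched as follows. This is an illustration only: it assumes the entry is written exactly as ops_care = true, matching the form used by the sed command in the enable procedure, and the helper name disable_ops_care is hypothetical.

```shell
#!/bin/sh
# Sketch: flip ops_care to false in a downloaded MKE configuration file.
# Assumption: the entry appears exactly as "ops_care = true".
disable_ops_care() {
  sed 's/ops_care = true/ops_care = false/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# After uploading the edited file and confirming that OpsCare is disabled,
# the three secrets can be removed:
#   docker secret rm sfdc_opscare_api_username sfdc_opscare_api_password sfdc_environment_id
```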

Configure cluster and service networking in an existing cluster¶

On systems that use the managed CNI, you can switch existing clusters to either kube-proxy with ipvs proxier or eBPF mode.

MKE does not support switching kube-proxy in an existing cluster from ipvs proxier to iptables proxier, nor does it support disabling eBPF mode after it has been enabled. Using a CNI that supports both cluster and service networking requires that you disable kube-proxy.

Refer to Cluster and service networking options in the MKE Installation Guide for information on how to configure cluster and service networking at install time.

Caution

The configuration changes described here cannot be reversed. As such, Mirantis recommends that you make a cluster backup, drain your workloads, and take your cluster offline prior to performing any of these changes.

Caution

Swarm workloads that require the use of encrypted overlay networks must use iptables proxier. Be aware that the other networking options detailed here automatically disable Docker Swarm encrypted overlay networks.

To switch an existing cluster to kube-proxy with ipvs proxier while using the managed CNI:

Obtain the current MKE configuration file for your cluster.
Set kube_proxy_mode to "ipvs".
Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

Verify that the following values are set in your MKE configuration file:

unmanaged_cni = false
calico_ebpf_enabled = false
kube_default_drop_masq_bits = false
kube_proxy_mode = "ipvs"
kube_proxy_no_cleanup_on_start = false

Verify that the ucp-kube-proxy container logs on all nodes contain the following:

KUBE_PROXY_MODE (ipvs) CLEANUP_ON_START_DISABLED false
Performing cleanup
kube-proxy cleanup succeeded
Actually starting kube-proxy....

Obtain the current MKE configuration file for your cluster.
Set kube_proxy_no_cleanup_on_start to true.
Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.
Reboot all nodes.

Verify that the following values are set in your MKE configuration file and that your cluster is in a healthy state with all nodes ready:

unmanaged_cni = false
calico_ebpf_enabled = false
kube_default_drop_masq_bits = false
kube_proxy_mode = "ipvs"
kube_proxy_no_cleanup_on_start = true

Verify that the ucp-kube-proxy container logs on all nodes contain the following:

KUBE_PROXY_MODE (ipvs) CLEANUP_ON_START_DISABLED true
Actually starting kube-proxy....
.....
I1111 02:41:05.559641     1 server_others.go:274] Using ipvs Proxier.
W1111 02:41:05.559951     1 proxier.go:445] IPVS scheduler not specified, use rr by default

Optional. Configure the following ipvs-related parameters in the MKE configuration file (otherwise, MKE will use the Kubernetes default parameter settings):
- ipvs_exclude_cidrs = ""
- ipvs_min_sync_period = ""
- ipvs_scheduler = ""
- ipvs_strict_arp = false
- ipvs_sync_period = ""
- ipvs_tcp_timeout = ""
- ipvs_tcpfin_timeout = ""
- ipvs_udp_timeout = ""
For more information on using these parameters, refer to kube-proxy in the Kubernetes documentation.
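
The configuration checks in the preceding procedure can be scripted against a downloaded copy of the MKE configuration file. The following is a minimal sketch; the helper name check_config_values is hypothetical, and it assumes that each setting appears on its own line in key = value form.

```shell
#!/bin/sh
# Sketch: confirm that a downloaded MKE configuration file contains each
# expected "key = value" line. Prints every expectation that is not met.
check_config_values() (
  file=$1; shift
  status=0
  for expected in "$@"; do
    key=${expected%% =*}
    # Compare whole lines, ignoring leading indentation and trailing spaces.
    if ! sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$file" | grep -qxF "$expected"; then
      echo "not set as expected: $key"
      status=1
    fi
  done
  exit "$status"
)

# Final state for kube-proxy with the ipvs proxier:
#   check_config_values ucp-config.toml \
#     'unmanaged_cni = false' 'calico_ebpf_enabled = false' \
#     'kube_default_drop_masq_bits = false' 'kube_proxy_mode = "ipvs"' \
#     'kube_proxy_no_cleanup_on_start = true'
```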

To switch an existing cluster to eBPF mode while using the managed CNI:

Verify that the prerequisites for eBPF use have been met, including kernel compatibility, for all Linux manager and worker nodes. Refer to the Calico documentation Enable the eBPF dataplane for more information.
Obtain the current MKE configuration file for your cluster.
Set kube_default_drop_masq_bits to true.
Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

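Verify that the ucp-kube-proxy container started on all nodes, that the kube-proxy cleanup took place, and that ucp-kube-proxy launched kube-proxy. The following command lists the nodes whose ucp-kube-proxy logs do not yet report a successful cleanup: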

for cont in $(docker ps -a|rev | cut -d' ' -f 1 | rev|grep ucp-kube-proxy); \
do nodeName=$(echo $cont|cut -d '/' -f1); \
docker logs $cont 2>/dev/null|grep -q 'kube-proxy cleanup succeeded'; \
if [ $? -ne 0 ]; \
then echo $nodeName; \
fi; \
done|sort

Expected output in the ucp-kube-proxy logs:

KUBE_PROXY_MODE (iptables) CLEANUP_ON_START_DISABLED false
Performing cleanup
kube-proxy cleanup succeeded
Actually starting kube-proxy....

Note

If the number of nodes listed by the command does not quickly drop to 0, check the ucp-kube-proxy logs on the nodes where either of the following took place:

The ucp-kube-proxy container did not launch.
The kube-proxy cleanup did not happen.

Reboot all nodes.
Obtain the current MKE configuration file for your cluster.

Verify that the following values are set in your MKE configuration file:

unmanaged_cni = false
calico_ebpf_enabled = false
kube_default_drop_masq_bits = true
kube_proxy_mode = "iptables"
kube_proxy_no_cleanup_on_start = false

Verify that the ucp-kube-proxy container logs on all nodes contain the following:

KUBE_PROXY_MODE (iptables) CLEANUP_ON_START_DISABLED false
Performing cleanup
....
kube-proxy cleanup succeeded
Actually starting kube-proxy....
....
I1111 03:29:25.048458     1 server_others.go:212] Using iptables Proxier.

Set kube_proxy_mode to "disabled".
Set calico_ebpf_enabled to true.
Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.
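Verify that the ucp-kube-proxy container started on all nodes, that the kube-proxy cleanup took place, and that ucp-kube-proxy did not launch kube-proxy. The following command lists the nodes whose ucp-kube-proxy logs do not yet contain the Sleeping forever message: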
```
for cont in $(docker ps -a|rev | cut -d' ' -f 1 | rev|grep ucp-kube-proxy); \
do nodeName=$(echo $cont|cut -d '/' -f1); \
docker logs $cont 2>/dev/null|grep -q 'Sleeping forever'; \
if [ $? -ne 0 ]; \
then echo $nodeName; \
fi; \
done|sort
```
Expected output in the ucp-kube-proxy logs:
```
KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED false
Performing cleanup
kube-proxy cleanup succeeded
Sleeping forever....
```
Note

If the number of nodes listed by the command does not quickly drop to 0, check the ucp-kube-proxy logs on the nodes where either of the following took place:
- The ucp-kube-proxy container did not launch.
- The ucp-kube-proxy container launched kube-proxy.
Obtain the current MKE configuration file for your cluster.

Verify that the following values are set in your MKE configuration file:

unmanaged_cni = false
calico_ebpf_enabled = true
kube_default_drop_masq_bits = true
kube_proxy_mode = "disabled"
kube_proxy_no_cleanup_on_start = false

Set kube_proxy_no_cleanup_on_start to true.
Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

Verify that the following values are set in your MKE configuration file and that your cluster is in a healthy state with all nodes ready:

unmanaged_cni = false
calico_ebpf_enabled = true
kube_default_drop_masq_bits = true
kube_proxy_mode = "disabled"
kube_proxy_no_cleanup_on_start = true

Verify that eBPF mode is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:
```
KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED true
"Sleeping forever...."
```
Verify that you can SSH into all nodes.
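
When reviewing many nodes, it may be convenient to reduce each ucp-kube-proxy log to a single state. The following is a minimal sketch that relies only on the marker lines quoted in the expected log outputs of this procedure; the helper name kube_proxy_state and the three state labels are hypothetical.

```shell
#!/bin/sh
# Sketch: classify ucp-kube-proxy log text read from stdin, using the marker
# lines quoted in the expected log outputs.
kube_proxy_state() (
  log=$(cat)
  case $log in
    *"Sleeping forever"*)             echo "disabled" ;;  # kube-proxy was not launched
    *"Actually starting kube-proxy"*) echo "running" ;;   # kube-proxy was launched
    *)                                echo "unknown" ;;   # neither marker found
  esac
)

# Usage, with $cont set to a container name as in the loops above:
#   docker logs "$cont" 2>/dev/null | kube_proxy_state
```

Once eBPF mode is fully enabled, every node is expected to report disabled.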

Schedule image pruning¶

MKE administrators can schedule the cleanup of unused images, whitelisting which images to keep. To determine which images will be removed, they can perform a dry run prior to setting the image-pruning schedule.

Schedule image pruning using the CLI¶

To perform a dry run without whitelisting any images:

Perform a dry run to determine which images will be pruned:

AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/images/prune/dry

Example response:

[
   {
      "Containers":-1,
      "Created":1647029986,
      "Id":"sha256:2fb6fc2d97e10c79983aa10e013824cc7fc8bae50630e32159821197dda95fe3",
      "Labels":null,
      "ParentId":"",
      "RepoDigests":[
         "busybox@sha256:caa382c432891547782ce7140fb3b7304613d3b0438834dce1cad68896ab110a"
      ],
      "RepoTags":[
         "busybox:latest"
      ],
      "SharedSize":-1,
      "Size":1239748,
      "VirtualSize":1239748
   }
]
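
For clusters with many unused images, the raw response can be hard to scan. The following is a minimal sketch that condenses it with jq, which the commands above already use; the helper name summarize_prune_dry_run is hypothetical, and only the Id, RepoTags, and Size fields shown in the example response are read.

```shell
#!/bin/sh
# Sketch: condense a prune dry-run response (JSON on stdin) into one
# tab-separated line per image: ID, comma-joined tags, size in bytes.
summarize_prune_dry_run() {
  # RepoTags may be null for untagged images, so fall back to an empty list.
  jq --raw-output '.[] | [.Id, ((.RepoTags // []) | join(",")), (.Size | tostring)] | @tsv'
}

# Usage:
#   curl --silent --insecure -H "Authorization: Bearer $AUTHTOKEN" \
#     https://MKE_HOST/api/ucp/images/prune/dry | summarize_prune_dry_run
```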

To perform a dry run with whitelisted images:

Obtain the current MKE configuration file for your cluster.
Whitelist the images that should not be removed.

Note

Where possible, use the image ID to specify the image rather than the image name.

For example:
```
[[cluster_config.image_prune_whitelist]]
  key = "label"
  value = "<label-value>"

[[cluster_config.image_prune_whitelist]]
  key = "before"
  value = "<image-id>"
```
Refer to cluster_config.image_prune_whitelist (optional) for more information.
Upload the new MKE configuration file.

Perform a dry run to determine which images will be pruned:

AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/images/prune/dry

Example response:

[
   {
      "Containers":-1,
      "Created":1647029986,
      "Id":"sha256:2fb6fc2d97e10c79983aa10e013824cc7fc8bae50630e32159821197dda95fe3",
      "Labels":null,
      "ParentId":"",
      "RepoDigests":[
         "busybox@sha256:caa382c432891547782ce7140fb3b7304613d3b0438834dce1cad68896ab110a"
      ],
      "RepoTags":[
         "busybox:latest"
      ],
      "SharedSize":-1,
      "Size":1239748,
      "VirtualSize":1239748
   }
]

To schedule image pruning:

Obtain the current MKE configuration file for your cluster.
Optional. Whitelist the images that should not be removed, if you have not already done so.

Note

Where possible, use the image ID to specify the image rather than the image name.

For example:
```
[[cluster_config.image_prune_whitelist]]
  key = "label"
  value = "<label-value>"

[[cluster_config.image_prune_whitelist]]
  key = "before"
  value = "<image-id>"
```
Refer to cluster_config.image_prune_whitelist (optional) for more information.
Set the value of image_prune_schedule to the desired cron schedule. Refer to cluster_config table (required) for more information.

The following example schedules image pruning for every day at midnight:
```
[cluster_config]

    image_prune_schedule = "0 0 0 * * *"
```
Upload the new MKE configuration file.

Schedule image pruning using the MKE web UI¶

Log in to the MKE web UI as an administrator.
From the left-side navigation panel, navigate to <user name> > Admin Settings > Tuning and scroll to Image pruning config.
Enter the desired pruning schedule.
Optional. Select the desired whitelist rules.
Optional. Test your image pruning configuration by clicking Start a dry run under Test configuration.

Manage etcd¶

etcd is a consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It handles leader elections during network partitions and can tolerate machine failure, even in the leader node.

For MKE, etcd serves as the Kubernetes backing store for all cluster data, with an etcd replica deployed on each MKE manager node. This is a primary reason why Mirantis recommends that you deploy an odd number of MKE manager nodes, as etcd uses the Raft consensus algorithm and thus requires that a quorum of nodes agree on any updates to the cluster state.

Configure etcd storage quota¶

You can control the etcd distributed key-value storage quota using the etcd_storage_quota parameter in the MKE configuration file. By default, the value of the parameter is 2GB.

Note

MKE may cease to function if etcd exceeds the set quota. As such, Mirantis recommends either cleaning etcd or increasing the key-value storage quota when the size of the database approaches 85% of the quota, the point at which MKE starts presenting warning banners.

For information on how to adjust the parameter, refer to Configure an MKE cluster.
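
The 85% threshold is simple to evaluate in a monitoring script. The following is a minimal sketch using integer shell arithmetic; the helper names are hypothetical, both arguments are assumed to be byte counts, and how you obtain the current database size is left open.

```shell
#!/bin/sh
# Sketch: express etcd database usage as a whole-number percentage of the
# quota, and test it against the 85% point at which warning banners start.
etcd_quota_percent() {
  # $1 = database size in bytes, $2 = quota in bytes (integer division)
  echo $(( $1 * 100 / $2 ))
}

etcd_quota_warning() {
  # Succeeds when usage is at or above 85% of the quota.
  [ "$(etcd_quota_percent "$1" "$2")" -ge 85 ]
}

# Example with illustrative numbers: 1.8 GB used of a 2 GB quota.
#   etcd_quota_warning 1800000000 2000000000 && echo "clean etcd or raise the quota"
```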

Apply etcd defragmentation¶

The etcd distributed key-value store retains a history of its keyspace. That history is compacted after a specified number of revisions; however, etcd releases the freed space back to the host filesystem only after defragmentation. For more information, refer to the etcd documentation.

With MKE you can defragment the etcd cluster while avoiding cluster outages. To do this, you apply defragmentation to etcd members one at a time. MKE will defragment the current etcd leader last, to prevent the triggering of multiple leader elections.

Important

In a High Availability (HA) cluster, the defragmentation process subtly affects cluster dynamics, because a node temporarily leaves the pool of active nodes while it undergoes defragmentation. The resulting reduction in the active node count proportionally increases the load on the remaining nodes, which can degrade performance if those nodes do not have the capacity to handle the additional load. In addition, at the end of the process, when the leader node undergoes defragmentation, there is a brief period during which cluster write operations do not take place. This pause lasts while the system initiates and completes the leader election process and, though automated and brief, it does result in a momentary write block on the cluster.

Taking these factors into account, Mirantis recommends a cautious approach to scheduling the defragmentation of HA clusters. Ideally, defragmentation should occur during planned maintenance windows rather than through a recurring cron job, as during such windows you can closely monitor potential impacts on performance and availability and mitigate them as necessary.

To defragment the etcd cluster:

Trigger the etcd cluster defragmentation by issuing a POST to the https://MKE_HOST/api/ucp/etcd/defrag endpoint.

You can specify two parameters:

timeoutSeconds
Sets how long MKE waits for each member to finish defragmentation. Default: 60 seconds. MKE will cancel the defragmentation if the timeout occurs before the member defragmentation completes.

pauseSeconds
Sets how long MKE waits between each member defragmentation. Default: 60 seconds.

Mirantis recommends that you adjust these parameters based on the size of the etcd database and the amount of time that has elapsed since the last defragmentation.

Example command:
```
AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/defrag --data '{"timeoutSeconds": 60, "pauseSeconds": 60}'
```
Example response:
```
"Cluster Defragmentation Initiated"
```

Review the state of individual etcd cluster members and the state of the cluster defragmentation by running the following command:

AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/info

Example output:

{
    "DefragInProgress": true,
    "DefragResult": "Cluster Defrag Initiated",
    "MemberInfo": [
        {
            "MemberID": 5051939019959384922,
            "Endpoint": "https://172.31.21.33:12379",
            "EtcdVersion": "3.4.16",
            "DbSize": "2 MB",
            "IsLeader": true,
            "Alarms": null
        },
        {
            "MemberID": 10749614093923491478,
            "Endpoint": "https://172.31.30.179:12379",
            "EtcdVersion": "3.4.16",
            "DbSize": "2 MB",
            "IsLeader": false,
            "Alarms": null
        },
        {
            "MemberID": 7837950661722744517,
            "Endpoint": "https://172.31.30.44:12379",
            "EtcdVersion": "3.4.16",
            "DbSize": "2 MB",
            "IsLeader": false,
            "Alarms": null
        }
    ]
}

You can monitor this endpoint until the defragmentation is complete. The information is also available in the ucp-controller logs.
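
Monitoring the endpoint can be automated with a small polling loop. The following is a minimal sketch: the helper names and the 10-second interval are arbitrary choices, AUTHTOKEN is assumed to be set as in the commands above, and the check relies only on the DefragInProgress field shown in the example output.

```shell
#!/bin/sh
# Sketch: succeed while the etcd info response (JSON on stdin) still reports
# a defragmentation in progress.
defrag_in_progress() {
  grep -Eq '"DefragInProgress"[[:space:]]*:[[:space:]]*true'
}

# Poll until DefragInProgress is no longer true.
# Note: a failed request also ends the loop, so check the final response.
wait_for_defrag() {
  while curl --silent --insecure -H "Authorization: Bearer $AUTHTOKEN" \
      https://MKE_HOST/api/ucp/etcd/info | defrag_in_progress; do
    sleep 10
  done
}
```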

To manually remove the etcd defragmentation lock file:

To maintain etcd cluster availability, MKE uses a lock file to prevent multiple defragmentations from running simultaneously. MKE removes the lock file at the conclusion of defragmentation; however, you can remove it manually if necessary.

Manually remove the lock file by running the following command:

docker exec ucp-controller rm /var/lock/etcd-defrag

etcd alarms response¶

Available since MKE 3.6.10

etcd issues alarms to indicate problems that need to be quickly addressed to ensure uninterrupted function.

NOSPACE alarm¶

A NOSPACE alarm is issued in the event that etcd runs low on storage space, to protect the cluster from further writes. Once this low storage space state is reached, etcd will respond to all write requests with the mvcc: database space exceeded error message until the issue is rectified.

When MKE detects the NOSPACE alarm condition, it displays a critical banner to inform administrators. In addition, MKE restarts etcd with an increased value for the etcd datastore quota, thus allowing administrators to resolve the NOSPACE alarm without interference.

To resolve the NOSPACE alarm:

Identify what data occupies most of the storage space. Be aware that in MKE the recommended etcdctl commands must be run in the ucp-kv container, the instructions for which are available in Troubleshoot the etcd key-value store with the CLI.

If a bug-ridden application is the cause of the unexpected use of storage space, stop that application.
Manually delete the unused data from etcd, if possible.
Apply etcd defragmentation.
If necessary, increase the etcd_storage_quota setting in the cluster_config table of the MKE configuration file.

Note

Contact Mirantis Support if you require assistance in resolving the etcd NOSPACE alarm.

CORRUPT alarm¶

The CORRUPT alarm is issued when a cluster corruption is detected by etcd. MKE cluster administrators are informed of the condition by way of a critical banner. To resolve such an issue, contact Mirantis Support and refer to the official etcd documentation regarding data corruption recovery.

Operate a hybrid Windows cluster¶

Hybrid Windows clusters concurrently run two versions of Windows Server, with one version deployed on one set of nodes and the second version deployed on a different set of nodes. The Windows versions that MKE supports are:

Windows Server 2019, build number 10.0.17763
Windows Server 2022, build number 10.0.20348

For more information on Windows releases and build numbers, refer to Windows container version compatibility.

To learn how to upgrade to Windows Server 2022, refer to Upgrade nodes to Windows Server 2022.

Limitations¶

A Windows Server 2019 node cannot run a container that uses a Windows Server 2022 image.
For a Windows Server 2022 node to run a container that uses a Windows Server 2019 image, you must run the container with Hyper-V isolation. Refer to the Microsoft documentation Hyper-V isolation for containers for more information.

Mirantis recommends that you use the same version of Windows Server for both your container images and for the node on which the containers run. For reference purposes, in both Kubernetes and Swarm clusters, MKE assigns a label to Windows nodes that includes the Windows Server version.

Run hybrid workloads in Kubernetes¶

To run Windows workloads in a hybrid Windows Kubernetes cluster, you must target your workloads to nodes that are running the correct Windows version. Failure to correctly target your workloads may result in an error when Kubernetes schedules the Pod on an incompatible node:

Error response from daemon: hcsshim::CreateComputeSystem win2019-deployment-no-nodeselect: The container operating system does not match the host operating system.

Note the Windows version associated with each of the nodes in your cluster:

kubectl get node

Example output:

NAME                         STATUS   ROLES    AGE   VERSION
manager-node                 Ready    master   51m   v1.23.4-mirantis-1
win2019-node                 Ready    <none>   44m   v1.23.4
win2022-node                 Ready    <none>   38m   v1.23.4

Create a deployment with the appropriate node selectors. Use 10.0.17763 for Windows Server 2019 workloads and 10.0.20348 for Windows Server 2022 workloads.

For example purposes, paste the following content into a file called win2019-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: win2019-deployment
  name: win2019-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: win2019-deployment
  template:
    metadata:
      labels:
        app: win2019-deployment
      name: win2019-deployment
    spec:
      containers:
      - name: win2019-deployment
        image: mcr.microsoft.com/windows/nanoserver:1809
        command: ["cmd", "/c", "ping -t localhost"]
        ports:
        - containerPort: 80
      nodeSelector:
        kubernetes.io/os: windows
        node.kubernetes.io/windows-build: 10.0.17763

Apply the deployment:

kubectl apply -f win2019-deployment.yaml

Verify that the Pods are scheduled on the required node:

kubectl get pods -o wide

Example output:

NAME                                                READY   STATUS             RESTARTS      AGE     IP              NODE                      NOMINATED NODE   READINESS GATES
win2019-deployment-57d75f6f9f-ldsqf                 1/1     Running            0             6m39s   192.168.50.76   win2019-node              <none>           <none>
win2019-deployment-57d75f6f9f-n5b25                 1/1     Running            0             6m39s   192.168.50.79   win2019-node              <none>           <none>
win2019-deployment-57d75f6f9f-r5mz6                 1/1     Running            0             6m39s   192.168.50.78   win2019-node              <none>           <none>
win2019-deployment-57d75f6f9f-xggmt                 1/1     Running            0             6m39s   192.168.50.77   win2019-node              <none>           <none>
win2019-deployment-57d75f6f9f-zltk2                 1/1     Running            0             7m7s    192.168.50.73   win2019-node              <none>           <none>
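
Because kubectl get node does not show the Windows build by default, it can help to query the node.kubernetes.io/windows-build label directly. The following is a minimal sketch; the helper name windows_build is hypothetical and covers only the two build numbers listed in this section.

```shell
#!/bin/sh
# Sketch: map a Windows Server release to the build number used in the
# node.kubernetes.io/windows-build node selector.
windows_build() {
  case $1 in
    2019) echo "10.0.17763" ;;
    2022) echo "10.0.20348" ;;
    *)    echo "unsupported Windows Server version: $1" >&2; return 1 ;;
  esac
}

# List only the nodes that carry the Windows Server 2022 build label:
#   kubectl get nodes -l "node.kubernetes.io/windows-build=$(windows_build 2022)"
# Show the build label as an extra column for every node:
#   kubectl get nodes -L node.kubernetes.io/windows-build
```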

Run hybrid workloads in Swarm¶

To run Windows workloads in a hybrid Windows Swarm cluster, you must target your workloads to nodes that are running the correct Windows version. Failure to correctly target your workloads may result in an operating system mismatch error.

Verify that nodes running the appropriate Windows version are present in the cluster. Use an OsVersion label of 10.0.17763 for Windows Server 2019 and 10.0.20348 for Windows Server 2022. For example:

docker node ls -f "node.label=OsVersion=10.0.20348"

Example output:

ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
yft1t1mnytt524y03zmdevzuk     win2022-node-1            Ready     Active                          20.10.12

Create a service that runs the required version of Windows Server, in this case Windows Server 2022. Include constraints on both the operating system and the OsVersion label to ensure that the service is scheduled on the correct node. For example:

docker service create --name windows2022-example-service \
--constraint "node.platform.OS == windows" \
--constraint "node.labels.OsVersion == 10.0.20348" \
mcr.microsoft.com/windows/nanoserver:ltsc2022 cmd "/c ping -t localhost"

Verify that the service is scheduled on the required node:

docker service ps windows2022-example-service

Example output:

ID             NAME                            IMAGE                                           NODE                      DESIRED STATE   CURRENT STATE           ERROR     PORTS
uqrosib62602   windows2022-example-service.1   mcr.microsoft.com/windows/nanoserver:ltsc2022   win2022-node-1            Running         Running 9 minutes ago

Authorize role-based access¶

MKE allows administrators to authorize users to view, edit, and use cluster resources by granting role-based permissions for specific resource sets. This section describes how to configure all the relevant components of role-based access control (RBAC).

Refer to Role-based access control for detailed reference information.

Create organizations, teams, and users¶

This topic describes how to create organizations, teams, and users.

Note

Individual users can belong to multiple teams but a team can belong to only one organization.
New users have a default permission level that you can extend by adding the user to a team and creating grants. Alternatively, you can make the user an administrator to extend their permission level.
In addition to integrating with LDAP services, MKE provides built-in authentication. You must manually create users to use MKE built-in authentication.

Create an organization¶

Log in to the MKE web UI as an administrator.
Navigate to Access Control > Orgs & Teams > Create.
Enter a unique organization name that is 1-100 characters in length and which does not contain any of the following:
- Capital letters
- Spaces
- The following non-alphabetic characters: \*+[\]:;|=,?<>"'
Click Create.

Create a team in the organization¶

Log in to the MKE web UI as an administrator.
Navigate to the required organization and click the plus icon in the top right corner to open the Create Team dialog.
Enter a team name with a maximum of 100 characters.
Optional. Enter a description for the team. Maximum: 140 characters.
Click Create.

Add an existing user to a team¶

Log in to the MKE web UI as an administrator.
Navigate to the required team and click the plus sign in the top right corner.
Select the users you want to include and click Add Users.

Create a user¶

Log in to the MKE web UI as an administrator.
Navigate to Access Control > Users > Create.
Enter a unique user name that is 1-100 characters in length and which does not contain any of the following:
- Capital letters
- Spaces
- The following non-alphabetic characters: \*+[\]:;|=,?<>"'
Enter a password that contains at least 8 characters.
Enter the full name of the user.
Optional. Toggle IS A MIRANTIS KUBERNETES ENGINE ADMIN to Yes to give the user administrator privileges.
Click Create.
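
If you prepare user or organization names in bulk, a local pre-check against the naming rules above can catch problems before you enter them. The following is a minimal sketch, not MKE's own validation: the helper names are hypothetical, and the backslash is treated as disallowed because it appears in the listed character set, which is an interpretation.

```shell
#!/bin/sh
# Sketch: pre-check names and passwords against the rules listed above.
valid_mke_name() (
  name=$1
  # Length must be between 1 and 100 characters.
  [ "${#name}" -ge 1 ] && [ "${#name}" -le 100 ] || exit 1
  # Delete every disallowed character; a valid name is left unchanged.
  stripped=$(printf '%s' "$name" | tr -d '[:upper:] *+[]\\:;|=,?<>"'"'")
  [ "$stripped" = "$name" ]
)

valid_mke_password() {
  [ "${#1}" -ge 8 ]   # at least 8 characters
}

# Usage:
#   valid_mke_name "dev-team" && echo "name looks acceptable"
```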

Enable LDAP and sync teams and users¶

Once you enable LDAP you can sync your LDAP directory to the teams and users that are present in MKE.

To enable LDAP:

Log in to the MKE web UI as an MKE administrator.
In the left-side navigation panel, navigate to <user name> > Admin Settings > Authentication & Authorization.
Scroll down to the Identity Provider Integration section.
Toggle LDAP to Enabled. A list of LDAP settings displays.
Enter the values that correspond with your LDAP server installation.
Use the built-in MKE LDAP Test login tool to confirm that your LDAP settings are correctly configured.

To synchronize LDAP users into MKE teams:

In the left-side navigation panel, navigate to Access Control > Orgs & Teams and select an organization.
Click + to create a team.
Enter a team name and description.
Toggle ENABLE SYNC TEAM MEMBERS to Yes.
Choose between the following two methods for matching group members from an LDAP directory. Refer to the table below for more information.
- Keep the default Match Search Results method and fill out Search Base DN, Search filter, and Search subtree instead of just one level as required.
- Toggle LDAP MATCH METHOD to change the method for matching group members in the LDAP directory to Match Group Members.
Optional. Select Immediately Sync Team Members to run an LDAP sync operation after saving the configuration for the team.
Optional. To allow non-LDAP team members to sync the LDAP directory, select Allow non-LDAP members.

Note

If you do not select Allow non-LDAP members, manually-added and SAML users are removed during the LDAP sync.
Click Create.
Repeat the preceding steps to synchronize LDAP users into additional teams.

There are two methods for matching group members from an LDAP directory:

Bind method	Description
Match Search Results (search bind)	Specifies that team members are synced using a search query against the LDAP directory of your organization. The team membership is synced to match the users in the search results. Search Base DN: The distinguished name of the node in the directory tree where the search starts looking for users. Search filter: Filter to find users. If empty, existing users in the search scope are added as members of the team. Search subtree instead of just one level: Defines search through the full LDAP tree, not just one level, starting at the base DN.
Match Group Members (direct bind)	Specifies that team members are synced directly with members of a group in your LDAP directory. The team membership syncs to match the membership of the group. Group DN: The distinguished name of the group from which you select users. Group Member Attribute: The value of this attribute corresponds to the distinguished names of the members of the group.
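
To make the difference between the two matching methods concrete, the following sketch runs both against a tiny in-memory directory. It is a toy model under stated assumptions, not how MKE talks to LDAP: the directory entries, the `example.com` DNs, and both function names are hypothetical, the search filter is a Python callable rather than an LDAP filter string, and DNs are split naively on commas.

```python
# Hypothetical directory: distinguished name -> attributes.
DIRECTORY = {
    "uid=alice,ou=eng,dc=example,dc=com": {"objectClass": "person"},
    "uid=bob,ou=qa,ou=eng,dc=example,dc=com": {"objectClass": "person"},
    "ou=qa,ou=eng,dc=example,dc=com": {"objectClass": "organizationalUnit"},
    "cn=ops,ou=groups,dc=example,dc=com": {
        "objectClass": "groupOfNames",
        "member": ["uid=alice,ou=eng,dc=example,dc=com"],
    },
}


def match_search_results(base_dn, user_filter, subtree):
    """Search bind: members are the entries returned by a search."""
    members = []
    for dn, attrs in DIRECTORY.items():
        if not dn.endswith("," + base_dn):
            continue  # entry is not below the search base DN
        relative = dn[: -len(base_dn) - 1]
        if not subtree and "," in relative:
            continue  # deeper than one level and subtree search is off
        if user_filter(attrs):
            members.append(dn)
    return sorted(members)


def match_group_members(group_dn, member_attribute):
    """Direct bind: members are the DNs listed in the group's attribute."""
    return sorted(DIRECTORY[group_dn].get(member_attribute, []))
```

With a base DN of `ou=eng,dc=example,dc=com` and a filter that accepts `person` entries, a one-level search returns only alice, while a subtree search also returns bob. The group-based method returns exactly the DNs stored in the `member` attribute of the group.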

Define roles with authorized API operations¶

Roles define a set of API operations permitted for a resource set. You apply roles to users and teams by creating grants. Roles have the following important characteristics:

Roles are always enabled.
Roles cannot be edited. To change a role, you must delete it and create a new role with the changes you want to implement.
To delete roles used within a grant, you must first delete the grant.
Only administrators can create and delete roles.
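
The role characteristics listed above can be summarized as a small model. The sketch below is illustrative only and does not reflect MKE internals: `Role`, `RoleRegistry`, and all method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a role cannot be edited once created
class Role:
    name: str
    operations: frozenset


class RoleRegistry:
    """Toy model of the role rules: admin-only, immutable, grant-protected."""

    def __init__(self):
        self.roles = {}
        self.grants = []  # list of (subject, role_name, resource_set)

    def create_role(self, name, operations, is_admin):
        if not is_admin:
            raise PermissionError("only administrators can create roles")
        self.roles[name] = Role(name, frozenset(operations))
        return self.roles[name]

    def add_grant(self, subject, role_name, resource_set):
        if role_name not in self.roles:
            raise KeyError(role_name)
        self.grants.append((subject, role_name, resource_set))

    def delete_grant(self, subject, role_name, resource_set):
        self.grants.remove((subject, role_name, resource_set))

    def delete_role(self, name, is_admin):
        if not is_admin:
            raise PermissionError("only administrators can delete roles")
        # A role that is still used within a grant cannot be deleted.
        if any(role_name == name for _, role_name, _ in self.grants):
            raise ValueError("delete the grants that use this role first")
        del self.roles[name]
```

Changing a role in this model means calling `delete_role` and then `create_role` with the new operation set, which mirrors the delete-and-recreate rule above.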

This topic explains how to create custom Swarm roles and describes default and Swarm operations roles.

Default roles¶

The following describes the built-in roles:

Role	Description
None	Users have no access to Swarm or Kubernetes resources. Maps to `No Access` role in UCP 2.1.x.
View Only	Users can view resources but cannot create them.
Restricted Control	Users can view and edit resources but cannot run a service or container in a way that affects the node where it is running. Users cannot mount a node directory, `exec` into containers, or run containers in privileged mode or with additional kernel capabilities.
Scheduler	Users can view worker and manager nodes and schedule, but not view, workloads on these nodes. By default, all users are granted the Scheduler role for the Shared collection. To view workloads, users need Container View permissions.
Full Control	Users can view and edit all granted resources. They can create containers without any restriction, but cannot see the containers of other users.

To learn how to apply a default role using a grant, refer to Create grants.

Create a custom Swarm role¶

You can use default or custom roles.

To create a custom Swarm role:

Log in to the MKE web UI.
Click Access Control > Roles.
Select the Swarm tab and click Create.
On the Details tab, enter the role name.
On the Operations tab, select the permitted operations for each resource type. For the operation descriptions, refer to Swarm operations roles.
Click Create.

Note

The Roles page lists all applicable default and custom roles in the organization.
You can apply a role with the same name to different resource sets.

To learn how to apply a custom role using a grant, refer to Create grants.

Swarm operations roles¶

The following table describes the operations (calls) that you can execute against Swarm resources. Each permission corresponds to a CLI command and enables the user to execute that command. Refer to the Docker CLI documentation for a complete list of commands and examples.

Operation	Command	Description
Config	`docker config`	Manage Docker configurations.
Container	`docker container`	Manage Docker containers.
Container	`docker container create`	Create a new container.
Container	`docker create [OPTIONS] IMAGE [COMMAND] [ARG...]`	Create new containers.
Container	`docker update [OPTIONS] CONTAINER [CONTAINER...]`	Update configuration of one or more containers. Using this command can also prevent containers from consuming too many resources from their Docker host.
Container	`docker rm [OPTIONS] CONTAINER [CONTAINER...]`	Remove one or more containers.
Image	`docker image COMMAND`	Manage images.
Image	`docker image remove`	Remove one or more images.
Network	`docker network`	Manage networks. You can use child commands to create, inspect, list, remove, prune, connect, and disconnect networks.
Node	`docker node COMMAND`	Manage Swarm nodes.
Secret	`docker secret COMMAND`	Manage Docker secrets.
Service	`docker service COMMAND`	Manage services.
Volume	`docker volume create [OPTIONS] [VOLUME]`	Create a new volume that containers can consume and store data in.
Volume	`docker volume rm [OPTIONS] VOLUME [VOLUME...]`	Remove one or more volumes. Users cannot remove a volume that is in use by a container.

Use collections and namespaces¶

MKE enables access control to cluster resources by grouping them into two types of resource sets: Swarm collections (for Swarm workloads) and Kubernetes namespaces (for Kubernetes workloads). Refer to Role-based access control for a description of the difference between Swarm collections and Kubernetes namespaces. Administrators use grants to combine resource sets with subjects and roles, giving users permission to access specific cluster resources.

Swarm collection labels¶

Users assign resources to collections with labels. The following resource types have editable labels, so you can assign them to collections: services, nodes, secrets, and configs. For these resource types, change com.docker.ucp.access.label to move a resource to a different collection. Collections have generic names by default, but you can assign them meaningful names as required (such as dev, test, and prod).

Note

The following resource types do not have editable labels and thus you cannot assign them to collections: containers, networks, and volumes.

Groups of resources identified by a shared label are called stacks. You can place one stack of resources in multiple collections. MKE automatically places resources in the default collection. Users can change this using a specific com.docker.ucp.access.label in the stack/compose file.

The system uses com.docker.ucp.collection.* to enable efficient resource lookup. You do not need to manage these labels, as MKE controls them automatically. Nodes have the following labels set to true by default:

com.docker.ucp.collection.root
com.docker.ucp.collection.shared
com.docker.ucp.collection.swarm

Default and built-in Swarm collections¶

This topic describes both MKE default and built-in Swarm collections.

Default Swarm collections
Built-in Swarm collections

Default Swarm collections¶

Each user has a default collection, which can be changed in the MKE preferences.

Every deployed resource must belong to a collection. When a user deploys a resource without using an access label to specify its collection, MKE automatically places the resource in the default collection of that user.

Default collections are useful for the following types of users:

Users who work only on a well-defined portion of the system
Users who deploy stacks but do not want to edit the contents of their compose files

Custom collections are appropriate for users with more complex roles in the system, such as administrators.

Note

For those using Docker Compose, the system applies default collection labels across all resources in the stack unless you explicitly set com.docker.ucp.access.label.
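
The default-collection behavior described in this section amounts to a simple fallback rule. The sketch below illustrates it under stated assumptions: the function name `assign_collections`, the resource names, and the collection paths are hypothetical, and the model ignores everything except the access label.

```python
ACCESS_LABEL = "com.docker.ucp.access.label"


def assign_collections(stack_resources, default_collection):
    """Map each resource in a stack to the collection it lands in.

    stack_resources: resource name -> dict of labels set in the compose file.
    A resource with an explicit access label goes to that collection;
    every other resource falls back to the user's default collection.
    """
    placement = {}
    for name, labels in stack_resources.items():
        placement[name] = labels.get(ACCESS_LABEL, default_collection)
    return placement
```

For a stack in which only the `db` service sets the access label, `db` lands in the labeled collection and all other resources land in the default collection.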

Built-in Swarm collections¶

MKE includes the following built-in Swarm collections:

Built-in Swarm collection	Description
`/`	Path to all resources in the Swarm cluster. Resources not in a collection are put here.
`/System`	Path to MKE managers, MSR nodes, and MKE/MSR system services. By default, only administrators have access to this collection.
`/Shared`	Default path to worker nodes for scheduling. MKE places worker nodes in this collection by default.
`/Shared/Private`	Path to a user’s private collection. Private collections are not created until the user logs in for the first time.
`/Shared/Legacy`	Path to the access control labels of legacy versions (UCP 2.1 and earlier).

Group and isolate cluster resources¶

This topic describes how to group and isolate cluster resources into Swarm collections and Kubernetes namespaces.

To create a Swarm collection:

Navigate to Shared Resources > Collections.
Click View Children next to Swarm.
Click Create Collection.
Enter a collection name and click Create.

To move a resource to a different collection:

In the left-side navigation panel, navigate to the resource type you want to move and click it. As an example, navigate to and click on Shared Resources > Nodes.
Click the node you want to move to display the information window for that node.
Click the slider icon at the top right of the information window to display the edit dialog for the node.
Scroll down to Labels and change the com.docker.ucp.access.label swarm label to the name of your collection.

Note

Optionally, you can navigate to Collection in the left-side navigation panel and select the collection to which you want to move the resource.

To create a Kubernetes namespace:

Navigate to Kubernetes > Namespaces and click Create.
Leave the Namespace drop-down blank.

Paste the following in the Object YAML editor:

apiVersion: v1
kind: Namespace
metadata:
  name: namespace-name

Click Create.

Note

For more information on assigning resources to a particular namespace, refer to Kubernetes Documentation: Namespaces Walkthrough.

Create grants¶

MKE administrators create grants to control how users and organizations access resource sets. A grant defines user permissions to access resources. Each grant associates one subject with one role and one resource set. For example, you can grant the Prod Team Restricted Control over services in the /Production collection.

The following is a common workflow for creating grants:

Create the subjects (organizations, teams, and users) manually, or sync them from LDAP.
Define custom roles (or use defaults) by adding permitted API operations per type of resource.
Group cluster resources into Swarm collections or Kubernetes namespaces.
Create grants by combining subject, role, and resource set.
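
The grant model in this workflow (one subject, one role, one resource set) can be sketched as data plus a lookup. This is an illustrative model, not the MKE authorization engine. The `Grant` class, the `is_permitted` function, and the operation set assigned to Restricted Control are assumptions, and the check matches resource sets exactly rather than modeling nested collections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    subject: str       # user, team, or organization
    role: str          # name of a default or custom role
    resource_set: str  # Swarm collection or Kubernetes namespace


# Assumption: an illustrative subset of operations, not the real role content.
ROLE_OPERATIONS = {
    "Restricted Control": {"Service Create", "Container View"},
    "View Only": {"Container View"},
}


def is_permitted(grants, subject, operation, resource_set):
    """Return True if any grant lets subject run operation on resource_set."""
    for grant in grants:
        if grant.subject != subject or grant.resource_set != resource_set:
            continue
        if operation in ROLE_OPERATIONS.get(grant.role, set()):
            return True
    return False
```

With the single grant `Grant("Prod Team", "Restricted Control", "/Production")`, the Prod Team may create services in /Production, but no other subject and no other collection is covered.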

Note

This section assumes that you have created the relevant objects for the grant, including the subject, role, and resource set (Kubernetes namespace or Swarm collection).

To create a Kubernetes grant:

Log in to the MKE web UI.
Navigate to Access Control > Grants.
Select the Kubernetes tab and click Create Role Binding.
Under Subject, select Users, Organizations, or Service Account.
- For Users, select the user from the pull-down menu.
- For Organizations, select the organization and, optionally, the team from the pull-down menu.
- For Service Account, select the namespace and service account from the pull-down menu.
Click Next to save your selections.
Under Resource Set, toggle the switch labeled Apply Role Binding to all namespaces (Cluster Role Binding).
Click Next.
Under Role, select a cluster role.
Click Create.

To create a Swarm grant:

Log in to the MKE web UI.
Navigate to Access Control > Grants.
Select the Swarm tab and click Create Grant.
Under Subject, select Users or Organizations.
- For Users, select a user from the pull-down menu.
- For Organizations, select the organization and, optionally, the team from the pull-down menu.
Click Next to save your selections.
Under Resource Set, click View Children until the required collection displays.
Click Select Collection next to the required collection.
Click Next.
Under Role, select a role type from the drop-down menu.
Click Create.

Note

MKE places new users in the docker-datacenter organization by default. To apply permissions to all MKE users, create a grant with the docker-datacenter organization as a subject.

Grant users permission to pull images¶

By default, only administrators can pull images into a cluster managed by MKE. This topic describes how to give non-administrator users permission to pull images.

Images are always in the Swarm collection, as they are a shared resource. Grant users the Image Create permission for the Swarm collection to allow them to pull images.

To grant a user permission to pull images:

Log in to the MKE web UI as an administrator.
Navigate to Access Control > Roles.
Select the Swarm tab and click Create.
On the Details tab, enter Pull images for the role name.
On the Operations tab, select Image Create from the IMAGE OPERATIONS drop-down.
Click Create.
Navigate to Access Control > Grants.
Select the Swarm tab and click Create Grant.
Under Subject, click Users and select the required user from the drop-down.
Click Next.
Under Resource Set, select the Swarm collection and click Next.
Under Role, select Pull images from the drop-down.
Click Create.

Reset passwords¶

This topic describes how to reset passwords for users and administrators.

To change a user password in MKE:

Log in to the MKE web UI with administrator credentials.
Click Access Control > Users.
Select the user whose password you want to change.
Click the gear icon in the top right corner.
Select Security from the left navigation.
Enter the new password, confirm that it is correct, and click Update Password.

Note

For users managed with an LDAP service, you must change user passwords on the LDAP server.

To change an administrator password in MKE:

SSH to an MKE manager node and run:

docker run --net=host -v ucp-auth-api-certs:/tls -it \
"$(docker inspect --format \
'{{ .Spec.TaskTemplate.ContainerSpec.Image }}' \
ucp-auth-api)" \
"$(docker inspect --format \
'{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' \
ucp-auth-api)" \
passwd -i

Optional. If you have DEBUG set as your global log level within MKE, running $(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' ucp-auth-api) returns --debug instead of --db-addr.

In that case, pass Args 1 to docker inspect instead to reset your administrator password:

docker run --net=host -v ucp-auth-api-certs:/tls -it \
"$(docker inspect --format \
'{{ .Spec.TaskTemplate.ContainerSpec.Image }}' \
ucp-auth-api)" \
"$(docker inspect --format \
'{{ index .Spec.TaskTemplate.ContainerSpec.Args 1 }}' \
ucp-auth-api)" \
passwd -i
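
The two commands above differ only in which element of the ucp-auth-api argument list they pass along. A script that already holds that argument list could choose the right element by name instead of by position. The following is a minimal sketch of that idea: the function name and the sample argument values are hypothetical, and it assumes the database address is a single argument that begins with --db-addr.

```python
def select_db_addr_arg(container_args):
    """Return the first argument that starts with --db-addr.

    container_args is assumed to be the list found under
    .Spec.TaskTemplate.ContainerSpec.Args for the ucp-auth-api service.
    """
    for arg in container_args:
        if arg.startswith("--db-addr"):
            return arg
    raise ValueError("no --db-addr argument found")
```

This returns the same element whether or not --debug precedes it, so the choice between Args 0 and Args 1 no longer depends on the global log level.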

Note

Alternatively, ask another administrator to change your password.

RBAC tutorials¶

This section contains a collection of tutorials that explain how to use RBAC in a variety of scenarios.

Deploy a simple stateless app with RBAC¶

This topic describes how to deploy an NGINX web server, limiting access to one team using role-based access control (RBAC).

You are the MKE system administrator and will configure permissions to company resources using a four-step process:

Build the organization with teams and users.
Define roles with allowable operations per resource type, such as permission to run containers.
Create collections or namespaces for accessing actual resources.
Create grants that join team, role, and resource set.

To deploy a simple stateless app with RBAC:

Build the organization:
1. Log in to the MKE web UI.
2. Add an organization called company-datacenter.
3. Create three teams according to the following structure:
  
  Team	Users
  DBA	Alex
  Dev	Bett
  Ops	Alex, Chad
Deploy NGINX with Kubernetes:
1. Create a namespace:
  1. Click Kubernetes > Create.
  2. Paste the following manifest in the Object YAML editor and click Create.
    apiVersion: v1
    kind: Namespace
    metadata:
      name: nginx-namespace
2. Create a role for the Ops team called kube-deploy:
  1. Click Kubernetes > Create.
  2. Select nginx-namespace from the Namespace drop-down.
  3. Paste the following manifest in the Object YAML editor and click Create.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: kube-deploy
    rules:
    - apiGroups: ["*"]
      resources: ["*"]
      verbs: ["*"]
3. Create a role binding, to allow the Ops team to deploy applications to nginx-namespace:
  1. Click Access Control > Grants.
  2. Select the Kubernetes tab and click Create Role Binding.
  3. Under Subject, select Organizations and configure Organization as company-datacenter and Team as Ops.
  4. Click Next.
  5. Under Resource Set, select nginx-namespace and click Next.
  6. Under Role, select the kube-deploy role and click Create.
4. Deploy an application as a member of the Ops team:
  1. Log in to the MKE web UI as Chad, a member of the Ops team.
  2. Click Kubernetes > Create.
  3. Select nginx-namespace from the Namespace drop-down.
  4. Paste the following manifest in the Object YAML editor and click Create.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:latest
            ports:
            - containerPort: 80
Verify that Ops team members can view the nginx-deployment resources:
1. Log in to the MKE web UI as Alex, a member of the Ops team.
2. Click Kubernetes > Controllers.
3. Confirm the presence of NGINX deployment and ReplicaSet.
Verify that Dev team members cannot view the nginx-deployment resources:
1. Log in to the MKE web UI as Bett, who is not a member of the Ops team.
2. Click Kubernetes > Controllers.
3. Confirm that NGINX deployment and ReplicaSet are not present.
Deploy NGINX as a Swarm service:
1. Create a collection for NGINX resources called nginx-collection nested under the Shared collection. To view child collections, click View Children.
2. Create a simple role for the Ops team called Swarm Deploy.
3. Create a grant for the Ops team to access the nginx-collection with the Swarm Deploy custom role.
4. Log in to the MKE web UI as Chad on the Ops team.
5. Click Swarm > Services > Create.
6. On the Details tab, enter the following:
  - Name: nginx-service
  - Image: nginx:latest
7. On the Collection tab, click View Children next to Swarm and then next to Shared.
8. Click nginx-collection, then click Create.
9. Sign in as each user and verify that the following users cannot see nginx-collection:
  - Alex on the DBA team
  - Bett on the Dev team
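
The expected outcome of the Kubernetes part of this tutorial can be reasoned about with a small model of the role and role binding created above. This sketch is a simplification under stated assumptions: it evaluates only apiGroups, resources, and verbs with wildcard support, it ignores other RBAC features, and the lowercase user names and the `can` function are hypothetical.

```python
# Team membership from the tutorial table.
TEAMS = {"DBA": {"alex"}, "Dev": {"bett"}, "Ops": {"alex", "chad"}}

# The single rule of the kube-deploy role.
KUBE_DEPLOY_RULES = [{"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}]

# The role binding: Ops team, kube-deploy role, nginx-namespace.
ROLE_BINDINGS = [
    {"namespace": "nginx-namespace", "team": "Ops", "rules": KUBE_DEPLOY_RULES},
]


def _matches(allowed, value):
    """A rule field matches if it lists the value or the wildcard."""
    return "*" in allowed or value in allowed


def can(user, verb, api_group, resource, namespace):
    """Return True if some role binding lets user perform the request."""
    for binding in ROLE_BINDINGS:
        if binding["namespace"] != namespace:
            continue
        if user not in TEAMS[binding["team"]]:
            continue
        for rule in binding["rules"]:
            if (
                _matches(rule["apiGroups"], api_group)
                and _matches(rule["resources"], resource)
                and _matches(rule["verbs"], verb)
            ):
                return True
    return False
```

In this model, Chad can create deployments in nginx-namespace and Alex can list them, because both are Ops members. Bett cannot, and no one gains access to other namespaces through this binding.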

Isolate volumes to specific teams¶

This topic describes how to grant two teams access to separate volumes in two different resource collections such that neither team can see the volumes of the other team. MKE allows you to do this even if the volumes are on the same nodes.

To create two teams:

Log in to the MKE web UI.
Navigate to Orgs & Teams.
Create two teams in the engineering organization named Dev and Prod.
Add a non-admin MKE user to the Dev team.
Add a non-admin MKE user to the Prod team.

To create two resource collections:

Create a Swarm collection called dev-volumes nested under the Shared collection.
Create a Swarm collection called prod-volumes nested under the Shared collection.

To create grants for controlling access to the new volumes:

Create a grant for the Dev team to access the dev-volumes collection with the Restricted Control built-in role.
Create a grant for the Prod team to access the prod-volumes collection with the Restricted Control built-in role.

To create a volume as a team member:

Log in as one of the users on the Dev team.
Navigate to Swarm > Volumes and click Create.
On the Details tab, name the new volume dev-data.
On the Collection tab, navigate to the dev-volumes collection and click Create.
Log in as one of the users on the Prod team.
Navigate to Swarm > Volumes and click Create.
On the Details tab, name the new volume prod-data.
On the Collection tab, navigate to the prod-volumes collection and click Create.

As a result, the user on the Prod team cannot see the Dev team volumes, and the user on the Dev team cannot see the Prod team volumes. MKE administrators can see all of the volumes created by either team.

Isolate nodes¶

You can use MKE to physically isolate resources by organizing nodes into collections and granting Scheduler access for different users. Control access to nodes by moving them to dedicated collections where you can grant access to specific users, teams, and organizations.

The following tutorials explain how to isolate nodes using Swarm and Kubernetes.

Isolate cluster nodes with Swarm¶

This tutorial explains how to give a team access to a node collection and a resource collection. MKE access control ensures that team members cannot view or use Swarm resources that are not in their collection.

Note

You need an MKE license and at least two worker nodes to complete this tutorial.

The following is a high-level overview of the steps you will take to isolate cluster nodes:

Create an Ops team and assign a user to it.
Create a Prod collection for the team node.
Assign a worker node to the Prod collection.
Grant the Ops team access to its collections.

To create a team:

Log in to the MKE web UI.
Create a team named Ops in your organization.
Add a user to the team who is not an administrator.

To create the team collections:

In this example, the Ops team uses a collection for its assigned nodes and another for its resources.

Create a Swarm collection called Prod nested under the Swarm collection.
Create a Swarm collection called Webserver nested under the Prod collection.

The Prod collection is for the worker nodes and the Webserver sub-collection is for an application that you will deploy on the corresponding worker nodes.

To move a worker node to a different collection:

Note

MKE places worker nodes in the Shared collection by default, and it places those running MSR in the System collection.

Navigate to Shared Resources > Nodes to view all of the nodes in the swarm.
Find a node located in the Shared collection. You cannot move worker nodes that are assigned to the System collection.
Click the slider icon on the node details page.
In the Labels section on the Details tab, change com.docker.ucp.access.label from /Shared to /Prod.
Click Save to move the node to the Prod collection.

To create two grants for team access to the two collections:

Create a grant for the Ops team to access the Webserver collection with the built-in Restricted Control role.
Create a grant for the Ops team to access the Prod collection with the built-in Scheduler role.

The cluster is now set up for node isolation. Users with access to nodes in the Prod collection can deploy Swarm services and Kubernetes apps. They cannot, however, schedule workloads on nodes that are not in the collection.

To deploy a Swarm service as a team member:

When a user deploys a Swarm service, MKE assigns its resources to the default collection. As a user on the Ops team, set Webserver to be your default collection.

Note

From the resource target collection, MKE walks up the ancestor collections until it finds the highest ancestor that the user has Scheduler access to. MKE schedules tasks on any nodes in the tree below this ancestor. In this example, MKE assigns the user service to the Webserver collection and schedules tasks on nodes in the Prod collection.

Log in as a user on the Ops team.
Navigate to Shared Resources > Collections.
Navigate to the Webserver collection.
Under the vertical ellipsis menu, select Set to default.
Navigate to Swarm > Services and click Create to create a Swarm service.
Name the service NGINX, enter nginx:latest in the Image* field, and click Create.
Click the NGINX service when it turns green.
Scroll down to TASKS, click the NGINX container, and confirm that it is in the Webserver collection.
Navigate to the Metrics tab on the container page, select the node, and confirm that it is in the Prod collection.
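
The placement rule from the note above (walk up from the target collection to the highest ancestor with Scheduler access, then schedule on nodes below it) can be sketched as follows. This is an illustrative model, not MKE scheduler code. All function names and node names are hypothetical, and the set of collections with Scheduler access is supplied by the caller.

```python
def ancestors(path):
    """Yield the collection itself, then each ancestor up to the root '/'."""
    parts = [p for p in path.split("/") if p]
    while parts:
        yield "/" + "/".join(parts)
        parts.pop()
    yield "/"


def scheduling_root(target, scheduler_access):
    """Return the highest ancestor of target with Scheduler access, if any."""
    highest = None
    for collection in ancestors(target):
        if collection in scheduler_access:
            highest = collection  # later matches are closer to the root
    return highest


def is_under(collection, root):
    """Return True if collection is root itself or nested below it."""
    return root == "/" or collection == root or collection.startswith(root + "/")


def eligible_nodes(target, scheduler_access, node_collections):
    """Return the nodes on which tasks for the target collection may run."""
    root = scheduling_root(target, scheduler_access)
    if root is None:
        return []
    return sorted(n for n, c in node_collections.items() if is_under(c, root))
```

With Scheduler access on /Prod and a service in /Prod/Webserver, only nodes whose collection is /Prod or nested below it are eligible. A node left in /Shared is not.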

Note

An alternative approach is to use a grant instead of changing the default collection. An administrator can create a grant for a role that has the Service Create permission for the Webserver collection or a child collection. In this case, the user sets the value of com.docker.ucp.access.label to the new collection or one of its children that has a Service Create grant for the required user.

Isolate cluster nodes with Kubernetes¶

This topic describes how to use a Kubernetes namespace to deploy a Kubernetes workload to worker nodes using the MKE web UI.

MKE uses the scheduler.alpha.kubernetes.io/node-selector annotation key to assign node selectors to namespaces. Assigning the name of the node selector to this annotation pins all applications deployed in the namespace to the nodes that have the given node selector specified.

To isolate cluster nodes with Kubernetes:

Create a Kubernetes namespace.

Note

You can also associate nodes with a namespace by providing the namespace definition information in a configuration file.
1. Log in to the MKE web UI as an administrator.
2. In the left-side navigation panel, navigate to Kubernetes and click Create to open the Create Kubernetes Object page.
3. Paste the following in the Object YAML editor:
```
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-name
```
4. Click Create to create the namespace-name namespace.
Grant access to the Kubernetes namespace:
1. Create a role binding for a user of your choice to access the namespace-name namespace with the built-in cluster-admin Cluster Role.
Associate nodes with the namespace:
1. From the left-side navigation panel, navigate to Shared Resources > Nodes.
2. Select the required node.
3. Click the Edit Node icon in the upper-right corner.
4. Scroll down to the Kubernetes Labels section and click Add Label.
5. In the Key field, enter zone.
6. In the Value field, enter example-zone.
7. Click Save.
8. Add a scheduler node selector annotation as part of the namespace definition:
```
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: zone=example-zone
  name: ops-nodes
```
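
The effect of the node selector annotation can be previewed by matching the selector against node labels. The following sketch only illustrates that matching for simple key=value selectors. The function names and node names are hypothetical, and it does not query a cluster.

```python
NODE_SELECTOR_ANNOTATION = "scheduler.alpha.kubernetes.io/node-selector"


def parse_selector(selector):
    """Parse 'key=value,key2=value2' into a dict."""
    pairs = {}
    for term in selector.split(","):
        key, _, value = term.strip().partition("=")
        pairs[key] = value
    return pairs


def matching_nodes(namespace_annotations, node_labels):
    """Return the nodes whose labels satisfy the namespace node selector.

    node_labels: node name -> dict of Kubernetes labels.
    A namespace without the annotation does not restrict nodes.
    """
    selector = namespace_annotations.get(NODE_SELECTOR_ANNOTATION)
    if not selector:
        return sorted(node_labels)
    wanted = parse_selector(selector)
    return sorted(
        name
        for name, labels in node_labels.items()
        if all(labels.get(key) == value for key, value in wanted.items())
    )
```

With the annotation value zone=example-zone, only nodes that carry the label zone with the value example-zone are returned.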

Set up access control architecture¶

This tutorial explains how to set up a complete access architecture for a fictitious company called OrcaBank.

OrcaBank is reorganizing their application teams by product with each team providing shared services as necessary. Developers at OrcaBank perform their own DevOps and deploy and manage the lifecycle of their applications.

OrcaBank has four teams with the following resource needs:

Security needs view-only access to all applications in the cluster.
DB (database) needs full access to all database applications and resources.
Mobile needs full access to their mobile applications and limited access to shared DB services.
Payments needs full access to their payments applications and limited access to shared DB services.

OrcaBank is taking advantage of the flexibility in the MKE grant model by applying two grants to each application team. One grant allows each team to fully manage the apps in their own collection, and the second grant gives them the (limited) access they need to networks and secrets within the db collection.

The resulting access architecture has applications connecting across collection boundaries. By assigning multiple grants per team, the Mobile and Payments applications teams can connect to dedicated database resources through a secure and controlled interface, leveraging database networks and secrets.

Note

MKE deploys all resources across the same group of worker nodes while providing the option to segment nodes.

To set up a complete access control architecture:

Set up LDAP/AD integration and create the required teams.

OrcaBank will standardize on LDAP for centralized authentication to help their identity team scale across all the platforms they manage.

To implement LDAP authentication in MKE, OrcaBank is using the MKE native LDAP/AD integration to map LDAP groups directly to MKE teams. You can add or remove users from MKE teams via LDAP, which the OrcaBank identity team will centrally manage.
1. Enable LDAP in MKE and sync your directory.
2. Create the following teams: Security, DB, Mobile, and Payments.
Define the required roles:
1. Define an Ops role that allows users to perform all operations against configs, containers, images, networks, nodes, secrets, services, and volumes.
2. Define a View & Use Networks + Secrets role that enables users to view and connect to networks and view and use secrets used by DB containers, but that prevents them from seeing or impacting the DB applications themselves.
Note

You will also use the built-in View Only role that allows users to see all resources, but not edit or use them.
Create the required Swarm collections.

All OrcaBank applications share the same physical resources, so all nodes and applications are configured in collections that nest under the built-in Shared collection.

Create the following collections:
- /Shared/mobile to host all mobile applications and resources.
- /Shared/payments to host all payments applications and resources.
- /Shared/db to serve as a top-level collection for all db resources.
- /Shared/db/mobile to hold db resources for mobile applications.
- /Shared/db/payments to hold db resources for payments applications.
Note

The OrcaBank grant composition will ensure that the Swarm collection architecture gives the DB team access to all db resources and restricts app teams to shared db resources.
Create the required grants:
1. For the Security team, create grants to access the following collections with the View Only built-in role: /Shared/mobile, /Shared/payments, /Shared/db, /Shared/db/mobile, and /Shared/db/payments.
2. For the DB team, create grants to access the /Shared/db, /Shared/db/mobile, and /Shared/db/payments collections with the Ops custom role.
3. For the Mobile team, create a grant to access the /Shared/mobile collection with the Ops custom role.
4. For the Mobile team, create a grant to access the /Shared/db/mobile collection with the View & Use Networks + Secrets custom role.
5. For the Payments team, create a grant to access the /Shared/payments collection with the Ops custom role.
6. For the Payments team, create a grant to access the /Shared/db/payments collection with the View & Use Networks + Secrets custom role.
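
The grants listed above can be written down as data and queried, which is a convenient way to review the design before you create the grants in the web UI. This sketch only restates the tutorial. The `role_for` function is hypothetical, and it matches collections exactly, as the grant list above does.

```python
NETWORKS_SECRETS = "View & Use Networks + Secrets"
ALL_COLLECTIONS = [
    "/Shared/mobile",
    "/Shared/payments",
    "/Shared/db",
    "/Shared/db/mobile",
    "/Shared/db/payments",
]

# (team, role, collection), one tuple per grant in the tutorial.
GRANTS = (
    [("Security", "View Only", c) for c in ALL_COLLECTIONS]
    + [("DB", "Ops", c) for c in ALL_COLLECTIONS if c.startswith("/Shared/db")]
    + [
        ("Mobile", "Ops", "/Shared/mobile"),
        ("Mobile", NETWORKS_SECRETS, "/Shared/db/mobile"),
        ("Payments", "Ops", "/Shared/payments"),
        ("Payments", NETWORKS_SECRETS, "/Shared/db/payments"),
    ]
)


def role_for(team, collection):
    """Return the role a team holds on a collection, or None."""
    for subject, role, resource_set in GRANTS:
        if subject == team and resource_set == collection:
            return role
    return None
```

For example, `role_for("Mobile", "/Shared/db/mobile")` returns the limited networks and secrets role, while `role_for("Mobile", "/Shared/payments")` returns None.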

Set up access control architecture with additional security requirements¶

Caution

Complete the Set up access control architecture tutorial before you attempt this advanced tutorial.

In the previous tutorial, you assigned multiple grants to resources across collection boundaries on a single platform. In this tutorial, you will implement the following stricter security requirements for the fictitious company, OrcaBank:

OrcaBank is adding a staging zone to their deployment model, so that applications move from development to staging and then to production.
OrcaBank will no longer permit production applications to share any physical infrastructure with non-production applications. They will use node access control to segment application scheduling and access.

Note

Node access control is an MKE feature that provides secure multi-tenancy with node-based isolation. Use it to place nodes in different collections so that you can schedule and isolate resources on disparate physical or virtual hardware. For more information, refer to Isolate nodes.

OrcaBank will still use its three application teams from the previous tutorial (DB, Mobile, and Payments) but with varying levels of segmentation between them. The new access architecture will organize the MKE cluster into staging and production collections with separate security zones on separate physical infrastructure.

The four OrcaBank teams now have the following production and staging needs:

Security needs view-only access to all applications in production and no access to staging.
DB needs full access to all database applications and resources in production and no access to staging.
In both production and staging, Mobile needs full access to their applications and limited access to shared DB services.
In both production and staging, Payments needs full access to their applications and limited access to shared DB services.

The resulting access architecture will provide physical segmentation between production and staging using node access control.

Applications are scheduled only on MKE worker nodes in the dedicated application collection. Applications use shared resources across collection boundaries to access the databases in the /prod/db collection.

To set up a complete access control architecture with additional security requirements:

Verify LDAP, teams, and roles are set up properly:
1. Verify LDAP is enabled and syncing. If it is not, configure that now.
2. Verify that the following teams are present in your organization: Security, DB, Mobile, and Payments. If any are missing, create them.
3. Verify that there is a View & Use Networks + Secrets role. If there is not, define a View & Use Networks + Secrets role that enables users to view and connect to networks and view and use secrets used by DB containers. Configure the role so that it prevents those who use it from seeing or impacting the DB applications themselves.
Note

You will also use the following built-in roles:
- View Only allows users to see but not edit all cluster resources.
- Full Control allows users complete control of all collections granted to them. They can also create containers without restriction but cannot see the containers of other users. This role will replace the custom Ops role from the previous tutorial.
Create the required Swarm collections.

In the previous tutorial, OrcaBank created separate collections for each application team and nested them all under /Shared.

To meet their new security requirements for production, OrcaBank will add top-level prod and staging collections with mobile and payments application collections nested underneath. The prod collection (but not the staging collection) will also include a db collection with a second set of mobile and payments collections nested underneath.

OrcaBank will also segment their nodes such that the production and staging zones will have dedicated nodes, and in production each application will be on a dedicated node.

Create the following collections:
- /prod
- /prod/mobile
- /prod/payments
- /prod/db
- /prod/db/mobile
- /prod/db/payments
- /staging
- /staging/mobile
- /staging/payments
Create the required grants as described in Create grants:
1. For the Security team, create grants to access the following collections with the View Only built-in role: /prod, /prod/mobile, /prod/payments, /prod/db, /prod/db/mobile, and /prod/db/payments.
2. For the DB team, create grants to access the following collections with the Full Control built-in role: /prod/db, /prod/db/mobile, and /prod/db/payments.
3. For the Mobile team, create grants to access the /prod/mobile and /staging/mobile collections with the Full Control built-in role.
4. For the Mobile team, create a grant to access the /prod/db/mobile collection with the View & Use Networks + Secrets custom role.
5. For the Payments team, create grants to access the /prod/payments and /staging/payments collections with the Full Control built-in role.
6. For the Payments team, create a grant to access the /prod/db/payments collection with the View & Use Networks + Secrets custom role.
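
For review purposes, the grants listed above can be written down as data and queried before you create them in MKE. The following is a minimal sketch and not an MKE API: the `grants` and `role_for` helper names are hypothetical, and the lookup matches exact collection paths only, so it does not model any relationship between parent and child collections.

```shell
# Hypothetical helper: the grants from this tutorial as team|role|collection rows.
grants() {
  cat <<'EOF'
Security|View Only|/prod
Security|View Only|/prod/mobile
Security|View Only|/prod/payments
Security|View Only|/prod/db
Security|View Only|/prod/db/mobile
Security|View Only|/prod/db/payments
DB|Full Control|/prod/db
DB|Full Control|/prod/db/mobile
DB|Full Control|/prod/db/payments
Mobile|Full Control|/prod/mobile
Mobile|Full Control|/staging/mobile
Mobile|View & Use Networks + Secrets|/prod/db/mobile
Payments|Full Control|/prod/payments
Payments|Full Control|/staging/payments
Payments|View & Use Networks + Secrets|/prod/db/payments
EOF
}

# role_for TEAM COLLECTION: print the role granted on that exact collection path.
# Returns non-zero when the tutorial defines no such grant.
role_for() {
  grants | awk -F'|' -v team="$1" -v coll="$2" '
    $1 == team && $3 == coll { print $2; found = 1 }
    END { exit !found }'
}
```

For example, `role_for Mobile /prod/db/mobile` prints the custom role name, whereas `role_for Security /staging/mobile` prints nothing and returns a non-zero status, which reflects the requirement that Security has no access to staging.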

Upgrades and migrations¶

Upgrade an MKE installation¶

Note

Prior to upgrading MKE, review the MKE release notes for information that may be relevant to the upgrade process.

In line with your MKE upgrade, you should plan to upgrade the Mirantis Container Runtime (MCR) instance on each cluster node to version 20.10.0 or later. Mirantis recommends that you schedule the upgrade for non-business hours to ensure minimal user impact.

Do not make changes to your MKE configuration while upgrading, as doing so can cause misconfigurations that are difficult to troubleshoot.

Semantic versioning¶

MKE uses semantic versioning. While downgrades are not supported, Mirantis supports upgrades according to the following rules:

When you upgrade from one patch version to another, you can skip patch versions as no data migration takes place between patch versions.
When you upgrade between minor releases, you cannot skip releases. You can, however, upgrade from any patch version from the previous minor release to any patch version of the subsequent minor release.
When you upgrade between major releases, you cannot skip releases.

Warning

Upgrading from one MKE minor version to another minor version can result in a downgrading of MKE middleware components. For more information, refer to the component listings in the release notes of both the source and target MKE versions.

Supported upgrade paths¶
Description	From	To	Supported
Patch upgrade	x.y.0	x.y.1	Yes
Skip patch version	x.y.0	x.y.2	Yes
Patch downgrade	x.y.2	x.y.1	No
Minor upgrade	x.y.*	x.y+1.*	Yes
Skip minor version	x.y.*	x.y+2.*	No
Minor downgrade	x.y.*	x.y-1.*	No
Major upgrade	x.y.z	x+1.0.0	Yes
Major upgrade skipping minor version	x.y.z	x+1.y+1.z	No
Skip major version	x.*.*	x+2.*.*	No
Major downgrade	x.*.*	x-1.*.*	No
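
The rules in the table can be expressed as a small pre-flight check. The following is a minimal sketch, not part of the MKE tooling: `upgrade_supported` is a hypothetical helper name, it expects plain `x.y.z` version strings, and it assumes that the "Major upgrade" row permits any patch version of the first minor release of the next major version.

```shell
# upgrade_supported FROM TO: print "yes" or "no" according to the upgrade path table.
# Both arguments must be plain x.y.z version strings.
upgrade_supported() {
  from=$1; to=$2
  # Split each version into major, minor, and patch numbers.
  fx=${from%%.*}; rest=${from#*.}; fy=${rest%%.*}; fz=${rest#*.}
  tx=${to%%.*};   rest=${to#*.};   ty=${rest%%.*}; tz=${rest#*.}

  if [ "$tx" -eq "$fx" ] && [ "$ty" -eq "$fy" ] && [ "$tz" -gt "$fz" ]; then
    echo yes   # patch upgrade; skipping patch versions is allowed
  elif [ "$tx" -eq "$fx" ] && [ "$ty" -eq $((fy + 1)) ]; then
    echo yes   # next minor release, from any patch to any patch
  elif [ "$tx" -eq $((fx + 1)) ] && [ "$ty" -eq 0 ]; then
    echo yes   # next major release (assumption: first minor release only)
  else
    echo no    # downgrades and skipped minor or major releases
  fi
}
```

For example, `upgrade_supported 3.5.9 3.6.1` prints `yes`, whereas `upgrade_supported 3.4.0 3.6.0` prints `no` because it skips a minor release.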

Verify your environment¶

Before you perform the environment verifications necessary to ensure a smooth upgrade, Mirantis recommends that you run upgrade checks:

docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp \
upgrade checks [command options]

This process confirms:

Port availability
Sufficient memory and disk space
Supported OS version is in use
Existing backup availability

To perform system verifications:

Verify time synchronization across all nodes and check the time daemon logs for any significant time drift.
Verify that MKE managers and MSR replicas meet the production system requirements of 4 vCPUs and 16 GB of RAM.
Verify that your port configurations meet all MKE, MSR, and MCR port requirements.
Verify that your cluster nodes meet the minimum requirements.
Verify that you meet all minimum hardware and software requirements.

Note

Azure installations have additional prerequisites. Refer to Install MKE on Azure for more information.

To perform storage verifications:

Verify that no more than 70% of /var/ storage is used. If more than 70% is used, allocate enough storage to meet this requirement. Refer to MKE hardware requirements for the minimum and recommended storage requirements.
Verify that no node-local file systems have disk storage issues, including any MSR backend storage such as NFS.
Verify that you are using Overlay2 storage drivers, as they are more stable. If you are not, you should transition to Overlay2 at this time. Transitioning from device mapper to Overlay2 is a destructive rebuild.
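
The storage verifications above can be scripted on each node. The following is a minimal sketch under stated assumptions: the helper names `usage_percent`, `within_limit`, and `is_overlay2` are hypothetical, and the usage figure is read from the Capacity column of POSIX-format `df -P` output.

```shell
# Print the used percentage of the file system that holds a path (default /var).
usage_percent() {
  df -P "${1:-/var}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when a usage percentage is at or below the limit (default 70).
within_limit() {
  [ "$1" -le "${2:-70}" ]
}

# Succeed when the given storage driver name is overlay2.
is_overlay2() {
  [ "$1" = "overlay2" ]
}
```

On a node, you might run `within_limit "$(usage_percent /var)" || echo "/var is more than 70% full"` and `is_overlay2 "$(docker info --format '{{.Driver}}')" || echo "storage driver is not overlay2"`.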

To perform operating system verifications:

Patch all relevant packages to the most recent cluster node operating system version, including the kernel.
Perform a rolling restart of each node to confirm that the in-memory settings match those in the startup scripts.
After performing the rolling restarts, run check-config.sh on each cluster node to check for kernel compatibility issues.

To perform procedural verifications:

Perform Swarm, MKE, and MSR backups.
Gather Compose, service, and stack files.
Generate an MKE support bundle for this specific point in time.
Preinstall MKE, MSR, and MCR images. If your cluster does not have an Internet connection, Mirantis provides tarballs containing all the required container images. If your cluster does have an Internet connection, pull the required container images onto your nodes:
```
$ docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 images \
--list | xargs -L 1 docker pull
```
Load troubleshooting packages, for example, netshoot.

To upgrade MCR:

The MKE upgrade requires MCR 20.10.0 or later to be running on every cluster node. If it is not, perform the following steps first on manager and then on worker nodes:

Log in to the node using SSH.
Upgrade MCR to version 20.10.0 or later.
Using the MKE web UI, verify that the node is in a healthy state:
1. Log in to the MKE web UI.
2. Navigate to Shared Resources > Nodes.
3. Verify that the node is healthy and a part of the cluster.

Caution

Mirantis recommends upgrading in the following order: MCR, MKE, MSR. This topic is limited to the upgrade instructions for MKE.

To perform cluster verifications:

Verify that your cluster is in a healthy state, as it will be easier to troubleshoot should a problem occur.
Create a backup of your cluster, thus allowing you to recover should something go wrong during the upgrade process.
Verify that the Docker engine is running on all MKE cluster nodes.

Note

You cannot use the backup archive during the upgrade process, as it is version specific. For example, if you create a backup archive for an MKE 3.4.2 cluster, you cannot use the archive file after you upgrade to MKE 3.4.4.

Perform the upgrade¶

This topic describes the following three different methods of upgrading MKE:

Automated in-place cluster upgrade
Phased in-place cluster upgrade
Replace existing worker nodes using blue-green deployment

Note

To upgrade MKE on machines that are not connected to the Internet, refer to Install MKE offline to learn how to download the MKE package for offline installation.

In all three methods, manager nodes are automatically upgraded in place. You cannot control the order of manager node upgrades. For each worker node that requires an upgrade, you can upgrade that node in place or you can replace the node with a new worker node. The type of upgrade you perform depends on what is needed for each node.

Consult the following table to determine which method is right for you:

Upgrade method	Description
Automated in-place cluster upgrade	Performed on any manager node. This method automatically upgrades the entire cluster.
Phased in-place cluster upgrade	Automatically upgrades manager nodes and allows you to control the upgrade order of worker nodes. This type of upgrade is more advanced than the automated in-place cluster upgrade.
Replace existing worker nodes using blue-green deployment	This type of upgrade allows you to stand up a new cluster in parallel to the current one and switch over when the upgrade is complete. It requires that you join new worker nodes, schedule workloads to run on them, pause, drain, and remove old worker nodes in batches (rather than one at a time), and shut down servers to remove worker nodes. This is the most advanced upgrade method.

Automated in-place cluster upgrade¶

This is the standard method of upgrading MKE. It updates all MKE components on all nodes within the MKE cluster one-by-one until the upgrade is complete, and is thus not ideal for those needing to upgrade their worker nodes in a particular order.

Verify that all MCR instances have been upgraded to the corresponding new version.
SSH into one MKE manager node and run the following command (do not run this command on a workstation with a client bundle):
```
docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 \
upgrade \
--interactive
```
The upgrade command will print messages as it automatically upgrades MKE on all nodes in the cluster.

Phased in-place cluster upgrade¶

This method allows granular control of the MKE upgrade process by first upgrading a manager node and then allowing you to upgrade worker nodes manually in the order that you select. This allows you to migrate workloads and control traffic while upgrading. You can temporarily run MKE worker nodes with different versions of MKE and MCR.

This method allows you to handle failover by adding additional worker node capacity during an upgrade. You can add worker nodes to a partially-upgraded cluster, migrate workloads, and finish upgrading the remaining worker nodes.

Verify that all MCR instances have been upgraded to the corresponding new version.
SSH into one MKE manager node and run the following command (do not run this command on a workstation with a client bundle):
```
docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 \
upgrade \
--manual-worker-upgrade \
--interactive
```
The --manual-worker-upgrade flag allows MKE to upgrade only the manager nodes. It adds an upgrade-hold label to all worker nodes, which prevents MKE from upgrading each worker node until you remove the label.
Optional. Join additional worker nodes to your cluster:
```
docker swarm join --token SWMTKN-<swarm-token> <manager-ip>:2377
```
For more information, refer to Join Linux nodes.

Note

New worker nodes will already have the newer version of MCR and MKE installed when they join the cluster.

Remove the upgrade-hold label from each worker node to upgrade:

docker node update --label-rm com.docker.ucp.upgrade-hold \
<node-name-or-id>
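
When you upgrade worker nodes in batches, the label removal can be wrapped in a small loop. The following is a minimal sketch: `release_upgrade_hold` is a hypothetical helper name, and the node names you pass to it are your own. It stops at the first node for which the update fails.

```shell
# release_upgrade_hold NODE...: remove the upgrade-hold label from each listed
# worker node so that MKE proceeds to upgrade it.
release_upgrade_hold() {
  for node in "$@"; do
    docker node update --label-rm com.docker.ucp.upgrade-hold "$node" || return 1
  done
}
```

For example, `release_upgrade_hold worker-1 worker-2` releases a first batch of two nodes (the names are placeholders). You can list worker node names with `docker node ls --filter role=worker --format '{{.Hostname}}'`.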

Replace existing worker nodes using blue-green deployment¶

This method creates a parallel environment for a new deployment, which reduces downtime, upgrades worker nodes without disrupting workloads, and allows you to migrate traffic to the new environment with worker node rollback capability.

Note

You do not have to replace all worker nodes in the cluster at one time, but can instead replace them in groups.

Verify that all MCR instances have been upgraded to the corresponding new version.
SSH into one MKE manager node and run the following command (do not run this command on a workstation with a client bundle):
```
docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.16 \
upgrade \
--manual-worker-upgrade \
--interactive
```
The --manual-worker-upgrade flag allows MKE to upgrade only the manager nodes. It adds an upgrade-hold label to all worker nodes, which prevents MKE from upgrading each worker node until the label is removed.
Join additional worker nodes to your cluster:
```
docker swarm join --token SWMTKN-<swarm-token> <manager-ip>:2377
```
For more information, refer to Join Linux nodes.

Note

New worker nodes will already have the newer version of MCR and MKE installed when they join the cluster.

Join MCR to the cluster:

docker swarm join --token SWMTKN-<your-token> <manager-ip>:2377

Pause all existing worker nodes to ensure that MKE does not deploy new workloads on existing nodes:
```
docker node update --availability pause <node-name>
```
Drain the paused nodes in preparation for migrating your workloads:
```
docker node update --availability drain <node-name>
```
Note

MKE automatically reschedules workloads onto new nodes while existing nodes are paused.
On each fully-drained worker node, leave the swarm:
```
docker swarm leave
```
From a manager node, remove each old worker node after it has left the swarm and become unresponsive:
```
docker node rm <node-name>
```
From any manager node, remove old MKE agents after the upgrade is complete, including s390x and Windows agents carried over from the previous install:
```
docker service rm ucp-agent
docker service rm ucp-agent-win
docker service rm ucp-agent-s390x
```
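
The pause and drain steps of the blue-green procedure can be applied to a batch of old worker nodes in one pass. The following is a minimal sketch: `set_availability` and `retire_workers` are hypothetical helper names, and the node names are your own.

```shell
# set_availability STATE NODE...: apply one availability state to each listed node.
set_availability() {
  state=$1; shift
  for node in "$@"; do
    docker node update --availability "$state" "$node" || return 1
  done
}

# retire_workers NODE...: pause every listed node first, then drain every listed node.
retire_workers() {
  set_availability pause "$@" && set_availability drain "$@"
}
```

Pausing the whole batch before draining any node keeps the workloads that the drain evicts from landing on another old node in the same batch.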

Troubleshoot the upgrade process¶

This topic describes common problems and errors that occur during the upgrade process and how to identify and resolve them.

To check for multiple conflicting upgrades:

The upgrade command automatically checks for multiple ucp-worker-agents, the existence of which can indicate that the cluster is still undergoing a prior manual upgrade. You must resolve the conflicting node labels before proceeding with the upgrade.

To resolve upgrade failures:

You can resolve upgrade failures on worker nodes by changing the node labels back to the previous version, but this is not supported on manager nodes.

To check Kubernetes errors:

For more information on anything that might have gone wrong during the upgrade process, check Kubernetes errors in node state messages after the upgrade is complete.

Upgrade nodes to Windows Server 2022¶

You can upgrade your cluster to use Windows Server 2022 nodes in one of two ways. The approach that Mirantis recommends is to join nodes that have a fresh installation of Windows Server 2022, whereas the alternative is to perform an in-place upgrade of existing Windows Server 2019 nodes.

Approach #1 (Recommended): Join new Windows Server 2022 nodes¶

The preferred method for upgrading to Windows Server 2022 is to first add new nodes that are set to run the new operating system, and then remove the Windows Server 2019 nodes that the new nodes are meant to replace. You can do this by adding all of the new nodes prior to removing their original counterparts, or you can perform the operation one node at a time, as shown in the following procedure:

Join a new Windows Server 2022 node¶

Verify that the workloads you plan to run on the Windows Server 2022 nodes run on Windows Server 2022 images.
Join a Windows Server 2022 worker node to your cluster.
Apply a constraint or nodeSelector in order to run the required workloads on the Windows Server 2022 node:
- Swarm:
  
  Add the following constraint to your workloads:
```
"node.labels.OsVersion == 10.0.20348"
```
  For more information, refer to Add or remove a service constraint using the MKE web UI.
- Kubernetes:
  
  Update your workload nodeSelector field to:
```
nodeSelector:
  kubernetes.io/os: windows
  node.kubernetes.io/windows-build: 10.0.20348
```
Verify that your workloads are running on the Windows Server 2022 node.

Remove an existing Windows Server 2019 node¶

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required Windows Server 2019 node.
In the upper right, select the Edit Node icon.
In the Availability section, click Drain.
Click Save to evict the workloads from the node.
In the upper right, select the vertical ellipsis and click Remove.
Click Confirm.

Note

If you are planning to run only Windows Server 2022 nodes, you can remove any added constraints or nodeSelectors. If, though, you plan to run a combination of Windows Server 2022 and Windows Server 2019 nodes, keep your constraints or nodeSelectors in place and add them to any future workloads. Refer to Operate a hybrid Windows cluster for more information.

Approach #2: Upgrade existing Windows Server nodes¶

While it is not recommended, you can upgrade to Windows Server 2022 by performing an in-place upgrade of the existing Windows Server 2019 nodes.

Upgrade existing Windows Server nodes¶

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required Windows Server 2019 node.
In the upper right, select the Edit Node icon.
In the Availability section, click Drain.
Click Save to evict the workloads from the node.
Upgrade the node from Windows Server 2019 to Windows Server 2022.
- Windows full version nodes:
  
  Connect to the node and use the Windows UI to perform the upgrade. For instructions, refer to Perform an in-place upgrade of Windows Server in the Microsoft documentation.
- Windows core version nodes:
  1. Mount the ISO for Windows Server 2022.
    
    If you are using a physical server, insert a drive that has the Windows Server 2022 installation media installed. Otherwise, upload the ISO to the server and mount the image.
    
    Note
    
    Windows core version users can mount the ISO in PowerShell using Mount-DiskImage -ImagePath "path".
  2. Navigate to the drive where the ISO is mounted and run setup.exe to launch the setup wizard.
  3. Follow the steps offered in the Microsoft documentation, Perform an in-place upgrade of Windows Server.
Once the upgrade completes, remove all the MKE images on the node and re-pull them. Docker will automatically pull the image versions that are built for Windows Server 2022.

Note

To obtain the list of required images, refer to Configure the Docker daemon for Windows nodes.
If ucp-worker-agent-win is not running on the node, refer to the Troubleshoot the upgrade process section that follows.
Return to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.
In the upper right, select the Edit Node icon.
In the Availability section, click Active.
Click Save.

Troubleshoot the upgrade process¶

If ucp-worker-agent-win is not running on the node, use Docker Swarm to rerun the service on the node:
```
docker service update ucp-worker-agent-win-x
```
If ucp-worker-agent-win is still not running on the node, the cause may be an operating system mismatch, which can occur when registry keys fail to update during the Windows upgrade process.
1. Review the output of the following command, looking for references to Windows Server 2019 or build number 17763:
```
Get-ItemProperty "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion"
```
2. Update any out-of-date registry keys:
```
Set-Itemproperty -path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\' -Name CurrentBuildNumber -value 20348
```
Return to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.
In the upper right, select the Edit Node icon.
In the Availability section, click Active.
Click Save.

Migrate an MKE cluster to a new OS¶

MKE supports the use of a node-replacement strategy to migrate an active cluster to any supported Linux OS.

Migrate manager nodes¶

When migrating manager nodes, Mirantis recommends that you replace one manager node at a time, to preserve fault tolerance and minimize performance impact.

Add a node that is running the new OS to your MKE cluster.
Promote the new node to an MKE manager and wait until the node becomes healthy.
Demote a manager node that is running the old OS.
Remove the demoted node from the cluster.
Repeat the previous steps until all manager nodes are running the new OS.

Migrate worker nodes¶

It is not necessary to migrate worker nodes one at a time.

Deploy applications with Swarm¶

Deploy a single-service application¶

This topic describes how to use both the MKE web UI and the CLI to deploy an NGINX web server and make it accessible on port 8000.

To deploy a single-service application using the MKE web UI:

Log in to the MKE web UI.
Navigate to Swarm > Services and click Create a service.
In the Service Name field, enter nginx.
In the Image Name field, enter nginx:latest.
Navigate to Network > Ports and click Publish Port.
In the Target port field, enter 80.
In the Protocol field, enter tcp.
In the Publish mode field, enter Ingress.
In the Published port field, enter 8000.
Click Confirm to map the ports for the NGINX service.
Specify the service image and ports.
Click Create to deploy the service into the MKE cluster.

To view the default NGINX page through the MKE web UI:

Navigate to Swarm > Services.
Click nginx.
Click Published Endpoints.
Click the link to open a new tab with the default NGINX home page.

To deploy a single service using the CLI:

Verify that you have downloaded and configured the client bundle.

Deploy the single-service application:

docker service create --name nginx \
--publish mode=ingress,target=80,published=8000 \
--label com.docker.ucp.access.owner=<your-username> \
nginx

View the default NGINX page by visiting http://<node-ip>:8000.
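
Before you open the page, you may want to confirm that the service tasks are running. The following is a minimal sketch: `replicas_ready` is a hypothetical helper that interprets the `running/desired` string that `docker service ls` reports in its Replicas column.

```shell
# replicas_ready REPLICAS: succeed when a "running/desired" string, such as "1/1",
# shows at least one running task and as many running tasks as desired.
replicas_ready() {
  running=${1%%/*}
  desired=${1#*/}
  desired=${desired%% *}   # drop any trailing annotation after the count
  [ "$running" -gt 0 ] 2>/dev/null && [ "$running" -eq "$desired" ] 2>/dev/null
}
```

For example, `replicas_ready "$(docker service ls --filter name=nginx --format '{{.Replicas}}')" && echo "nginx is up"` prints the message once the service reports all of its tasks as running.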

Deploy a multi-service application¶

This topic describes how to use both the MKE web UI and the CLI to deploy a multi-service application for voting on whether you prefer cats or dogs.

To deploy a multi-service application using the MKE web UI:

Log in to the MKE web UI.
Navigate to Shared Resources > Stacks and click Create Stack.
In the Name field, enter voting-app.
Under ORCHESTRATOR MODE, select Swarm Services and click Next.

In the Add Application File editor, paste the following application definition written in the docker-compose.yml format:

version: "3"
services:

  # A Redis key-value store to serve as message queue
  redis:
    image: redis:alpine
    ports:
      - "6379"
    networks:
      - frontend

  # A PostgreSQL database for persistent storage
  db:
    image: postgres:9.4
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend
    environment:
      - POSTGRES_PASSWORD=<password>

  # Web UI for voting
  vote:
    image: dockersamples/examplevotingapp_vote:before
    ports:
      - 5000:80
    networks:
      - frontend
    depends_on:
      - redis

  # Web UI to count voting results
  result:
    image: dockersamples/examplevotingapp_result:before
    ports:
      - 5001:80
    networks:
      - backend
    depends_on:
      - db

  # Worker service to read from message queue
  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend

networks:
  frontend:
  backend:

volumes:
  db-data:

Click Create to deploy the stack.
In the list on the Shared Resources > Stacks page, verify that the application is deployed by looking for voting-app. If the application is in the list, it is deployed.
To view the individual application services, click voting-app and navigate to the Services tab.
Cast votes by accessing the service on port 5000.

Caution

MKE does not support referencing external files when using the MKE web UI to deploy applications, and thus does not support the following keywords:
- build
- dockerfile
- env_file
You must use a version control system to store the stack definition used to deploy the stack, as MKE does not store the stack definition.

To deploy a multi-service application using the MKE CLI:

Download and configure the client bundle.

Create a file named docker-compose.yml with the following content:

version: "3"
services:

  # A Redis key-value store to serve as message queue
  redis:
    image: redis:alpine
    ports:
      - "6379"
    networks:
      - frontend

  # A PostgreSQL database for persistent storage
  db:
    image: postgres:9.4
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend
    environment:
      - POSTGRES_PASSWORD=<password>

  # Web UI for voting
  vote:
    image: dockersamples/examplevotingapp_vote:before
    ports:
      - 5000:80
    networks:
      - frontend
    depends_on:
      - redis

  # Web UI to count voting results
  result:
    image: dockersamples/examplevotingapp_result:before
    ports:
      - 5001:80
    networks:
      - backend
    depends_on:
      - db

  # Worker service to read from message queue
  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend

networks:
  frontend:
  backend:

volumes:
  db-data:

Create the application using the command that corresponds to your orchestrator:

Built-in Swarm orchestrator:

docker stack deploy --compose-file docker-compose.yml voting-app

Classic Swarm orchestrator:

docker-compose --file docker-compose.yml --project-name voting-app up -d

Verify that the application is deployed:
```
docker stack ps voting-app
```
Cast votes by accessing the service on port 5000.

Deploy services to a Swarm collection¶

This topic describes how to use both the CLI and a Compose file to deploy application resources to a particular Swarm collection. Attach the Swarm collection path to the service access label to assign the service to the required collection. MKE automatically assigns new services to the default collection unless you use either of the methods presented here to assign a different Swarm collection.

Caution

To assign services to Swarm collections, an administrator must first create the Swarm collection and grant the user access to the required collection. Otherwise the deployment will fail.

Note

If required, you can place application resources into multiple collections.

To deploy a service to a Swarm collection using the CLI:

Use docker service create to deploy your service to a collection:

docker service create \
--name <service-name> \
--label com.docker.ucp.access.label="</collection/path>" \
<app-name>:<version>
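
If you deploy to collections often, the command can be wrapped in a helper that rejects malformed collection paths before it calls Docker. The following is a minimal sketch: `deploy_to_collection` is a hypothetical helper name, and the leading-slash check is an assumption based on the collection paths used in this documentation, such as /Shared/wordpress.

```shell
# deploy_to_collection NAME COLLECTION IMAGE: create a service whose access label
# assigns it to the given Swarm collection.
deploy_to_collection() {
  name=$1; collection=$2; image=$3
  case $collection in
    /*) ;;   # assumed shape of a collection path
    *) echo "collection path must start with /: $collection" >&2; return 2 ;;
  esac
  docker service create \
    --name "$name" \
    --label com.docker.ucp.access.label="$collection" \
    "$image"
}
```

For example, `deploy_to_collection wordpress /Shared/wordpress wordpress:latest` deploys the service to the /Shared/wordpress collection, provided that the collection exists and you have a grant for it.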

To deploy a service to a Swarm collection using a Compose file:

Use a labels: dictionary in a Compose file and add the Swarm collection path to the com.docker.ucp.access.label key.

The following example specifies two services, WordPress and MySQL, and assigns /Shared/wordpress to their access labels:

version: '3.1'

services:

  wordpress:
    image: wordpress
    networks:
      - wp
    ports:
      - 8080:80
    environment:
      WORDPRESS_DB_PASSWORD: example
    deploy:
      labels:
        com.docker.ucp.access.label: /Shared/wordpress
  mysql:
    image: mysql:5.7
    networks:
      - wp
    environment:
      MYSQL_ROOT_PASSWORD: example
    deploy:
      labels:
        com.docker.ucp.access.label: /Shared/wordpress

networks:
  wp:
    driver: overlay
    labels:
      com.docker.ucp.access.label: /Shared/wordpress

Log in to the MKE web UI.
Navigate to Shared Resources > Stacks and click Create Stack.
Name the application wordpress.
Under ORCHESTRATOR MODE, select Swarm Services and click Next.
In the Add Application File editor, paste the Compose file.
Click Create to deploy the application.
Click Done when the deployment completes.

Note

MKE reports an error if the /Shared/wordpress collection does not exist or if you do not have a grant for accessing it.

To confirm that the service deployed to the correct Swarm collection:

Navigate to Shared Resources > Stacks and select your application.
Navigate to the Services tab and select the required service.
On the details page, verify that the service is assigned to the correct Swarm collection.

Note

MKE creates a default overlay network for your stack that attaches to each container you deploy. This works well for administrators and those assigned full control roles. If you have lesser permissions, define a custom network with the same com.docker.ucp.access.label label as your services and attach this network to each service. This correctly groups your network with the other resources in your stack.

Use secrets in Swarm deployments¶

This topic describes how to create and use secrets with MKE by showing you how to deploy a WordPress application that uses a secret for storing a plaintext password. Other sensitive information you might use a secret to store includes TLS certificates and private keys. MKE allows you to securely store secrets and configure who can access and manage them using role-based access control (RBAC).

The application you will create in this topic includes the following two services:

wordpress
Apache, PHP, and WordPress
wordpress-db
MySQL database

The following example stores a password in a secret, and the secret is stored in a file inside the container that runs the services you will deploy. The services have access to the file, but no one else can see the plaintext password. To make things simple, you will not configure the database to persist data, and thus when the service stops, the data is lost.

To create a secret:

Log in to the MKE web UI.
Navigate to Swarm > Secrets and click Create.

Note

After you create the secret, you will not be able to edit or see the secret again.
Name the secret wordpress-password-v1.
In the Content field, assign a value to the secret.
Optional. Define a permission label so that other users can be given permission to use this secret.

Note

To use services and secrets together, they must either have the same permission label or no label at all.

To create a network for your services:

Navigate to Swarm > Networks and click Create.
Create a network called wordpress-network with the default settings.

To create the MySQL service:

Navigate to Swarm > Services and click Create.
Under Service Details, name the service wordpress-db.
Under Task Template, enter mysql:5.7.
In the left-side menu, navigate to Network, click Attach Network +, and select wordpress-network from the drop-down.
In the left-side menu, navigate to Environment, click Use Secret +, and select wordpress-password-v1 from the drop-down.
Click Confirm to associate the secret with the service.
Scroll down to Environment variables and click Add Environment Variable +.
Enter the following string to create an environment variable that contains the path to the password file in the container:
```
MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v1
```
If you specified a permission label on the secret, you must set the same permission label on this service.
Click Create to deploy the MySQL service.

This creates a MySQL service that is attached to the wordpress-network network and that uses the wordpress-password-v1 secret. By default, this creates a file with the same name in /run/secrets/<secret-name> inside the container running the service.

We also set the MYSQL_ROOT_PASSWORD_FILE environment variable to configure MySQL to use the content of the /run/secrets/wordpress-password-v1 file as the root password.

To create the WordPress service:

Navigate to Swarm > Services and click Create.
Under Service Details, name the service wordpress.
Under Task Template, enter wordpress:latest.
In the left-side menu, navigate to Network, click Attach Network +, and select wordpress-network from the drop-down.
In the left-side menu, navigate to Environment, click Use Secret +, and select wordpress-password-v1 from the drop-down.
Click Confirm to associate the secret with the service.
Scroll down to Environment variables and click Add Environment Variable +.
Enter the following string to create an environment variable that contains the path to the password file in the container:
```
WORDPRESS_DB_PASSWORD_FILE=/run/secrets/wordpress-password-v1
```
Add another environment variable and enter the following string:
```
WORDPRESS_DB_HOST=wordpress-db:3306
```
If you specified a permission label on the secret, you must set the same permission label on this service.
Click Create to deploy the WordPress service.

This creates a WordPress service that is attached to the same network as the MySQL service so that they can communicate, and maps the port 80 of the service to port 8000 of the cluster routing mesh.

Once you deploy this service, you will be able to access it on port 8000 using the IP address of any node in your MKE cluster.
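
For reference, the same deployment can be approximated from the command line. The following is a minimal sketch rather than the documented procedure. It assumes a Docker CLI that is already pointed at the MKE cluster (for example, through a client bundle), it omits permission labels, and it publishes port 8000 to match the port mapping described above. The function names and the DOCKER variable are illustrative only; setting DOCKER=echo prints the commands instead of running them.

```shell
# Sketch only: CLI approximation of the web UI steps above.
# DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Create the overlay network shared by both services.
create_wordpress_network() {
  $DOCKER network create --driver overlay wordpress-network
}

# Create the secret from standard input, so the password is never an argument.
create_wordpress_secret() {
  $DOCKER secret create wordpress-password-v1 -
}

# MySQL service: reads its root password from the mounted secret file.
create_wordpress_db() {
  $DOCKER service create \
    --name wordpress-db \
    --network wordpress-network \
    --secret wordpress-password-v1 \
    --env MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v1 \
    mysql:5.7
}

# WordPress service: same secret, same network, port 80 published as 8000.
create_wordpress() {
  $DOCKER service create \
    --name wordpress \
    --network wordpress-network \
    --secret wordpress-password-v1 \
    --env WORDPRESS_DB_PASSWORD_FILE=/run/secrets/wordpress-password-v1 \
    --env WORDPRESS_DB_HOST=wordpress-db:3306 \
    --publish 8000:80 \
    wordpress:latest
}
```

A typical sequence would be create_wordpress_network, then piping the password into create_wordpress_secret, then create_wordpress_db and create_wordpress.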

To update a secret:

If the secret is compromised, you need to change it, update the services that use it, and delete the old secret.

Create a new secret named wordpress-password-v2.
From Swarm > Secrets, select the wordpress-password-v1 secret to view all the services that you need to update. In this example, it is straightforward, but that will not always be the case.
Update wordpress-db to use the new secret.
Update the MYSQL_ROOT_PASSWORD_FILE environment variable with either of the following methods:
- Update the environment variable directly with the following:
```
MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v2
```
- Mount the secret file in /run/secrets/wordpress-password-v1 by setting the Target Name field with wordpress-password-v1. This mounts the file with the wordpress-password-v2 content in /run/secrets/wordpress-password-v1.
Delete the wordpress-password-v1 secret and click Update.
Repeat the foregoing steps for the WordPress service.
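
The second method above (keeping the original target name) has a command-line counterpart. The following sketch assumes a Docker CLI connected to the cluster and that the wordpress-password-v2 secret already exists. The rotate_secret function and the DOCKER variable are hypothetical helpers, not part of MKE.

```shell
# Sketch only. DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# rotate_secret SERVICE OLD_SECRET NEW_SECRET
# Detaches OLD_SECRET and mounts NEW_SECRET under the old file name, so any
# *_FILE environment variable that points at /run/secrets/OLD_SECRET stays valid.
rotate_secret() {
  $DOCKER service update \
    --secret-rm "$2" \
    --secret-add "source=$3,target=$2" \
    "$1"
}
```

For example, rotate_secret wordpress-db wordpress-password-v1 wordpress-password-v2 followed by the same call for the wordpress service. Once no service references the old secret, it can be deleted.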

Interlock¶

Layer 7 routing¶

MKE includes a system for application-layer (layer 7) routing that offers both application routing and load balancing (ingress routing) for Swarm orchestration. The Interlock architecture leverages Swarm components to provide scalable layer 7 routing and Layer 4 VIP mode functionality.

Swarm mode provides MCR with a routing mesh, which enables users to access services using the IP address of any node in the swarm. Layer 7 routing enables you to access services through any node in the swarm by using a domain name, with Interlock routing the traffic to the node with the relevant container.

Interlock uses the Docker remote API to automatically configure extensions such as NGINX and HAProxy for application traffic. Interlock is designed for:

Full integration with MCR, including Swarm services, secrets, and configs
Enhanced configuration, including context roots, TLS, zero downtime deployment, and rollback
Support through extensions for external load balancers, such as NGINX, HAProxy, and F5
Least privilege for extensions, such that they have no Docker API access

Note

Interlock and layer 7 routing are used for Swarm deployments. Refer to NGINX Ingress Controller for information on routing traffic to your Kubernetes applications.


Terminology¶

Cluster: A group of compute resources running MKE
Swarm: An MKE cluster running in Swarm mode
Upstream: An upstream container that serves an application
Proxy service: A service, such as NGINX, that provides load balancing and proxying
Extension service: A secondary service that configures the proxy service
Service cluster: A combined Interlock extension and proxy service
gRPC: A high-performance RPC framework

Interlock services¶

Interlock

The central piece of the layer 7 routing solution. The core service is responsible for interacting with the Docker remote API and building an upstream configuration for the extensions. Interlock uses the Docker API to monitor events and to manage the extension and proxy services, and it serves the upstream configuration on a gRPC API that the extensions are configured to access.

Interlock manages extension and proxy service updates for both configuration changes and application service deployments. There is no operator intervention required.

The Interlock service starts a single replica on a manager node. The Interlock extension service runs a single replica on any available node, and the Interlock proxy service starts two replicas on any available node. Interlock prioritizes replica placement in the following order:

Replicas on the same worker node
Replicas on different worker nodes
Replicas on any available nodes, including managers

Interlock extension

A secondary service that queries the Interlock gRPC API for the upstream configuration. The extension service configures the proxy service according to the upstream configuration. For proxy services that use configuration files, such as NGINX or HAProxy, the extension service generates the file and sends it to Interlock using the gRPC API. Interlock then updates the corresponding Docker configuration object for the proxy service.

Interlock proxy

A proxy and load-balancing service that handles requests for the upstream application services. Interlock configures these using the data created by the corresponding extension service. By default, this service is a containerized NGINX deployment.

Features and benefits¶

High availability: All layer 7 routing components are failure-tolerant and leverage Docker Swarm for high availability.
Automatic configuration: Interlock uses the Docker API for automatic configuration, without needing you to manually update or restart anything to make services available. MKE monitors your services and automatically reconfigures proxy services.
Scalability: Interlock uses a modular design with a separate proxy service, allowing an operator to individually customize and scale the proxy layer to handle user requests and meet service demands, with transparency and no downtime for users.
TLS: You can leverage Docker secrets to securely manage TLS certificates and keys for your services. Interlock supports both TLS termination and TCP passthrough.
Context-based routing: Interlock supports advanced application request routing by context or path.
Host mode networking: Layer 7 routing leverages the Docker Swarm routing mesh by default, but Interlock also supports running proxy and application services in host mode networking, allowing you to bypass the routing mesh completely, thus promoting maximum application performance.
Security: The layer 7 routing components that are exposed to the outside world run on worker nodes, thus your cluster will not be affected if they are compromised.
SSL: Interlock leverages Docker secrets to securely store and use SSL certificates for services, supporting both SSL termination and TCP passthrough.
Blue-green and canary service deployment: Interlock supports blue-green service deployment, which allows an operator to deploy a new version of an application while the current version is still serving traffic. Once the new version is verified to handle traffic correctly, the operator can scale the older version to zero. If there is a problem, the operation is easy to reverse.
Service cluster support: Interlock supports multiple extension and proxy service combinations, thus allowing operators to partition load balancing resources for use in, for example, region- or organization-based load balancing.
Least privilege: Interlock supports being deployed where the load balancing proxies do not need to be colocated with a Swarm manager. This is a more secure approach to deployment as it ensures that the extension and proxy services do not have access to the Docker API.

Single Interlock deployment¶

When an application image is updated, the following actions occur:

The service is updated with a new version of the application.
The default “stop-first” policy stops the first replica before scheduling the second. The Interlock proxies remove ip1.0 from the backend pool as the app.1 task is removed.
The first application task is rescheduled with the new image after the first task stops.
The Interlock proxy.1 is then rescheduled with the new NGINX configuration that contains the update for the new app.1 task.
After proxy.1 is complete, proxy.2 redeploys with the updated NGINX configuration for the app.1 task.
In this scenario, the amount of time that the service is unavailable is less than 30 seconds.

Optimizing Interlock for applications¶

Application update order¶

Swarm provides control over the order in which old tasks are removed while new ones are created. This is controlled at the service level with --update-order.

stop-first (default) - Configures the currently updating task to stop before the new task is scheduled.
start-first - Configures the current task to stop after the new task has been scheduled. This guarantees that the new task is running before the old task shuts down.

Use start-first if …

You have a single application replica and you cannot have service interruption. Both the old and new tasks run simultaneously during the update, but this ensures that there is no gap in service during the update.

Use stop-first if …

Old and new tasks of your service cannot serve clients simultaneously.
You do not have enough cluster resources to run old and new replicas simultaneously.

In most cases, start-first is the best choice because it optimizes for high availability during updates.

Application update delay¶

Swarm services use update-delay to control the speed at which a service is updated. This adds a timed delay between application tasks as they are updated. The delay controls the time between the first task of a service transitioning to a healthy state and the second task beginning its update. The default is 0 seconds, which means that a replica task begins updating as soon as the previously updated task transitions into a healthy state.

Use update-delay if …

You are optimizing for the least number of dropped connections and a longer update cycle is an acceptable tradeoff.
Interlock update convergence takes a long time in your environment (this can occur when you have a large number of overlay networks).

Do not use update-delay if …

Service updates must occur rapidly.
Old and new tasks of your service cannot serve clients simultaneously.
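
Both settings described above are applied per service. The following sketch shows one way to set them together with docker service update. The set_update_policy function and the DOCKER variable are illustrative names, and the values passed in are examples rather than recommendations.

```shell
# Sketch only. DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# set_update_policy SERVICE ORDER DELAY
# ORDER is start-first or stop-first; DELAY is a duration such as 0s or 30s.
set_update_policy() {
  case "$2" in
    start-first|stop-first) ;;
    *) echo "unknown update order: $2" >&2; return 1 ;;
  esac
  $DOCKER service update --update-order "$2" --update-delay "$3" "$1"
}
```

For example, set_update_policy wordpress start-first 30s keeps the old task running until the new one is scheduled and waits 30 seconds between task updates.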

Use application health checks¶

Swarm uses application health checks extensively to ensure that its updates do not cause service interruption. health-cmd can be configured in a Dockerfile or compose file to define a method for health checking an application. Without health checks, Swarm cannot determine when an application is truly ready to service traffic and will mark it as healthy as soon as the container process is running. This can potentially send traffic to an application before it is capable of serving clients, leading to dropped connections.
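
Besides a Dockerfile or compose file, a health check can also be attached to an existing service through service flags. The sketch below assumes that the image contains curl and that the application answers HTTP requests on the given URL. The interval, timeout, and retry values are illustrative, and add_http_health_check is a hypothetical helper.

```shell
# Sketch only. DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# add_http_health_check SERVICE URL
# Marks a task healthy only after URL responds successfully inside the container.
add_http_health_check() {
  $DOCKER service update \
    --health-cmd "curl -fsS $2 || exit 1" \
    --health-interval 10s \
    --health-timeout 3s \
    --health-retries 3 \
    "$1"
}
```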

Application stop grace period¶

Use stop-grace-period to configure the maximum time to wait before force-killing a task (default: 10 seconds). In short, under the default setting a task can continue to run for no more than 10 seconds once its shutdown cycle has been initiated. This benefits applications that require long periods to process requests, allowing connections to terminate normally.

Interlock optimizations¶

Use service clusters for Interlock segmentation¶

Interlock service clusters allow Interlock to be segmented into multiple logical instances called “service clusters”, which have independently managed proxies. Application traffic only uses the proxies for a specific service cluster, allowing the full segmentation of traffic. Each service cluster only connects to the networks using that specific service cluster, which reduces the number of overlay networks to which proxies connect. Because service clusters also deploy separate proxies, this also reduces the amount of churn in LB configs when there are service updates.

Minimizing number of overlay networks¶

Interlock proxy containers connect to the overlay network of every Swarm service. Having many networks connected to Interlock adds incremental delay when Interlock updates its load balancer configuration. Each network connected to Interlock generally adds 1-2 seconds of update delay. With many networks, the Interlock update delay causes the LB config to be out of date for too long, which can cause traffic to be dropped.

Minimizing the number of overlay networks that Interlock connects to can be accomplished in the following ways:

Reduce the number of networks. If the architecture permits it, applications can be grouped together to use the same networks.
Use Interlock service clusters. By segmenting Interlock, service clusters also segment which networks are connected to Interlock, reducing the number of networks to which each proxy is connected.
Use admin-defined networks and limit the number of networks per service cluster.
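
As a rough worked example of the figure quoted above, the following sketch multiplies the number of attached networks by the 1-2 seconds that each network generally adds. The estimate_update_delay function is a hypothetical helper, and the result is only as accurate as that per-network estimate.

```shell
# estimate_update_delay NETWORKS
# Prints "MIN MAX": the estimated added update delay in seconds,
# using 1 second (low) and 2 seconds (high) per attached network.
estimate_update_delay() {
  echo "$(( $1 * 1 )) $(( $1 * 2 ))"
}
```

With 15 attached networks, the estimate is 15 to 30 seconds, which illustrates why grouping applications onto fewer networks or splitting proxies into service clusters shortens the window in which the load balancer configuration is stale.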

Use Interlock VIP Mode¶

VIP Mode can be used to reduce the impact of application updates on the Interlock proxies. It uses the Swarm L4 load balancing VIPs instead of individual task IPs to load balance traffic to a more stable internal endpoint. This prevents the proxy LB configs from changing for most kinds of app service updates, reducing churn for Interlock. The following features are not supported in VIP mode:

Sticky sessions
Websockets
Canary deployments

The following features are supported in VIP mode:

Host & context routing
Context root rewrites
Interlock TLS termination
TLS passthrough
Service clusters
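
VIP mode is selected per application service. This section does not show how, so the sketch below rests on an assumption: that the service label com.docker.lb.backend_mode with the value vip, as used in Interlock service-label conventions, applies to your MKE version. Verify the label name against the documentation for your release before relying on it. The enable_vip_mode function is a hypothetical helper.

```shell
# Sketch only. DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# enable_vip_mode SERVICE
# Assumption: Interlock reads the com.docker.lb.backend_mode service label.
enable_vip_mode() {
  $DOCKER service update --label-add com.docker.lb.backend_mode=vip "$1"
}
```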


Deploy¶

Deploy a layer 7 routing solution¶

This topic describes how to route traffic to Swarm services by deploying a layer 7 routing solution into a Swarm-orchestrated cluster. It has the following prerequisites:

MCR 17.06 or later
MKE in Swarm mode
Internet access (for offline installation instructions, refer to Offline installation considerations)

Enabling layer 7 routing causes the following to occur:

MKE creates the ucp-interlock overlay network.
MKE deploys the ucp-interlock service and attaches it both to the Docker socket and the overlay network that was created. This allows the Interlock service to use the Docker API, which is why this service needs to run on a manager node.
The ucp-interlock service starts the ucp-interlock-extension service and attaches it to the ucp-interlock network, allowing both services to communicate.
The ucp-interlock-extension generates a configuration for the proxy service to use. By default the proxy service is NGINX, so this service generates a standard NGINX configuration. MKE creates the com.docker.ucp.interlock.conf-1 configuration file and uses it to configure all the internal components of this service.
The ucp-interlock service takes the proxy configuration and uses it to start the ucp-interlock-proxy service.

Note

Layer 7 routing is disabled by default.

To enable layer 7 routing using the MKE web UI:

Log in to the MKE web UI as an administrator.
Navigate to <user-name> > Admin Settings.
Click Ingress.
Toggle the Swarm HTTP ingress slider to the right.
Optional. By default, the routing mesh service listens on port 8080 for HTTP and 8443 for HTTPS. Change these ports if you already have services using them.

The three primary Interlock services are the core service, the extensions, and the proxy. The following is the default MKE configuration, which is created automatically when you enable Interlock as described in this topic.

ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
AllowInsecure = false
PollInterval = "3s"

[Extensions]
  [Extensions.default]
    Image = "mirantis/ucp-interlock-extension:3.6.16"
    ServiceName = "ucp-interlock-extension"
    Args = []
    Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
    ProxyImage = "mirantis/ucp-interlock-proxy:3.6.16"
    ProxyServiceName = "ucp-interlock-proxy"
    ProxyConfigPath = "/etc/nginx/nginx.conf"
    ProxyReplicas = 2
    ProxyStopSignal = "SIGQUIT"
    ProxyStopGracePeriod = "5s"
    ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
    PublishMode = "ingress"
    PublishedPort = 8080
    TargetPort = 80
    PublishedSSLPort = 8443
    TargetSSLPort = 443
    [Extensions.default.Labels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.ContainerLabels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.ProxyLabels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.ProxyContainerLabels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.Config]
      Version = ""
      User = "nginx"
      PidPath = "/var/run/proxy.pid"
      MaxConnections = 1024
      ConnectTimeout = 5
      SendTimeout = 600
      ReadTimeout = 600
      IPHash = false
      AdminUser = ""
      AdminPass = ""
      SSLOpts = ""
      SSLDefaultDHParam = 1024
      SSLDefaultDHParamPath = ""
      SSLVerify = "required"
      WorkerProcesses = 1
      RLimitNoFile = 65535
      SSLCiphers = "HIGH:!aNULL:!MD5"
      SSLProtocols = "TLSv1.2"
      AccessLogPath = "/dev/stdout"
      ErrorLogPath = "/dev/stdout"
      MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t    '$status $body_bytes_sent \"$http_referer\" '\n\t\t    '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
      TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t    '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t    '\"$http_x_forwarded_for\" $request_id $msec $request_time '\n\t\t    '$upstream_connect_time $upstream_header_time $upstream_response_time';"
      KeepaliveTimeout = "75s"
      ClientMaxBodySize = "32m"
      ClientBodyBufferSize = "8k"
      ClientHeaderBufferSize = "1k"
      LargeClientHeaderBuffers = "4 8k"
      ClientBodyTimeout = "60s"
      UnderscoresInHeaders = false
      HideInfoHeaders = false

Note

The value of LargeClientHeaderBuffers indicates the number of buffers to use to read a large client request header, as well as the size of those buffers.
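
To make the two parts of the value concrete, the following sketch splits a setting such as "4 8k" into its buffer count and buffer size and prints the resulting upper bound in KiB. It assumes the size carries a k suffix, as in the default shown above. The header_buffer_total_kib function is a hypothetical helper.

```shell
# header_buffer_total_kib "COUNT SIZEk"
# Prints COUNT * SIZE, the maximum memory in KiB that the large header
# buffers can occupy for one connection. Only the "k" suffix is handled.
header_buffer_total_kib() {
  set -- $1              # split "4 8k" into "4" and "8k"
  count=$1
  size_kib=${2%k}        # strip the trailing "k"
  echo $(( count * size_kib ))
}
```

For the default value of "4 8k", this is four buffers of 8 KiB each, or 32 KiB in total.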

To enable layer 7 routing from the command line:

Interlock uses a TOML file for the core service configuration. The following example uses Swarm deployment and recovery features by creating a Docker config object.

Create a Docker config object:

cat << EOF | docker config create service.interlock.conf -
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "3s"

[Extensions]
  [Extensions.default]
    Image = "mirantis/ucp-interlock-extension:3.6.16"
    Args = ["-D"]
    ProxyImage = "mirantis/ucp-interlock-proxy:3.6.16"
    ProxyArgs = []
    ProxyConfigPath = "/etc/nginx/nginx.conf"
    ProxyReplicas = 1
    ProxyStopGracePeriod = "3s"
    ServiceCluster = ""
    PublishMode = "ingress"
    PublishedPort = 8080
    TargetPort = 80
    PublishedSSLPort = 8443
    TargetSSLPort = 443
    [Extensions.default.Config]
      User = "nginx"
      PidPath = "/var/run/proxy.pid"
      WorkerProcesses = 1
      RlimitNoFile = 65535
      MaxConnections = 2048
EOF
oqkvv1asncf6p2axhx41vylgt

Create a dedicated network for Interlock and the extensions:
```
docker network create --driver overlay ucp-interlock
```

Create the Interlock service:

docker service create \
--name ucp-interlock \
--mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
--network ucp-interlock \
--constraint node.role==manager \
--config src=service.interlock.conf,target=/config.toml \
mirantis/ucp-interlock:3.6.16 -D run -c /config.toml

Note

The Interlock core service must have access to a Swarm manager (--constraint node.role==manager); however, it is recommended that the extension and proxy services run on workers.

Verify that the three services are created: one for the Interlock service, one for the extension service, and one for the proxy service.

docker service ls
ID                  NAME                     MODE                REPLICAS            IMAGE                                                                PORTS
sjpgq7h621ex        ucp-interlock            replicated          1/1                 mirantis/ucp-interlock:3.6.16
oxjvqc6gxf91        ucp-interlock-extension  replicated          1/1                 mirantis/ucp-interlock-extension:3.6.16
lheajcskcbby        ucp-interlock-proxy      replicated          1/1                 mirantis/ucp-interlock-proxy:3.6.16        *:80->80/tcp *:443->443/tcp

Configure layer 7 routing for production¶

This topic describes how to configure Interlock for a production environment and builds upon the instructions in the previous topic, Deploy a layer 7 routing solution. It does not describe infrastructure deployment, and it assumes you are using a typical Swarm cluster, created using docker swarm init and docker swarm join from the nodes.

The layer 7 solution that ships with MKE is highly available, fault tolerant, and designed to work independently of how many nodes you manage with MKE.

The following procedures require that you dedicate two worker nodes for running the ucp-interlock-proxy service. This tuning ensures the following:

The proxy services have dedicated resources to handle user requests. You can configure these nodes with higher performance network interfaces.
No application traffic can be routed to a manager node, thus making your deployment more secure.
If one of the two dedicated nodes fails, layer 7 routing continues working.

To dedicate two nodes to running the proxy service:

Select two nodes that you will dedicate to running the proxy service.
Log in to one of the Swarm manager nodes.

Add labels to the two dedicated proxy service nodes, configuring them as load balancer worker nodes, for example, lb-00 and lb-01:

docker node update --label-add nodetype=loadbalancer lb-00
lb-00
docker node update --label-add nodetype=loadbalancer lb-01
lb-01

Verify that the labels were added successfully:

docker node inspect -f '{{ .Spec.Labels  }}' lb-00
map[nodetype:loadbalancer]
docker node inspect -f '{{ .Spec.Labels  }}' lb-01
map[nodetype:loadbalancer]
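
When more than a couple of nodes are involved, the label and verify steps above can be wrapped in a loop. This is a sketch under the same assumptions as the commands above (run from a manager node, node names lb-00 and lb-01 are examples). The label_loadbalancers function and the DOCKER variable are illustrative.

```shell
# Sketch only. DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# label_loadbalancers NODE...
# Adds nodetype=loadbalancer to each node, then prints the node's labels.
label_loadbalancers() {
  for node in "$@"; do
    $DOCKER node update --label-add nodetype=loadbalancer "$node" || return 1
    $DOCKER node inspect -f '{{ .Spec.Labels }}' "$node" || return 1
  done
}
```

For example, label_loadbalancers lb-00 lb-01 labels both nodes and stops at the first node for which either command fails.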

To update the proxy service:

You must update the ucp-interlock-proxy service configuration to deploy the proxy service properly constrained to the dedicated worker nodes.

From a manager node, add a constraint to the ucp-interlock-proxy service to update the running service:
```
docker service update --replicas=2 \
--constraint-add node.labels.nodetype==loadbalancer \
--stop-signal SIGQUIT \
--stop-grace-period=5s \
$(docker service ls -f 'label=type=com.docker.interlock.core.proxy' -q)
```
This updates the proxy service to have two replicas, ensures that they are constrained to the workers with the label nodetype==loadbalancer, and configures the stop signal for the tasks to be a SIGQUIT with a grace period of five seconds. This ensures that NGINX does not exit before the client request is finished.

Inspect the service to verify that the replicas have started on the selected nodes:

docker service ps $(docker service ls -f \
'label=type=com.docker.interlock.core.proxy' -q)

Example of system response:

ID            NAME                    IMAGE          NODE     DESIRED STATE   CURRENT STATE                   ERROR   PORTS
o21esdruwu30  interlock-proxy.1       nginx:alpine   lb-01    Running         Preparing 3 seconds ago
n8yed2gp36o6   \_ interlock-proxy.1   nginx:alpine   mgr-01   Shutdown        Shutdown less than a second ago
aubpjc4cnw79  interlock-proxy.2       nginx:alpine   lb-00    Running         Preparing 3 seconds ago

Add the constraint to the ProxyConstraints array in the interlock-proxy service configuration in case Interlock is restored from backup:

[Extensions]
  [Extensions.default]
    ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.nodetype==loadbalancer"]

Optional. By default, the config service is global, scheduling one task on every node in the cluster. To modify constraint scheduling, update the ProxyConstraints variable in the Interlock configuration file. Refer to Configure layer 7 routing service for more information.
Verify that the proxy service is running on the dedicated nodes:
```
docker service ps ucp-interlock-proxy
```
Update the settings in the upstream load balancer, such as ELB or F5, with the addresses of the dedicated ingress workers, thus directing all traffic to these two worker nodes.


Offline installation considerations¶

To install Interlock on your cluster without an Internet connection, you must have the required Docker images loaded on your computer. This topic describes how to export the required images from a local instance of MCR and then load them to your Swarm-orchestrated cluster.

To export Docker images from a local instance:

Using a local instance of MCR, save the required images:
```
docker save mirantis/ucp-interlock:3.6.16 > interlock.tar
docker save mirantis/ucp-interlock-extension:3.6.16 > interlock-extension-nginx.tar
docker save mirantis/ucp-interlock-proxy:3.6.16 > interlock-proxy-nginx.tar
```
This saves the following three files:
- interlock.tar - the core Interlock application.
- interlock-extension-nginx.tar - the Interlock extension for NGINX.
- interlock-proxy-nginx.tar - the official NGINX image based on Alpine.
Note

Replace mirantis/ucp-interlock-extension:3.6.16 and mirantis/ucp-interlock-proxy:3.6.16 with the corresponding extension and proxy image if you are not using NGINX.

Copy the three files you just saved to each node in the cluster and load each image:

docker load < interlock.tar
docker load < interlock-extension-nginx.tar
docker load < interlock-proxy-nginx.tar

Refer to Deploy a layer 7 routing solution to continue the installation.


Configure¶

Configure layer 7 routing service¶

This section describes how to customize layer 7 routing by updating the ucp-interlock service with a new Docker configuration, including configuration options and the procedure for creating a proxy service.

Configure the Interlock service¶

This topic describes how to update the ucp-interlock service with a new Docker configuration.

Obtain the current configuration for the ucp-interlock service and save it as a TOML file named config.toml:

CURRENT_CONFIG_NAME=$(docker service inspect --format \
'{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
ucp-interlock) && docker config inspect --format \
'{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > config.toml

Configure config.toml as required. Refer to Configuration file options for layer 7 routing for layer 7 routing customization options.

Create a new Docker configuration object from the config.toml file:

NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
docker config create $NEW_CONFIG_NAME config.toml
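
The naming step above increments the numeric suffix of the current configuration name. The following sketch isolates that logic in a small function that uses only POSIX parameter expansion, which can be easier to read and reuse. The next_config_name function is a hypothetical helper, and it assumes the name ends in a hyphen followed by a decimal number without leading zeros.

```shell
# next_config_name CURRENT_NAME
# Prints CURRENT_NAME with its trailing "-N" suffix replaced by "-(N+1)".
next_config_name() {
  prefix=${1%-*}       # everything before the last "-"
  version=${1##*-}     # the number after the last "-"
  echo "${prefix}-$(( version + 1 ))"
}
```

For example, NEW_CONFIG_NAME=$(next_config_name "$CURRENT_CONFIG_NAME") yields com.docker.ucp.interlock.conf-2 when the current name is com.docker.ucp.interlock.conf-1.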

Verify that the configuration was successfully created:

docker config ls --filter name=com.docker.ucp.interlock

Example output:

ID                          NAME                              CREATED          UPDATED
vsnakyzr12z3zgh6tlo9mqekx   com.docker.ucp.interlock.conf-1   6 hours ago      6 hours ago
64wp5yggeu2c262z6flhaos37   com.docker.ucp.interlock.conf-2   54 seconds ago   54 seconds ago

Optional. By default, if you provide an invalid configuration, the ucp-interlock service rolls back to a previous stable configuration. To configure the service to pause instead of rolling back:
```
docker service update \
--update-failure-action pause \
ucp-interlock
```

Update the ucp-interlock service to begin using the new configuration:

docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
ucp-interlock

Enable Interlock proxy NGINX debugging mode¶

Because Interlock proxy NGINX debugging mode generates copious log files and can produce core dumps, it is never enabled automatically. You must enable it manually.

Caution

Mirantis strongly recommends that you use debugging mode only for as long as is necessary, and that you do not use it in production environments.

Obtain the current configuration for the ucp-interlock service and save it as a TOML file named config.toml:

CURRENT_CONFIG_NAME=$(docker service inspect --format \
'{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
ucp-interlock) && docker config inspect --format \
'{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > config.toml

Add the ProxyArgs attribute to the config.toml file, if it is not already present, and assign to it the following value:
```
ProxyArgs = ["/entrypoint.sh","nginx-debug","-g","daemon off;"]
```

Create a new Docker configuration object from the config.toml file:

NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
docker config create $NEW_CONFIG_NAME config.toml

Update the ucp-interlock service to begin using the new configuration:

docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
ucp-interlock

Configuration file options for layer 7 routing¶

This topic describes the configuration options for the primary Interlock services.

For configuration instructions, see Configure layer 7 routing service.

Core configuration¶

The following core configuration options are available for the ucp-interlock service:

Option	Type	Description
`ListenAddr`	string	Address to serve the Interlock gRPC API. The default is `:8080`.
`DockerURL`	string	Path to the socket or TCP address to the Docker API. The default is `unix:///var/run/docker.sock`.
`TLSCACert`	string	Path to the CA certificate for connecting securely to the Docker API.
`TLSCert`	string	Path to the certificate for connecting securely to the Docker API.
`TLSKey`	string	Path to the key for connecting securely to the Docker API.
`AllowInsecure`	bool	A value of `true` skips TLS verification when connecting to the Docker API via TLS.
`PollInterval`	string	Interval to poll the Docker API for changes. The default is `3s`.
`EndpointOverride`	string	Override the default GRPC API endpoint for extensions. Swarm detects the default.
`Extensions`	[]extension	Refer to Extension configuration for the array of extensions.

Extension configuration¶

The following options are available to configure the extensions. Interlock must contain at least one extension to service traffic.

Option	Type	Description
`Image`	string	Name of the Docker image to use for the extension.
`Args`	[]string	Arguments to pass to the extension service.
`Labels`	map[string]string	Labels to add to the extension service.
`Networks`	[]string	Allows the administrator to cherry pick a list of networks that Interlock can connect to. If this option is not specified, the proxy service can connect to all networks.
`ContainerLabels`	map[string]string	Labels for the extension service tasks.
`Constraints`	[]string	One or more constraints to use when scheduling the extension service.
`PlacementPreferences`	[]string	One or more placement preferences.
`ServiceName`	string	Name of the extension service.
`ProxyImage`	string	Name of the Docker image to use for the proxy service.
`ProxyArgs`	[]string	Arguments to pass to the proxy service.
`ProxyLabels`	map[string]string	Labels to add to the proxy service.
`ProxyContainerLabels`	map[string]string	Labels to add to the proxy service tasks.
`ProxyServiceName`	string	Name of the proxy service.
`ProxyConfigPath`	string	Path in the service for the generated proxy configuration.
`ProxyReplicas`	uint	Number of proxy service replicas.
`ProxyStopSignal`	string	Stop signal for the proxy service. For example, `SIGQUIT`.
`ProxyStopGracePeriod`	string	Stop grace period for the proxy service in seconds. For example, `5s`.
`ProxyConstraints`	[]string	One or more constraints to use when scheduling the proxy service. Set the variable to `false`, as it is currently set to `true` by default.
`ProxyPlacementPreferences`	[]string	One or more placement preferences to use when scheduling the proxy service.
`ProxyUpdateDelay`	string	Delay between rolling proxy container updates.
`ServiceCluster`	string	Name of the cluster that this extension serves.
`PublishMode`	string (`ingress` or `host`)	Publish mode that the proxy service uses.
`PublishedPort`	int	Port on which the proxy service serves non-SSL traffic.
`PublishedSSLPort`	int	Port on which the proxy service serves SSL traffic.
`Template`	int	Docker configuration object that is used as the extension template.
`Config`	config	Proxy configuration used by the extensions as described in this section.
`HitlessServiceUpdate`	bool	When set to `true`, services can be updated without restarting the proxy container.
`ConfigImage`	config	Name for the config service used by hitless service updates. For example, `mirantis/ucp-interlock-config:3.2.1`.
`ConfigServiceName`	config	Name of the config service. This name is equivalent to `ProxyServiceName`. For example, `ucp-interlock-config`.

Proxy configuration¶

The proxy configuration options are made available to the extensions, and each extension uses the options that it needs to configure its proxy service. These options provide overrides to the extension configuration.

Because Interlock passes the extension configuration directly to the extension, each extension has different configuration options available.

The default proxy service used by MKE to provide layer 7 routing is NGINX. If users try to access a route that has not been configured, they will see the default NGINX 404 page.

You can customize this by labeling a service with com.docker.lb.default_backend=true. If users try to access a route that is not configured, they will be redirected to the custom service.

For details, see Create a proxy service.

Create a proxy service¶

If you want to customize the default NGINX proxy service used by MKE to provide layer 7 routing, follow the steps below to create an example proxy service where users will be redirected if they try to access a route that is not configured.

To create an example proxy service:

Create a docker-compose.yml file:

version: "3.2"

services:
  demo:
    image: httpd
    deploy:
      replicas: 1
      labels:
        com.docker.lb.default_backend: "true"
        com.docker.lb.port: 80
    networks:
      - demo-network

networks:
  demo-network:
    driver: overlay

Download and configure the client bundle and deploy the service:
```
docker stack deploy --compose-file docker-compose.yml demo
```
If users try to access a route that is not configured, they are directed to this demo service.
Optional. To minimize forwarding interruption to the updating service while updating a single replicated service, add the following line to the labels section of the docker-compose.yml file:
```
com.docker.lb.backend_mode: "vip"
```
And then update the existing service:
```
docker stack deploy --compose-file docker-compose.yml demo
```

Refer to Use service labels for information on how to set Interlock labels on services.

Configure host mode networking¶

Layer 7 routing components communicate with one another by default using overlay networks, but Interlock also supports host mode networking in a variety of ways, including proxy only, Interlock only, application only, and hybrid.

When using host mode networking, you cannot use DNS service discovery, since that functionality requires overlay networking. For services to communicate, each service needs to know the IP address of the node where the other service is running.

Note

Use an alternative to DNS service discovery such as Registrator if you require this functionality.

The following is a high-level overview of how to use host mode instead of overlay networking:

Update the ucp-interlock configuration.
Deploy your Swarm services.
Configure proxy services.

If you have not already done so, configure the layer 7 routing solution for production with the ucp-interlock-proxy service replicas running on their own dedicated nodes.

Update the ucp-interlock configuration¶

Update the PublishMode key in the ucp-interlock service configuration so that it uses host mode networking:
```
PublishMode = "host"
```
Update the ucp-interlock service to use the new Docker configuration so that it starts publishing its port on the host:
```
docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
--publish-add mode=host,target=8080 \
ucp-interlock
```
The ucp-interlock and ucp-interlock-extension services are now communicating using host mode networking.

Deploy Swarm services¶

This section describes how to deploy an example Swarm service on an eight-node cluster using host mode networking to route traffic without using overlay networks. The cluster has three manager nodes and five worker nodes, with two workers configured as dedicated ingress cluster load balancer nodes that will receive all application traffic.

This example does not cover the actual infrastructure deployment, and assumes you have a typical Swarm cluster created using docker swarm init and docker swarm join from the nodes.

Download and configure the client bundle.

Deploy an example Swarm demo service that uses host mode networking:

docker service create \
--name demo \
--detach=false \
--label com.docker.lb.hosts=app.example.org \
--label com.docker.lb.port=8080 \
--publish mode=host,target=8080 \
--env METADATA="demo" \
mirantiseng/docker-demo

This example allocates a high random port on the host where the service can be reached.

Test that the service works:
```
curl --header "Host: app.example.org" \
http://<proxy-address>:<routing-http-port>/ping
```
- <proxy-address> is the domain name or IP address of a node where the proxy service is running.
- <routing-http-port> is the port used to route HTTP traffic.
A properly working service produces a result similar to the following:
```
{"instance":"63b855978452", "version":"0.1", "request_id":"d641430be9496937f2669ce6963b67d6"}
```

Log in to one of the manager nodes and configure the load balancer worker nodes with node labels in order to pin the Interlock Proxy service:

docker node update --label-add nodetype=loadbalancer lb-00
lb-00
docker node update --label-add nodetype=loadbalancer lb-01
lb-01

Verify that the labels were successfully added to each node:

docker node inspect -f '{{ .Spec.Labels  }}' lb-00
map[nodetype:loadbalancer]
docker node inspect -f '{{ .Spec.Labels  }}' lb-01
map[nodetype:loadbalancer]

Create a configuration object for Interlock that specifies host mode networking:

cat << EOF | docker config create service.interlock.conf -
ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "3s"

[Extensions]
  [Extensions.default]
    Image = "mirantis/ucp-interlock-extension:3.6.16"
    Args = []
    ServiceName = "interlock-ext"
    ProxyImage = "mirantis/ucp-interlock-proxy:3.6.16"
    ProxyArgs = []
    ProxyServiceName = "interlock-proxy"
    ProxyConfigPath = "/etc/nginx/nginx.conf"
    ProxyReplicas = 1
    PublishMode = "host"
    PublishedPort = 80
    TargetPort = 80
    PublishedSSLPort = 443
    TargetSSLPort = 443
    [Extensions.default.Config]
      User = "nginx"
      PidPath = "/var/run/proxy.pid"
      WorkerProcesses = 1
      RlimitNoFile = 65535
      MaxConnections = 2048
EOF
oqkvv1asncf6p2axhx41vylgt

Create the Interlock service using host mode networking:

docker service create \
--name interlock \
--mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
--constraint node.role==manager \
--publish mode=host,target=8080 \
--config src=service.interlock.conf,target=/config.toml \
mirantis/ucp-interlock:3.6.16 -D run -c /config.toml
sjpgq7h621exno6svdnsvpv9z

Configure proxy services¶

You can use node labels to reconfigure the Interlock Proxy services to be constrained to the workers.

From a manager node, pin the proxy services to the load balancer worker nodes:

docker service update \
--constraint-add node.labels.nodetype==loadbalancer \
interlock-proxy

Deploy the application:

docker service create \
--name demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--publish mode=host,target=8080 \
--env METADATA="demo" \
mirantiseng/docker-demo

This runs the service using host mode networking. Each task for the service has a high port, such as 32768, and uses the node IP address to connect.

Inspect the headers from the request to verify that each task uses the node IP address to connect:

curl -vs -H "Host: demo.local" http://127.0.0.1/ping

Example of system response:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Fri, 10 Nov 2017 15:38:40 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 110
< Connection: keep-alive
< Set-Cookie: session=1510328320174129112; Path=/; Expires=Sat, 11 Nov 2017 15:38:40 GMT; Max-Age=86400
< x-request-id: e4180a8fc6ee15f8d46f11df67c24a7d
< x-proxy-id: d07b29c99f18
< x-server-info: interlock/2.0.0-preview (17476782) linux/amd64
< x-upstream-addr: 172.20.0.4:32768
< x-upstream-response-time: 1510328320.172
<
{"instance":"897d3c7b9e9c","version":"0.1","metadata":"demo","request_id":"e4180a8fc6ee15f8d46f11df67c24a7d"}
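
To confirm host mode from the response, look at the x-upstream-addr header: in host mode it carries a node IP address and the high published port, rather than an overlay address and the container port 8080. The following is a minimal sketch, assuming an IPv4 upstream and using the header line from the example response as sample input. The variable names are illustrative:

```shell
# Header line copied from the example response above.
# Against a live cluster, it could be captured with something like:
#   curl -vs -H "Host: demo.local" http://127.0.0.1/ping 2>&1 | grep -i 'x-upstream-addr'
header='< x-upstream-addr: 172.20.0.4:32768'

upstream=${header##*": "}       # strip everything up to and including ": "
upstream_ip=${upstream%:*}      # part before the last colon
upstream_port=${upstream##*:}   # part after the last colon

echo "upstream node IP: $upstream_ip, published port: $upstream_port"
```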

Configure NGINX¶

By default, NGINX is used as a proxy. The following configuration options are available for the NGINX extension.

Note

The ServerNamesHashBucketSize option, which allowed the user to manually set the bucket size for the server names hash table, was removed in MKE 3.4.2 because MKE now adaptively calculates the setting and overrides any manual input.

Option	Type	Description	Defaults
`User`	string	User name for the proxy	`nginx`
`PidPath`	string	Path to the PID file for the proxy service	`/var/run/proxy.pid`
`MaxConnections`	int	Maximum number of connections for the proxy service	`1024`
`ConnectTimeout`	int	Timeout in seconds for clients to connect	`600`
`SendTimeout`	int	Timeout in seconds for the service to send a request to the proxied upstream	`600`
`ReadTimeout`	int	Timeout in seconds for the service to read a response from the proxied upstream	`600`
`SSLOpts`	string	Options to be passed when configuring SSL	N/A
`SSLDefaultDHParam`	int	Size of DH parameters	`1024`
`SSLDefaultDHParamPath`	string	Path to DH parameters file	N/A
`SSLVerify`	string	SSL client verification	`required`
`WorkerProcesses`	int	Number of worker processes for the proxy service	`1`
`RLimitNoFile`	int	Maximum number of open files for the proxy service	`65535`
`SSLCiphers`	string	SSL ciphers to use for the proxy service	`HIGH:!aNULL:!MD5`
`SSLProtocols`	string	Enable the specified TLS protocols	`TLSv1.2`
`HideInfoHeaders`	bool	Hide proxy-related response headers	N/A
`KeepaliveTimeout`	string	Connection keep-alive timeout	`75s`
`ClientMaxBodySize`	string	Maximum allowed client request body size	`1m`
`ClientBodyBufferSize`	string	Buffer size for reading client request body	`8k`
`ClientHeaderBufferSize`	string	Buffer size for reading the client request header	`1k`
`LargeClientHeaderBuffers`	string	Maximum number and size of buffers used for reading large client request header	`4 8k`
`ClientBodyTimeout`	string	Timeout for reading client request body	`60s`
`UnderscoresInHeaders`	bool	Enables or disables the use of underscores in client request header fields	`false`
`UpstreamZoneSize`	int	Size of the shared memory zone (in KB)	`64`
`GlobalOptions`	[]string	List of options that are included in the global configuration	N/A
`HTTPOptions`	[]string	List of options that are included in the HTTP configuration	N/A
`TCPOptions`	[]string	List of options that are included in the stream (TCP) configuration	N/A
`AccessLogPath`	string	Path to use for access logs	`/dev/stdout`
`ErrorLogPath`	string	Path to use for error logs	`/dev/stdout`
`MainLogFormat`	string	Format to use for main logger	N/A
`TraceLogFormat`	string	Format to use for trace logger	N/A

See also

Kubernetes official documentation

Tune the proxy service¶

This topic describes how to tune various components of the proxy service.

Constrain the proxy service to multiple dedicated worker nodes:
```
<need-sme-instructions>
```
Adjust the stop signal and grace period, for example, to SIGTERM for the stop signal and ten seconds for the grace period:
```
docker service update --stop-signal=SIGTERM \
--stop-grace-period=10s interlock-proxy
```
Change the action that Swarm takes when an update fails using update-failure-action (the default is pause), for example, to roll back to the previous configuration:
```
docker service update --update-failure-action=rollback \
interlock-proxy
```
Change the amount of time between proxy updates using update-delay (the default is to use rolling updates), for example, setting the delay to thirty seconds:
```
docker service update --update-delay=30s interlock-proxy
```

Update Interlock services¶

This topic describes how to update Interlock services by first updating the Interlock configuration to specify the new extension or proxy image versions and then updating the Interlock services to use the new configuration and image.

To update Interlock services:

Create the new Interlock configuration:

docker config create service.interlock.conf.v2 <path-to-new-config>

Remove the old configuration and specify the new configuration:

docker service update --config-rm \
service.interlock.conf ucp-interlock
docker service update --config-add \
source=service.interlock.conf.v2,target=/config.toml \
ucp-interlock

Update the Interlock service to use the new image, for example, to pull the latest version of MKE:

docker pull mirantis/ucp:latest

Example output:

latest: Pulling from mirantis/ucp
cd784148e348: Already exists
3871e7d70c20: Already exists
cad04e4a4815: Pull complete
Digest: sha256:63ca6d3a6c7e94aca60e604b98fccd1295bffd1f69f3d6210031b72fc2467444
Status: Downloaded newer image for mirantis/ucp:latest
docker.io/mirantis/ucp:latest

List all of the latest MKE images:

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp images --list

Example output:

mirantis/ucp-agent:3.6.16
mirantis/ucp-auth-store:3.6.16
mirantis/ucp-auth:3.6.16
mirantis/ucp-azure-ip-allocator:3.6.16
mirantis/ucp-calico-cni:3.6.16
mirantis/ucp-calico-kube-controllers:3.6.16
mirantis/ucp-calico-node:3.6.16
mirantis/ucp-cfssl:3.6.16
mirantis/ucp-compose:3.6.16
mirantis/ucp-controller:3.6.16
mirantis/ucp-dsinfo:3.6.16
mirantis/ucp-etcd:3.6.16
mirantis/ucp-hyperkube:3.6.16
mirantis/ucp-interlock-extension:3.6.16
mirantis/ucp-interlock-proxy:3.6.16
mirantis/ucp-interlock:3.6.16
mirantis/ucp-kube-compose-api:3.6.16
mirantis/ucp-kube-compose:3.6.16
mirantis/ucp-kube-dns-dnsmasq-nanny:3.6.16
mirantis/ucp-kube-dns-sidecar:3.6.16
mirantis/ucp-kube-dns:3.6.16
mirantis/ucp-metrics:3.6.16
mirantis/ucp-pause:3.6.16
mirantis/ucp-swarm:3.6.16
mirantis/ucp:3.6.16

Start Interlock to verify the configuration object, which has the new extension version, and deploy a rolling update on all extensions:
```
docker service update \
--image mirantis/ucp-interlock:3.6.16 \
ucp-interlock
```

Routing traffic to services¶

Route traffic to a Swarm service¶

After Interlock is deployed, you can launch and publish services and applications. This topic describes how to configure services to publish themselves to the load balancer by using service labels.

Caution

The following procedures assume a DNS entry exists for each of the applications (or local hosts entry for local testing).

To publish a demo service with four replicas to the host (demo.local):

Create a Docker Service using the following two labels:
- com.docker.lb.hosts for Interlock to determine where the service is available.
- com.docker.lb.port for the proxy service to determine which port to use to access the upstreams.
Create an overlay network so that service traffic is isolated and secure:
```
docker network create -d overlay demo
1se1glh749q1i4pw0kf26mfx5
```

Deploy the application:

docker service create \
--name demo \
--network demo \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
mirantiseng/docker-demo
6r0wiglf5f3bdpcy6zesh1pzx

Interlock detects when the service is available and publishes it.

After tasks are running and the proxy service is updated, the application is available through http://demo.local:

curl -s -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"c2f1afe673d4","version":"0.1","request_id":"7bcec438af14f8875ffc3deab9215bc5"}

To increase service capacity, use the docker service scale command:
```
docker service scale demo=4
demo scaled to 4
```

The load balancer balances traffic across all four service replicas configured in this example.

To publish a service with a web interface:

This procedure deploys a simple service that includes the following:

A JSON endpoint that returns the ID of the task serving the request.
A web interface available at http://app.example.org that shows how many tasks the service is running.

Create a docker-compose.yml file that includes the following:

version: "3.2"

services:
  demo:
    image: mirantiseng/docker-demo
    deploy:
      replicas: 1
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo_demo-network
        com.docker.lb.port: 8080
    networks:
      - demo-network

networks:
  demo-network:
    driver: overlay

Label	Description
`com.docker.lb.hosts`	Defines the hostname for the service. When the layer 7 routing solution gets a request containing `app.example.org` in the host header, that request is forwarded to the demo service.
`com.docker.lb.network`	Defines which network the `ucp-interlock-proxy` should attach to in order to communicate with the demo service. To use layer 7 routing, you must attach your services to at least one network. If your service is attached to a single network, you do not need to add a label to specify which network to use for routing. When using a common stack file for multiple deployments leveraging MKE Interlock and layer 7 routing, prefix `com.docker.lb.network` with the stack name to ensure traffic is directed to the correct overlay network. In combination with `com.docker.lb.ssl_passthrough`, the label is mandatory even if your service is only attached to a single network.
`com.docker.lb.port`	Specifies which port the `ucp-interlock-proxy` service should use to communicate with this demo service. Your service does not need to expose a port in the Swarm routing mesh. All communications are done using the network that you have specified.

The ucp-interlock service detects that your service is using these labels and automatically reconfigures the ucp-interlock-proxy service.

Download and configure the client bundle and deploy the service:

docker stack deploy --compose-file docker-compose.yml demo

To test your services using the CLI:

Verify that requests are routed to the demo service:

curl --header "Host: app.example.org" \
http://<mke-address>:<routing-http-port>/ping

<mke-address> is the domain name or IP address of an MKE node.
<routing-http-port> is the port used to route HTTP traffic.

Example of a successful response:

{"instance":"63b855978452", "version":"0.1", "request_id":"d641430be9496937f2669ce6963b67d6"}

To test your services using a browser:

Because the demo service exposes an HTTP endpoint, you can also use your browser to validate that it works.

Verify that the /etc/hosts file in your system has an entry mapping app.example.org to the IP address of an MKE node.
Navigate to http://app.example.org in your browser.

Publish a service as a canary instance¶

This topic describes how to publish an initial or an updated service as a canary instance.

To publish a service as a canary instance:

Create an overlay network to isolate and secure service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create the initial service:

docker service create \
--name demo-v1 \
--network demo \
--detach=false \
--replicas=4 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--env METADATA="demo-version-1" \
mirantiseng/docker-demo

Interlock detects when the service is available and publishes it.

After tasks are running and the proxy service is updated, the application is available at http://demo.local:

curl -vs -H "Host: demo.local" http://127.0.0.1/ping

Example output:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to demo.local (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:28:26 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 120
< Connection: keep-alive
< Set-Cookie: session=1510172906715624280; Path=/; Expires=Thu, 09 Nov 2017 20:28:26 GMT; Max-Age=86400
< x-request-id: f884cf37e8331612b8e7630ad0ee4e0d
< x-proxy-id: 5ad7c31f9f00
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.4:8080
< x-upstream-response-time: 1510172906.714
<
{"instance":"df20f55fc943","version":"0.1","metadata":"demo-version-1","request_id":"f884cf37e8331612b8e7630ad0ee4e0d"}

The value of metadata is demo-version-1.

To deploy an updated service as a canary instance:

Deploy an updated service as a canary instance:

docker service create \
--name demo-v2 \
--network demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--env METADATA="demo-version-2" \
--env VERSION="0.2" \
mirantiseng/docker-demo

Because this has one replica and the initial version has four replicas, 20% of application traffic is sent to demo-version-2:

curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"23d9a5ec47ef","version":"0.1","metadata":"demo-version-1","request_id":"060c609a3ab4b7d9462233488826791c"}
curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"f42f7f0a30f9","version":"0.1","metadata":"demo-version-1","request_id":"c848e978e10d4785ac8584347952b963"}
curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1","request_id":"724c21d0fb9d7e265821b3c95ed08b61"}
curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"1b0d55ed3d2f","version":"0.2","metadata":"demo-version-2","request_id":"b86ff1476842e801bf20a1b5f96cf94e"}
curl -vs -H "Host: demo.local" http://127.0.0.1/ping
{"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1","request_id":"724c21d0fb9d7e265821b3c95ed08b61"}

Optional. Increase traffic to the new version by adding more replicas. For example:
```
docker service scale demo-v2=4
```
Example output:
```
demo-v2
```
Complete the upgrade by scaling the demo-v1 service to zero replicas:
```
docker service scale demo-v1=0
```
Example output:
```
demo-v1
```
This routes all application traffic to the new version. If you need to roll back your service, scale the v1 service back up and the v2 service back down.
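
One way to check the traffic split during the canary phase is to tally the metadata field across a handful of responses. The sketch below uses the five sample responses shown earlier in this procedure, trimmed to the relevant fields, as stand-in input. The count_version function is a hypothetical helper:

```shell
# Sample /ping responses from the example above, trimmed to the fields used here.
# Against a live cluster, collect them instead with something like:
#   responses=$(for i in 1 2 3 4 5; do curl -s -H "Host: demo.local" http://127.0.0.1/ping; echo; done)
responses='{"instance":"23d9a5ec47ef","version":"0.1","metadata":"demo-version-1"}
{"instance":"f42f7f0a30f9","version":"0.1","metadata":"demo-version-1"}
{"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1"}
{"instance":"1b0d55ed3d2f","version":"0.2","metadata":"demo-version-2"}
{"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1"}'

# Hypothetical helper: count the responses whose metadata field equals $1.
count_version() {
  printf '%s\n' "$responses" | grep -c "\"metadata\":\"$1\""
}

v1=$(count_version demo-version-1)
v2=$(count_version demo-version-2)
total=$(( v1 + v2 ))
echo "demo-version-1: $v1 of $total, demo-version-2: $v2 of $total"
```

With a sample this small, the observed split will not always match the expected 20% exactly. A larger number of requests gives a more reliable picture.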

Use context or path-based routing¶

This topic describes how to publish a service using context or path-based routing.

Create an overlay network to isolate and secure service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create the initial service:

docker service create \
--name demo \
--network demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.context_root=/app \
--label com.docker.lb.context_root_rewrite=true \
--env METADATA="demo-context-root" \
mirantiseng/docker-demo

Interlock detects when the service is available and publishes it.

Note

Interlock only supports one path per host for each service cluster. When a specific com.docker.lb.hosts label is applied, it cannot be applied again in the same service cluster.

After the tasks are running and the proxy service is updated, the application is available at http://demo.local:

curl -vs -H "Host: demo.local" http://127.0.0.1/app/

Example output:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /app/ HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Fri, 17 Nov 2017 14:25:17 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< x-request-id: 077d18b67831519defca158e6f009f82
< x-proxy-id: 77c0c37d2c46
< x-server-info: interlock/2.0.0-dev (732c77e7) linux/amd64
< x-upstream-addr: 10.0.1.3:8080
< x-upstream-response-time: 1510928717.306
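
The effect of com.docker.lb.context_root_rewrite=true on the request path can be pictured as stripping the context root before the request reaches the upstream. The following only illustrates that mapping under this assumption. It is not the NGINX rule that Interlock generates, and rewrite_path is a hypothetical helper:

```shell
# Hypothetical helper: map a request path to the path the upstream would see
# when the context root ($1) is rewritten to "/".
rewrite_path() {
  context_root=$1
  request_path=$2
  case "$request_path" in
    "$context_root")
      # The context root itself maps to "/".
      printf '/\n'
      ;;
    "$context_root"/*)
      # Strip the context root prefix and keep the remainder of the path.
      rest=${request_path#"$context_root"}
      printf '%s\n' "$rest"
      ;;
    *)
      # Paths outside the context root are left unchanged in this sketch.
      printf '%s\n' "$request_path"
      ;;
  esac
}

rewrite_path /app /app/
rewrite_path /app /app/ping
```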

Configure a routing mode¶

This topic describes how to publish services using the task and VIP backend routing modes.

Routing modes¶

The following table describes the two backend routing modes:

	Task mode	VIP mode
Default	yes	no
Traffic routing	Interlock uses backend task IPs to route traffic from the proxy to each container. Traffic to the front-end route is layer 7 load balanced directly to service tasks. This allows for routing functionality such as sticky sessions for each container. Task routing mode applies layer 7 routing and then sends packets directly to a container.	Interlock uses the Swarm service VIP as the backend IP instead of using container IPs. Traffic to the front-end route is layer 7 load balanced to the Swarm service VIP, which Layer 4 load balances to backend tasks. VIP mode is useful for reducing the amount of churn in Interlock proxy service configurations, which can be an advantage in highly dynamic environments. VIP mode optimizes for fewer proxy updates with the tradeoff of a reduced feature set. Most application updates do not require configuring backends in VIP mode. In VIP routing mode, Interlock uses the service VIP, which is a persistent endpoint that exists from service creation to service deletion, as the proxy backend. VIP routing mode applies Layer 7 routing and then sends packets to the Swarm Layer 4 load balancer, which routes traffic to service containers.
Canary deployments	In task mode, a canary service with one task next to an existing service with four tasks represents one out of five total tasks, so the canary will receive 20% of incoming requests.	Because VIP mode routes by service IP rather than by task IP, it affects the behavior of canary deployments. In VIP mode, a canary service with one task next to an existing service with four tasks will receive 50% of incoming requests, as it represents one out of two total services.
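
The canary percentages in the table follow from how many upstreams each mode exposes to the proxy. The following is a minimal sketch of that arithmetic, assuming requests are spread evenly across upstreams. The function names are illustrative:

```shell
# Task mode: every task is its own upstream, so the canary share
# follows the replica counts of the two services.
task_mode_share() {
  canary_tasks=$1
  stable_tasks=$2
  echo $(( canary_tasks * 100 / (canary_tasks + stable_tasks) ))
}

# VIP mode: each service contributes a single upstream (its VIP),
# so the share follows the number of services, not the replica counts.
vip_mode_share() {
  service_count=$1
  echo $(( 100 / service_count ))
}

echo "task mode: $(task_mode_share 1 4)%"
echo "VIP mode: $(vip_mode_share 2)%"
```

Both functions use integer division, so results are rounded down to a whole percentage.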

Specify a routing mode¶

You can set each service to use either the task or the VIP backend routing mode. Task mode is the default and is used if a label is not specified or if it is set to task.

Set the routing mode to VIP¶

Apply the following label to set the routing mode to VIP:
```
com.docker.lb.backend_mode=vip
```
Perform a proxy reconfiguration for the following two updates, as they create or remove a service VIP:
- Adding or removing a network on a service
- Deploying or deleting a service
Note

The following is a non-exhaustive list of application events that do not require proxy reconfiguration in VIP mode:
- Increasing or decreasing a service replica
- Deploying a new image
- Updating a configuration or secret
- Adding or removing a label
- Adding or removing an environment variable
- Rescheduling a failed application task

Publish a default host service¶

The following example publishes a service to be a default host. The service responds whenever a request is made to an unconfigured host.

Create an overlay network to isolate and secure the service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create the initial service:

docker service create \
--name demo-default \
--network demo \
--detach=false \
--replicas=1 \
--label com.docker.lb.default_backend=true \
--label com.docker.lb.port=8080 \
ehazlett/interlock-default-app

Interlock detects when the service is available and publishes it. After tasks are running and the proxy service is updated, the application is available at any URL that is not configured.

Publish a service using the VIP backend mode¶

Create an overlay network to isolate and secure the service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create the initial service:

docker service create \
--name demo \
--network demo \
--detach=false \
--replicas=4 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.backend_mode=vip \
--env METADATA="demo-vip-1" \
mirantiseng/docker-demo

Interlock detects when the service is available and publishes it.

After tasks are running and the proxy service is updated, the application is available at http://demo.local:

curl -vs -H "Host: demo.local" http://127.0.0.1/ping

Example output:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to demo.local (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:28:26 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 120
< Connection: keep-alive
< Set-Cookie: session=1510172906715624280; Path=/; Expires=Thu, 09 Nov 2017 20:28:26 GMT; Max-Age=86400
< x-request-id: f884cf37e8331612b8e7630ad0ee4e0d
< x-proxy-id: 5ad7c31f9f00
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.9:8080
< x-upstream-response-time: 1510172906.714
<
{"instance":"df20f55fc943","version":"0.1","metadata":"demo","request_id":"f884cf37e8331612b8e7630ad0ee4e0d"}

Using VIP mode causes Interlock to use the virtual IPs of the service for load balancing rather than using each task IP.

Inspect the service to see the VIPs, as in the following example:

"Endpoint": {
    "Spec": {
        "Mode": "vip"
    },
    "VirtualIPs": [
        {
            "NetworkID": "jed11c1x685a1r8acirk2ylol",
            "Addr": "10.0.2.9/24"
        }
    ]
}

In this example, Interlock configures a single upstream for the host using IP 10.0.2.9. Interlock skips further proxy updates as long as there is at least one replica for the service, as the only upstream is the VIP.
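
To read the VIPs without scanning the full inspect output, a Go template can be passed to docker service inspect. The following is a minimal sketch rather than part of the MKE tooling: the function names service_vips and strip_mask are hypothetical, and the template assumes the Endpoint.VirtualIPs structure shown in the example above.

```shell
# Print each virtual IP of a Swarm service on its own line.
# $1 = service name (for example, demo).
service_vips() {
    docker service inspect \
        --format '{{ range .Endpoint.VirtualIPs }}{{ println .Addr }}{{ end }}' \
        "$1"
}

# Remove the subnet mask from an address such as 10.0.2.9/24.
strip_mask() {
    printf '%s\n' "${1%/*}"
}

# Usage (requires a running service named demo):
#   strip_mask "$(service_vips demo | head -n 1)"
```

The address without the mask is the value to compare against x-upstream-addr in the curl output.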

Use service labels¶

Interlock uses service labels to configure how applications are published, to define the host names that are routed to the service, to define the applicable ports, and to define other routing configurations.

The following occurs when you deploy or update a Swarm service with service labels:

1. The ucp-interlock service monitors the Docker API for events and publishes the events to the ucp-interlock-extension service.
2. The ucp-interlock-extension service generates a new configuration for the proxy service based on the labels you have added to your services.
3. The ucp-interlock service takes the new configuration and reconfigures ucp-interlock-proxy to start using the new configuration.

This process occurs in milliseconds and does not interrupt services.

The following table lists the service labels that Interlock uses:

Label	Description	Example
`com.docker.lb.hosts`	Comma-separated list of the hosts for the service to serve.	`example.com, test.com`
`com.docker.lb.port`	Port to use for internal upstream communication.	`8080`
`com.docker.lb.network`	Name of the network for the proxy service to attach to for upstream connectivity.	`app-network-a`
`com.docker.lb.context_root`	Context or path to use for the application.	`/app`
`com.docker.lb.context_root_rewrite`	Changes the path from the value of label `com.docker.lb.context_root` to `/` when set to `true`.	`true`
`com.docker.lb.ssl_cert`	Docker secret to use for the SSL certificate.	`example.com.cert`
`com.docker.lb.ssl_key`	Docker secret to use for the SSL key.	`example.com.key`
`com.docker.lb.websocket_endpoints`	Comma-separated list of endpoints to be upgraded for websockets.	`/ws,/foo`
`com.docker.lb.service_cluster`	Name of the service cluster to use for the application.	`us-east`
`com.docker.lb.sticky_session_cookie`	Cookie to use for sticky sessions.	`app_session`
`com.docker.lb.redirects`	Semicolon-separated list of redirects to add in the format of `<source>, <target>`.	`http://old.example.com, http://new.example.com`
`com.docker.lb.ssl_passthrough`	Enables SSL passthrough when set to `true`.	`false`
`com.docker.lb.backend_mode`	Selects the backend mode that the proxy should use to access the upstreams. The default is `task`.	`vip`
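
Because Interlock reacts to service updates as well as to new deployments, labels can also be set on a service that is already running. The following is a minimal sketch that assumes an existing Swarm service; publish_service is a hypothetical helper name, and it sets only the host and port labels from the table.

```shell
# Add or change the Interlock host and port labels on an existing service.
# $1 = service name, $2 = comma-separated host names, $3 = upstream port.
publish_service() {
    docker service update \
        --label-add "com.docker.lb.hosts=$2" \
        --label-add "com.docker.lb.port=$3" \
        "$1"
}

# Usage:
#   publish_service demo example.com,test.com 8080
```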

Configure redirects¶

This topic describes how to publish a service with a redirect from old.local to new.local.

Note

Redirects do not work if a service is configured for TLS passthrough in the Interlock proxy.

Create an overlay network to isolate and secure service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create the service with the redirect:

docker service create \
--name demo \
--network demo \
--detach=false \
--label com.docker.lb.hosts=old.local,new.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.redirects=http://old.local,http://new.local \
--env METADATA="demo-new" \
mirantiseng/docker-demo

Interlock detects when the service is available and publishes it.

After tasks are running and the proxy service is updated, the application is available through http://new.local with a redirect configured that sends http://old.local to http://new.local:

curl -vs -H "Host: old.local" http://127.0.0.1

Example output:

* Rebuilt URL to: http://127.0.0.1/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> Host: old.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 19:06:27 GMT
< Content-Type: text/html
< Content-Length: 161
< Connection: keep-alive
< Location: http://new.local/
< x-request-id: c4128318413b589cafb6d9ff8b2aef17
< x-proxy-id: 48854cd435a4
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
<
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.13.6</center>
</body>
</html>

Service clusters¶

Reconfiguring the single proxy service that Interlock manages by default can take one to two seconds for each overlay network that the proxy manages. You can scale up to a larger number of Interlock-routed networks and services by implementing a service cluster. Service clusters use Interlock to manage multiple proxy services, each responsible for routing to a separate set of services and their corresponding networks, thereby minimizing proxy reconfiguration time.

Configure service clusters¶

This topic and the next assume that the following prerequisites have been met:

- You have an operational MKE cluster with at least two worker nodes (mke-node-0 and mke-node-1), which you will use as dedicated proxy servers for two independent Interlock service clusters.
- You have enabled Interlock with an HTTP port of 80 and an HTTPS port of 8443.

From a manager node, apply node labels to the MKE workers that you have chosen to use as your proxy servers:
```
docker node update --label-add nodetype=loadbalancer --label-add region=east mke-node-0
docker node update --label-add nodetype=loadbalancer --label-add region=west mke-node-1
```
In this example, mke-node-0 serves as the proxy for the east region and mke-node-1 serves as the proxy for the west region.

Create a dedicated overlay network for each region proxy to manage traffic:

docker network create --driver overlay eastnet
docker network create --driver overlay westnet

Modify the Interlock configuration to create two service clusters. Start by saving the current configuration to a file that you can edit:

CURRENT_CONFIG_NAME=$(docker service inspect --format \
'{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
ucp-interlock)
docker config inspect --format '{{ printf "%s" .Spec.Data }}' \
$CURRENT_CONFIG_NAME > old_config.toml

Create the following config.toml file that declares two service clusters, east and west:

ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
AllowInsecure = false
PollInterval = "3s"

[Extensions]
  [Extensions.east]
    Image = "mirantis/ucp-interlock-extension:3.2.3"
    ServiceName = "ucp-interlock-extension-east"
    Args = []
    Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
    ConfigImage = "mirantis/ucp-interlock-config:3.2.3"
    ConfigServiceName = "ucp-interlock-config-east"
    ProxyImage = "mirantis/ucp-interlock-proxy:3.2.3"
    ProxyServiceName = "ucp-interlock-proxy-east"
    ServiceCluster="east"
    Networks=["eastnet"]
    ProxyConfigPath = "/etc/nginx/nginx.conf"
    ProxyReplicas = 1
    ProxyStopSignal = "SIGQUIT"
    ProxyStopGracePeriod = "5s"
    ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.region==east"]
    PublishMode = "host"
    PublishedPort = 80
    TargetPort = 80
    PublishedSSLPort = 8443
    TargetSSLPort = 443
    [Extensions.east.Labels]
      "ext_region" = "east"
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.east.ContainerLabels]
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.east.ProxyLabels]
      "proxy_region" = "east"
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.east.ProxyContainerLabels]
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.east.Config]
      Version = ""
      HTTPVersion = "1.1"
      User = "nginx"
      PidPath = "/var/run/proxy.pid"
      MaxConnections = 1024
      ConnectTimeout = 5
      SendTimeout = 600
      ReadTimeout = 600
      IPHash = false
      AdminUser = ""
      AdminPass = ""
      SSLOpts = ""
      SSLDefaultDHParam = 1024
      SSLDefaultDHParamPath = ""
      SSLVerify = "required"
      WorkerProcesses = 1
      RLimitNoFile = 65535
      SSLCiphers = "HIGH:!aNULL:!MD5"
      SSLProtocols = "TLSv1.2"
      AccessLogPath = "/dev/stdout"
      ErrorLogPath = "/dev/stdout"
      MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t    '$status $body_bytes_sent \"$http_referer\" '\n\t\t    '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
      TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t    '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t    '\"$http_x_forwarded_for\" $reqid $msec $request_time '\n\t\t    '$upstream_connect_time $upstream_header_time $upstream_response_time';"
      KeepaliveTimeout = "75s"
      ClientMaxBodySize = "32m"
      ClientBodyBufferSize = "8k"
      ClientHeaderBufferSize = "1k"
      LargeClientHeaderBuffers = "4 8k"
      ClientBodyTimeout = "60s"
      UnderscoresInHeaders = false
      UpstreamZoneSize = 64
      ServerNamesHashBucketSize = 128
      GlobalOptions = []
      HTTPOptions = []
      TCPOptions = []
      HideInfoHeaders = false

  [Extensions.west]
    Image = "mirantis/ucp-interlock-extension:3.2.3"
    ServiceName = "ucp-interlock-extension-west"
    Args = []
    Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
    ConfigImage = "mirantis/ucp-interlock-config:3.2.3"
    ConfigServiceName = "ucp-interlock-config-west"
    ProxyImage = "mirantis/ucp-interlock-proxy:3.2.3"
    ProxyServiceName = "ucp-interlock-proxy-west"
    ServiceCluster="west"
    Networks=["westnet"]
    ProxyConfigPath = "/etc/nginx/nginx.conf"
    ProxyReplicas = 1
    ProxyStopSignal = "SIGQUIT"
    ProxyStopGracePeriod = "5s"
    ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.region==west"]
    PublishMode = "host"
    PublishedPort = 80
    TargetPort = 80
    PublishedSSLPort = 8443
    TargetSSLPort = 443
    [Extensions.west.Labels]
      "ext_region" = "west"
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.west.ContainerLabels]
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.west.ProxyLabels]
      "proxy_region" = "west"
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.west.ProxyContainerLabels]
      "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
    [Extensions.west.Config]
      Version = ""
      HTTPVersion = "1.1"
      User = "nginx"
      PidPath = "/var/run/proxy.pid"
      MaxConnections = 1024
      ConnectTimeout = 5
      SendTimeout = 600
      ReadTimeout = 600
      IPHash = false
      AdminUser = ""
      AdminPass = ""
      SSLOpts = ""
      SSLDefaultDHParam = 1024
      SSLDefaultDHParamPath = ""
      SSLVerify = "required"
      WorkerProcesses = 1
      RLimitNoFile = 65535
      SSLCiphers = "HIGH:!aNULL:!MD5"
      SSLProtocols = "TLSv1.2"
      AccessLogPath = "/dev/stdout"
      ErrorLogPath = "/dev/stdout"
      MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t    '$status $body_bytes_sent \"$http_referer\" '\n\t\t    '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
      TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t    '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t    '\"$http_x_forwarded_for\" $reqid $msec $request_time '\n\t\t    '$upstream_connect_time $upstream_header_time $upstream_response_time';"
      KeepaliveTimeout = "75s"
      ClientMaxBodySize = "32m"
      ClientBodyBufferSize = "8k"
      ClientHeaderBufferSize = "1k"
      LargeClientHeaderBuffers = "4 8k"
      ClientBodyTimeout = "60s"
      UnderscoresInHeaders = false
      UpstreamZoneSize = 64
      ServerNamesHashBucketSize = 128
      GlobalOptions = []
      HTTPOptions = []
      TCPOptions = []
      HideInfoHeaders = false

Note

Change all instances of the MKE version and *.ucp.InstanceID in the above to match your deployment.

Optional. Instead of creating the config.toml file from scratch, modify the configuration file that Interlock creates by default:
1. Replace [Extensions.default] with [Extensions.east].
2. Change ServiceName to "ucp-interlock-extension-east".
3. Change ConfigServiceName to "ucp-interlock-config-east".
4. Change ProxyServiceName to "ucp-interlock-proxy-east".
5. Add the "node.labels.region==east" constraint to the ProxyConstraints list.
6. Add the ServiceCluster="east" key directly below ProxyServiceName, at the same indentation.
7. Add the Networks=["eastnet"] key directly below ServiceCluster, at the same indentation. This list can contain as many overlay networks as you require. Interlock connects only to the specified networks, and it connects to all of them at startup.
8. Change PublishMode="ingress" to PublishMode="host".
9. Change the [Extensions.default.Labels] section title to [Extensions.east.Labels].
10. Add the "ext_region" = "east" key under the [Extensions.east.Labels] section.
11. Change the [Extensions.default.ContainerLabels] section title to [Extensions.east.ContainerLabels].
12. Change the [Extensions.default.ProxyLabels] section title to [Extensions.east.ProxyLabels].
13. Add the "proxy_region" = "east" key under the [Extensions.east.ProxyLabels] section.
14. Change the [Extensions.default.ProxyContainerLabels] section title to [Extensions.east.ProxyContainerLabels].
15. Change the [Extensions.default.Config] section title to [Extensions.east.Config].
16. Optional. Change ProxyReplicas=2 to ProxyReplicas=1. This is only necessary if there is a single node labeled as a proxy for each service cluster.
17. Configure your west service cluster by duplicating the entire [Extensions.east] block and changing all instances of east to west.
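
The section-title changes in the steps above (steps 1, 9, 11, 12, 14, and 15) are mechanical and can be scripted. The following is a sketch only: rename_default_sections is a hypothetical helper, it assumes that the saved file uses [Extensions.default...] headers exactly as the steps describe, and it leaves every other step (service names, constraints, ServiceCluster, Networks, PublishMode, and labels) to be edited by hand.

```shell
# Rename the [Extensions.default...] section headers for one service cluster.
# $1 = service cluster name; reads TOML on stdin and writes it to stdout.
rename_default_sections() {
    sed "s/\[Extensions\.default/[Extensions.$1/"
}

# Usage:
#   rename_default_sections east < old_config.toml > config.toml
```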

Create a new docker config object from the config.toml file:

NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( \
$(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
docker config create $NEW_CONFIG_NAME config.toml
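
The here-string (<<<) in the command above requires Bash. Where only a POSIX shell is available, the same increment can be sketched with parameter expansion. next_config_name is a hypothetical helper, and it assumes that the configuration name ends in a hyphen followed by a decimal revision number, as in com.docker.ucp.interlock.conf-1.

```shell
# Print the name of the next configuration revision.
# $1 = current configuration name, for example com.docker.ucp.interlock.conf-1.
next_config_name() {
    current=$1
    prefix=${current%-*}      # everything before the last hyphen
    revision=${current##*-}   # everything after the last hyphen
    printf '%s-%s\n' "$prefix" "$((revision + 1))"
}

# Usage:
#   NEW_CONFIG_NAME=$(next_config_name "$CURRENT_CONFIG_NAME")
```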

Update the ucp-interlock service to start using the new configuration:

docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
ucp-interlock

View your service clusters:
```
docker service ls
```
The following two proxy services will display: ucp-interlock-proxy-east and ucp-interlock-proxy-west.

Note

If only one proxy service displays, delete it using docker service rm and rerun docker service ls to display the two new proxy services.

Deploy services in separate service clusters¶

With your service clusters configured, you can now deploy services, routing to them with your new proxy services using the service_cluster label.

Create two example services:

docker service create --name demoeast \
--network eastnet \
--label com.docker.lb.hosts=demo.A \
--label com.docker.lb.port=8000 \
--label com.docker.lb.service_cluster=east \
training/whoami:latest

docker service create --name demowest \
--network westnet \
--label com.docker.lb.hosts=demo.B \
--label com.docker.lb.port=8000 \
--label com.docker.lb.service_cluster=west \
training/whoami:latest

Ping your whoami service on the mke-node-0 proxy server:
```
curl -H "Host: demo.A" http://<mke-node-0 public IP>
```
The response contains the container ID of the whoami container declared by the demoeast service.

The same curl command on mke-node-1 fails because that Interlock proxy only routes traffic to services with the service_cluster=west label, which are connected to the westnet Docker network that you listed in the configuration for that service cluster.

Ping your whoami service on the mke-node-1 proxy server:
```
curl -H "Host: demo.B" http://<mke-node-1 public IP>
```
The service routed by Host: demo.B is only reachable through the Interlock proxy mapped to port 80 on mke-node-1.

Remove a service cluster¶

In removing a service cluster, Interlock removes all of the services that are used internally to manage the service cluster, while leaving all of the user services intact. For continued function, however, you may need to update, modify, or remove the user services that remain. For instance:

- Any remaining user service that depends on functionality provided by the removed service cluster will need to be provisioned and managed by different means.
- All load balancing that is managed by the service cluster will no longer be available following its removal, and thus must be reconfigured.

Following the removal of the service cluster, all ports that were previously managed by the service cluster will once again be available. Also, any manually created networks will remain in place.

To remove a service cluster:

Obtain the current Interlock configuration file:

CURRENT_CONFIG_NAME=$(docker service inspect --format \
'{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
ucp-interlock)
docker config inspect --format '{{ printf "%s" .Spec.Data }}' \
$CURRENT_CONFIG_NAME > old_config.toml

Open the old_config.toml file.
Remove the subsection from [Extensions] that corresponds with the service cluster that you want to remove, but leave the [Extensions] section header itself in place. For example, remove the entire [Extensions.east] subsection from the config.toml file generated in Configure service clusters.

Create a new Docker config object from the old_config.toml file:

NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( \
$(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
docker config create $NEW_CONFIG_NAME old_config.toml

Update the ucp-interlock service to use the new configuration:

docker service update \
--config-rm $CURRENT_CONFIG_NAME \
--config-add source=$NEW_CONFIG_NAME,target=/config.toml \
ucp-interlock

Wait for two minutes, and then verify that Interlock has removed the services that were previously associated with the service cluster:
```
docker service ls
```

Use persistent sessions¶

This topic describes how to publish a service with a proxy that is configured for persistent sessions using either cookies or IP hashing. Persistent sessions are also known as sticky sessions.

Configure persistent sessions using cookies¶

Create an overlay network to isolate and secure service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create a service with the persistent session cookie:

docker service create \
--name demo \
--network demo \
--detach=false \
--replicas=5 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.sticky_session_cookie=session \
--label com.docker.lb.port=8080 \
--env METADATA="demo-sticky" \
mirantiseng/docker-demo

Interlock detects when the service is available and publishes it.

After tasks are running and the proxy service is updated, the application is configured to use persistent sessions and is available at http://demo.local:

curl -vs -c cookie.txt -b cookie.txt -H "Host: demo.local" http://127.0.0.1/ping

Example output:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
> Cookie: session=1510171444496686286
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:04:36 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 117
< Connection: keep-alive
* Replaced cookie session="1510171444496686286" for domain demo.local, path /, expire 0
< Set-Cookie: session=1510171444496686286
< x-request-id: 3014728b429320f786728401a83246b8
< x-proxy-id: eae36bf0a3dc
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.5:8080
< x-upstream-response-time: 1510171476.948
<
{"instance":"9c67a943ffce","version":"0.1","metadata":"demo-sticky","request_id":"3014728b429320f786728401a83246b8"}

The curl command stores Set-Cookie from the application and sends it with subsequent requests, which are pinned to the same instance. If you make multiple requests, the same x-upstream-addr is present in each.
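
One way to confirm the pinning is to send several requests and count the distinct upstream addresses. The following is a minimal sketch: distinct_upstreams is a hypothetical helper, and it assumes that the proxy returns the x-upstream-addr header as in the example output above.

```shell
# Count the distinct x-upstream-addr values in verbose curl output
# read from stdin.
distinct_upstreams() {
    grep -i '^< x-upstream-addr:' | sort -u | wc -l | tr -d ' '
}

# Usage: send five requests with the same cookie jar. curl -v writes the
# headers to stderr, hence the 2>&1 redirect.
#   for i in 1 2 3 4 5; do
#       curl -vs -c cookie.txt -b cookie.txt -H "Host: demo.local" \
#           http://127.0.0.1/ping 2>&1
#   done | distinct_upstreams
```

A result of 1 indicates that every request reached the same task.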

Configure persistent sessions using IP hashing¶

Using client IP hashing to configure persistent sessions is less flexible and less consistent than using cookies, but it provides a workaround for applications that cannot use cookies. To use IP hashing, you must reconfigure the Interlock proxy to use host mode networking, because the default ingress networking mode uses SNAT, which obscures client IP addresses.

Create an overlay network to isolate and secure service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create a service using IP hashing:

docker service create \
--name demo \
--network demo \
--detach=false \
--replicas=5 \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.ip_hash=true \
--env METADATA="demo-sticky" \
mirantiseng/docker-demo

Interlock detects when the service is available and publishes it.

After tasks are running and the proxy service is updated, the application is configured to use persistent sessions and is available at http://demo.local:

curl -vs -H "Host: demo.local" http://127.0.0.1/ping

Example output:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /ping HTTP/1.1
> Host: demo.local
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.6
< Date: Wed, 08 Nov 2017 20:04:36 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 117
< Connection: keep-alive
< x-request-id: 3014728b429320f786728401a83246b8
< x-proxy-id: eae36bf0a3dc
< x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
< x-upstream-addr: 10.0.2.5:8080
< x-upstream-response-time: 1510171476.948
<
{"instance":"9c67a943ffce","version":"0.1","metadata":"demo-sticky","request_id":"3014728b429320f786728401a83246b8"}

Optional. Add additional replicas:
```
docker service scale demo=10
```
Note

IP hashing for extensions creates a new upstream address when scaling replicas because the proxy uses the new set of replicas to determine where to pin the requests. When the upstreams are determined, a new “sticky” backend is selected as the dedicated upstream.
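
The effect described in this note can be illustrated with a toy model. The sketch below is not the hash that the proxy actually uses: pick_upstream is a hypothetical function that sums the octets of an IPv4 address and takes the remainder modulo the replica count, purely to show why the selected upstream can change when the number of replicas changes.

```shell
# Toy model of hash-based pinning (an assumption for illustration only).
# $1 = client IPv4 address, $2 = number of replicas.
# Prints the zero-based index of the replica that the client is pinned to.
pick_upstream() {
    ip=$1
    replicas=$2
    old_ifs=$IFS
    IFS=.
    set -- $ip          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 + $2 + $3 + $4) % replicas ))
}

# Usage:
#   pick_upstream 192.0.2.44 5    # prints 3
#   pick_upstream 192.0.2.44 10   # prints 8
```

The same client address maps to a different index after scaling from 5 to 10 replicas, which corresponds to the new "sticky" backend that the note describes.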

Secure services with TLS¶

MKE offers you two different methods for securing your services with Transport Layer Security (TLS): proxy-managed TLS and service-managed TLS.

Method	Description
Proxy-managed TLS	All traffic between users and the proxy is encrypted, but the traffic between the proxy and your Swarm service is not secure.
Service-managed TLS	The end-to-end traffic is encrypted and the proxy service allows TLS traffic to pass through unchanged.

Proxy-managed TLS¶

This topic describes how to deploy a Swarm service wherein the proxy manages the TLS connection. With proxy-managed TLS, the traffic between the proxy and the Swarm service is not encrypted, so use this option only if you trust that no one can monitor the internal traffic of the services that run in your datacenter.

To deploy a Swarm service with proxy-managed TLS:

Obtain a private key and certificate for the TLS connection. The Common Name (CN) in the certificate must match the name where your service will be available. Generate a self-signed certificate for app.example.org:
```
openssl req \
-new \
-newkey rsa:4096 \
-days 3650 \
-nodes \
-x509 \
-subj "/C=US/ST=CA/L=SF/O=Docker-demo/CN=app.example.org" \
-keyout app.example.org.key \
-out app.example.org.cert
```
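
To confirm that the generated certificate carries the expected CN, and to see when it expires, the certificate can be inspected with openssl x509. This is a minimal sketch; cert_summary is a hypothetical helper name.

```shell
# Print the subject (which includes the CN) and the expiry date of a
# PEM-encoded certificate.
# $1 = path to the certificate file.
cert_summary() {
    openssl x509 -in "$1" -noout -subject -enddate
}

# Usage:
#   cert_summary app.example.org.cert
```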

Create the following docker-compose.yml file:

version: "3.2"

services:
  demo:
    image: mirantiseng/docker-demo
    deploy:
      replicas: 1
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo-network
        com.docker.lb.port: 8080
        com.docker.lb.ssl_cert: demo_app.example.org.cert
        com.docker.lb.ssl_key: demo_app.example.org.key
    environment:
      METADATA: proxy-handles-tls
    networks:
      - demo-network

networks:
  demo-network:
    driver: overlay
secrets:
  app.example.org.cert:
    file: ./app.example.org.cert
  app.example.org.key:
    file: ./app.example.org.key

The demo service has labels specifying that the proxy service routes app.example.org traffic to this service. All traffic between the service and proxy occurs using the demo-network network. The service has labels that specify the Docker secrets used on the proxy service for terminating the TLS connection.

The private key and certificate are stored as Docker secrets, and thus you can readily scale the number of replicas used for running the proxy service, with MKE distributing the secrets to the replicas.

Download and configure the client bundle and deploy the service:

docker stack deploy --compose-file docker-compose.yml demo

Test that everything works correctly by updating your /etc/hosts file to map app.example.org to the IP address of an MKE node.
Optional. In a production deployment, create a DNS entry so that users can access the service using the domain name of your choice. After creating the DNS entry, access your service at https://<hostname>:<https-port>.
- hostname is the name you specified with the com.docker.lb.hosts label.
- https-port is the port you configured in the MKE settings.
Because this example uses self-signed certificates, client tools such as browsers display a warning that the connection is insecure.
Optional. Test that everything works using the CLI:
```
curl --insecure \
--resolve <hostname>:<https-port>:<mke-ip-address> \
https://<hostname>:<https-port>/ping
```
Example output:
```
{"instance":"f537436efb04","version":"0.1","request_id":"5a6a0488b20a73801aa89940b6f8c5d2"}
```
The proxy uses SNI to determine where to route traffic, so verify that your version of curl sends the server name (SNI) with insecure requests. Otherwise, curl displays the following error:
```
Server aborted the SSL handshake
```

Note

With the proxy-managed TLS method, you cannot update an expired certificate in place. Instead, create a new secret and then update the corresponding service to use it.

Service-managed TLS¶

This topic describes how to deploy a Swarm service wherein the service manages the TLS connection by encrypting traffic from users to your Swarm service.

Deploy your Swarm service using the following example docker-compose.yml file:

version: "3.2"

services:
  demo:
    image: mirantiseng/docker-demo
    command: --tls-cert=/run/secrets/cert.pem --tls-key=/run/secrets/key.pem
    deploy:
      replicas: 1
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo-network
        com.docker.lb.port: 8080
        com.docker.lb.ssl_passthrough: "true"
    environment:
      METADATA: end-to-end-TLS
    networks:
      - demo-network
    secrets:
      - source: app.example.org.cert
        target: /run/secrets/cert.pem
      - source: app.example.org.key
        target: /run/secrets/key.pem

networks:
  demo-network:
    driver: overlay
secrets:
  app.example.org.cert:
    file: ./app.example.org.cert
  app.example.org.key:
    file: ./app.example.org.key

This configuration gives the service the secrets that hold the private key and certificate, and labels the service with com.docker.lb.ssl_passthrough: "true", which configures the proxy service to pass TLS traffic for app.example.org through to the service.

Because the connection is encrypted end to end, the proxy service cannot add metadata such as version information or the request ID to the response headers.

Deploy services with mTLS enabled¶

Mutual Transport Layer Security (mTLS) is a process of mutual authentication in which both parties verify the identity of the other party, using a signed certificate.

You must have the following items to deploy services with mTLS:

- One or more CA certificates for signing the server and client certificates and keys.
- A signed certificate and key for the server.
- A signed certificate and key for the client.

To deploy a backend service with proxy-managed mTLS enabled:

Create a secret for the CA certificate that the client uses to authenticate the server.

Modify the docker-compose.yml file produced in Proxy-managed TLS:

Add the following label to the docker-compose.yml file:

com.docker.lb.client_ca_cert: demo_app.example.org.client-ca.cert

Add the CA certificate to the secrets: section of the docker-compose.yml file:

app.example.org.client-ca.cert:
  file: ./app.example.org.client-ca.cert

The resulting docker-compose.yml file is as follows:

version: "3.2"

services:
  demo:
    image: mirantiseng/docker-demo
    deploy:
      replicas: 1
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo-network
        com.docker.lb.port: 8080
        com.docker.lb.ssl_cert: demo_app.example.org.cert
        com.docker.lb.ssl_key: demo_app.example.org.key
        com.docker.lb.client_ca_cert: demo_app.example.org.client-ca.cert
    environment:
      METADATA: proxy-handles-tls
    networks:
      - demo-network

networks:
  demo-network:
    driver: overlay
secrets:
  app.example.org.cert:
    file: ./app.example.org.cert
  app.example.org.key:
    file: ./app.example.org.key
  app.example.org.client-ca.cert:
    file: ./app.example.org.client-ca.cert

Deploy the service:

docker stack deploy --compose-file docker-compose.yml demo

Test the mTLS-enabled service:

curl --insecure \
--resolve app.example.org:<mke-https-port>:<mke-ip-address> \
--cacert client_ca_cert.pem \
--cert client_cert.pem \
--key client_key.pem \
https://app.example.org:<mke-https-port>/ping

A successful deployment returns a JSON payload in plain text.

Note

Omitting --cacert, --cert, or --key from the cURL command returns an error message, as all three parameters are required.

Use websockets¶

This topic describes how to use websockets with Interlock.

Create an overlay network to isolate and secure service traffic:
```
docker network create -d overlay demo
```
Example output:
```
1se1glh749q1i4pw0kf26mfx5
```

Create the service with websocket endpoints:

docker service create \
--name demo \
--network demo \
--detach=false \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.websocket_endpoints=/ws \
ehazlett/websocket-chat

Interlock detects when the service is available and publishes it.

Note

You must have an entry for demo.local in your /etc/hosts file or use a routable domain.

Once tasks are running and the proxy service is updated, the application will be available at http://demo.local. Navigate to this URL in two different browser windows and notice that the text you enter in one window displays automatically in the other.

Deploy applications with Kubernetes¶

Use Kubernetes on Windows Server nodes¶

Observe the following prerequisites prior to using Kubernetes on Windows Server nodes.

Install MKE.
Create a single-node, Linux-only cluster.

Note

Running Kubernetes on Windows Server nodes is only supported on MKE 3.3.0 and later. If you want to run Kubernetes on Windows Server nodes on a cluster that is currently running an earlier version of MKE than 3.3.0, you must perform a fresh install of MKE 3.3.0 or later.

Add Windows Server nodes¶

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes and click Add Node.
Under NODE TYPE, select Windows. Windows Server nodes can only be workers.
Optional. Specify custom listen and advertise addresses by using the relevant slider.

Copy the command generated at the bottom of the Add Node page, which includes the join-token.

Example command:

docker swarm join \
--token SWMTKN-1-2is7c14ff43tq1g61ubc5egvisgilh6m8qxm6dndjzgov9qjme-4388n8bpyqivzudz4fidqm7ey \
172.31.2.154:2377

Add your Windows Server node to the MKE cluster by running the docker swarm join command copied in the previous step.

Validate your cluster¶

To validate your cluster using the MKE web UI:

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Nodes. A green circle indicates a healthy node state. All nodes should be green.
Change each node orchestrator to Kubernetes:
1. Click on the node.
2. In the upper-right corner, click the slider icon.
3. In the Role section of the Details tab, select Kubernetes under ORCHESTRATOR TYPE.
4. Click Save.
5. Repeat the above steps for each node.

To validate your cluster using the command line:

View the status of all the nodes in your cluster:

kubectl get nodes

Your nodes should all have a status value of Ready, as in the following example:

NAME                   STATUS   ROLES    AGE     VERSION
user-135716-win-0      Ready    <none>   2m16s   v1.17.2
user-7d985f-ubuntu-0   Ready    master   4m55s   v1.17.2-docker-d-2
user-135716-win-1      Ready    <none>   1m12s   v1.17.2

Change each node orchestrator to Kubernetes:

docker node update <node name> --label-add com.docker.ucp.orchestrator.kubernetes=true

Repeat the last step for each node.
Deploy a workload on your cluster to verify that everything works as expected.
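
The Ready check in the command-line validation can be automated by parsing the text that kubectl get nodes prints. The sketch below assumes the default column layout shown in the example output; the function name is hypothetical.

```python
def not_ready_nodes(kubectl_output: str) -> list[str]:
    """Return the names of nodes whose STATUS column is not exactly 'Ready'."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    header, rows = lines[0].split(), lines[1:]
    status_col = header.index("STATUS")
    return [row.split()[0] for row in rows if row.split()[status_col] != "Ready"]


sample = """\
NAME                   STATUS   ROLES    AGE     VERSION
user-135716-win-0      Ready    <none>   2m16s   v1.17.2
user-7d985f-ubuntu-0   Ready    master   4m55s   v1.17.2-docker-d-2
user-135716-win-1      Ready    <none>   1m12s   v1.17.2
"""
print(not_ready_nodes(sample))  # []
```

An empty list means that every node reports Ready. Any other status value, such as NotReady, causes the node name to be returned.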

Troubleshoot¶

If you cannot join your Windows Server node to the cluster, confirm that the correct processes are running on the node.

Verify that the calico-node process is operational:

PS C:\> Get-Process calico-node

Example output:

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    276      17    33284      40948      39.89   8132   0 calico-node

Verify that the kubelet process is operational:

PS C:\> Get-Process kubelet

Example output:

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    524      23    47332      73380     828.50   6520   0 kubelet

Verify that the kube-proxy process is operational:

PS C:\> Get-Process kube-proxy

Example output:

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    322      19    25464      33488      21.00   7852   0 kube-proxy

If any of the process verifications indicate a problem, review the container logs that bootstrap the Kubernetes components on the Windows node:

docker container logs (docker container ls --filter name=ucp-kubelet-win -q)
docker container logs (docker container ls --filter name=ucp-kube-proxy -q)
docker container logs (docker container ls --filter name=ucp-tigera-node-win -q)
docker container logs (docker container ls --filter name=ucp-tigera-felix-win -q)

Deploy a workload on Windows Server¶

The following procedure deploys a complete web application on IIS servers as Kubernetes Services. The example workload includes an MSSQL database and a load balancer. The procedure includes the following tasks:

Namespace creation
Pod and deployment scheduling
Kubernetes Service provisioning
Application workload deployment
Pod, Node, and Service configuration

Download and configure the client bundle.

Create the following namespace file:

demo-namespace.yaml¶

apiVersion: v1
kind: Namespace
metadata:
  name: demo

Create a namespace:
```
kubectl create -f demo-namespace.yaml
```

Create the following Windows web server file:

win-webserver.yaml¶

apiVersion: v1
kind: Service
metadata:
  name: win-webserver
  labels:
    app: win-webserver
  namespace: demo
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: win-webserver
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: win-webserver
  labels:
    app: win-webserver
  namespace: demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: win-webserver
  template:
    metadata:
      labels:
        app: win-webserver
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                  - win-webserver
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: windowswebserver
        image: mcr.microsoft.com/windows/servercore:ltsc2019
        command:
        - powershell.exe
        - -command
        - "<#code used from https://gist.github.com/wagnerandrade/5424431#> ; $$listener = New-Object System.Net.HttpListener ; $$listener.Prefixes.Add('http://*:80/') ; $$listener.Start() ; $$callerCounts = @{} ; Write-Host('Listening at http://*:80/') ; while ($$listener.IsListening) { ;$$context = $$listener.GetContext() ;$$requestUrl = $$context.Request.Url ;$$clientIP = $$context.Request.RemoteEndPoint.Address ;$$response = $$context.Response ;Write-Host '' ;Write-Host('> {0}' -f $$requestUrl) ;  ;$$count = 1 ;$$k=$$callerCounts.Get_Item($$clientIP) ;if ($$k -ne $$null) { $$count += $$k } ;$$callerCounts.Set_Item($$clientIP, $$count) ;$$ip=(Get-NetAdapter | Get-NetIpAddress); $$header='<html><body><H1>Windows Container Web Server</H1>' ;$$callerCountsString='' ;$$callerCounts.Keys | % { $$callerCountsString+='<p>IP {0} callerCount {1} ' -f $$ip[1].IPAddress,$$callerCounts.Item($$_) } ;$$footer='</body></html>' ;$$content='{0}{1}{2}' -f $$header,$$callerCountsString,$$footer ;Write-Output $$content ;$$buffer = [System.Text.Encoding]::UTF8.GetBytes($$content) ;$$response.ContentLength64 = $$buffer.Length ;$$response.OutputStream.Write($$buffer, 0, $$buffer.Length) ;$$response.Close() ;$$responseStatus = $$response.StatusCode ;Write-Host('< {0}' -f $$responseStatus)  } ; "
      nodeSelector:
        kubernetes.io/os: windows

Note

If the Windows nodes in your MKE cluster are Windows Server 2022, edit the image tag in the win-webserver.yaml file from ltsc2019 to ltsc2022.
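
The podAntiAffinity rule in win-webserver.yaml, keyed on kubernetes.io/hostname, means that no two win-webserver Pods can share a node, so two replicas need at least two Windows nodes. The following Python sketch models only that one constraint. It is not the Kubernetes scheduler, and the Pod names it generates are hypothetical.

```python
def place_replicas(replicas: int, nodes: list[str]) -> tuple[dict[str, str], list[str]]:
    """Assign each replica to a distinct node; replicas without a node stay pending."""
    placed: dict[str, str] = {}
    pending: list[str] = []
    free_nodes = list(nodes)
    for i in range(replicas):
        pod = f"win-webserver-{i}"
        if free_nodes:
            # Required anti-affinity: a node is used by at most one replica.
            placed[pod] = free_nodes.pop(0)
        else:
            pending.append(pod)
    return placed, pending


print(place_replicas(2, ["user-135716-win-0", "user-135716-win-1"]))
print(place_replicas(2, ["user-135716-win-0"]))
```

With two nodes, both replicas are placed. With a single node, the second replica has nowhere to go and remains pending.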

Create the web service:

kubectl create -f win-webserver.yaml

Expected output:

service/win-webserver created
deployment.apps/win-webserver created

Verify creation of the Kubernetes Service:

kubectl get service --namespace demo

Expected output:

NAME            TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
win-webserver   NodePort   10.96.29.12   <none>        80:35048/TCP   12m
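
In the PORT(S) column, the value before the colon is the Service port and the value after it is the node port that the cluster assigned. A small sketch, assuming a single port entry in the form shown in the expected output (the function name is hypothetical):

```python
def parse_port_mapping(ports: str) -> dict:
    """Split a value such as '80:35048/TCP' into its parts."""
    mapping, protocol = ports.split("/")
    port, _, node_port = mapping.partition(":")
    return {
        "port": int(port),
        # ClusterIP services have no node port, for example '443/TCP'.
        "node_port": int(node_port) if node_port else None,
        "protocol": protocol,
    }


print(parse_port_mapping("80:35048/TCP"))
# {'port': 80, 'node_port': 35048, 'protocol': 'TCP'}
```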

Review the pods deployed on your Windows Server worker nodes with inter-pod affinity and anti-affinity.

Note

After creating the web service, it may take several minutes for the pods to enter a ready state.

kubectl get pod --namespace demo

Expected output:

NAME                            READY   STATUS    RESTARTS   AGE
win-webserver-8c5678c68-qggzh   1/1     Running   0          6m21s
win-webserver-8c5678c68-v8p84   1/1     Running   0          6m21s

Review the detailed status of pods deployed:

kubectl describe pod win-webserver-8c5678c68-qggzh --namespace demo

From a kubectl client, access the web service using node-to-pod communication across the network:

kubectl get pods --namespace demo -o wide

Example output:

NAME                            READY   STATUS    RESTARTS   AGE   IP              NODE              NOMINATED NODE   READINESS GATES
win-webserver-8c5678c68-qggzh   1/1     Running   0          16m   192.168.77.68   user-135716-win-1 <none>           <none>
win-webserver-8c5678c68-v8p84   1/1     Running   0          16m   192.168.4.206   user-135716-win-0 <none>           <none>

SSH into the master node:

ssh -o ServerAliveInterval=15 root@<master-node>

Use curl to access the web service by way of the CLUSTER-IP listed for the win-webserver service.

curl 10.96.29.12

Example output:

<html><body><H1>Windows Container Web Server</H1><p>IP 192.168.77.68 callerCount 1 </body></html>

Run the curl command a second time. You can see the second request load-balanced to a different pod:

curl 10.96.29.12

Example output:

<html><body><H1>Windows Container Web Server</H1><p>IP 192.168.4.206 callerCount 1 </body></html>

From a kubectl client, access the web service using pod-to-pod communication across the network:

kubectl get service --namespace demo

Example output:

NAME            TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
win-webserver   NodePort   10.96.29.12   <none>        80:35048/TCP   12m

Review the pod status:

kubectl get pods --namespace demo -o wide

Example output:

NAME                            READY   STATUS    RESTARTS   AGE   IP              NODE              NOMINATED NODE   READINESS GATES
win-webserver-8c5678c68-qggzh   1/1     Running   0          16m   192.168.77.68   user-135716-win-1 <none>           <none>
win-webserver-8c5678c68-v8p84   1/1     Running   0          16m   192.168.4.206   user-135716-win-0 <none>           <none>

Exec into the web service:

kubectl exec -it win-webserver-8c5678c68-qggzh --namespace demo -- cmd

Example output:

Microsoft Windows [Version 10.0.17763.1098]
(c) 2018 Microsoft Corporation. All rights reserved.

Use curl to access the web service:

C:\>curl 10.96.29.12

Example output:

<html><body><H1>Windows Container Web Server</H1><p>IP 192.168.77.68
callerCount 1 <p>IP 192.168.77.68 callerCount 1 </body></html>
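
The curl outputs in this procedure report a callerCount of 1 even on repeated requests because each Pod keeps its own counter table, and requests are spread across Pods. The following Python sketch models that behavior. It assumes strict alternation between the two Pods, which the cluster does not guarantee, and the client address 10.0.0.1 is hypothetical.

```python
from collections import defaultdict
from itertools import cycle


class Pod:
    """A web server replica that counts requests per client IP."""

    def __init__(self, ip: str):
        self.ip = ip
        self.caller_counts = defaultdict(int)  # private to this Pod

    def handle(self, client_ip: str) -> str:
        self.caller_counts[client_ip] += 1
        return f"IP {self.ip} callerCount {self.caller_counts[client_ip]}"


# Assumption: requests alternate between the two Pods.
pods = cycle([Pod("192.168.77.68"), Pod("192.168.4.206")])
for _ in range(3):
    print(next(pods).handle("10.0.0.1"))
```

The first two requests each reach a different Pod and report a count of 1. Only the third request, which returns to the first Pod, reports a count of 2.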

Access Kubernetes resources¶

Using the MKE web UI left-side navigation panel, under Kubernetes, you can access the following Kubernetes resources:

Kubernetes menu item	Kubernetes resources
Namespaces	Namespaces
Service Accounts	Service accounts
Controllers	Deployments, ReplicaSet, DaemonSet, StatefulSet, Job, CronJobs
Load Balancers	Pods
Configurations	Secrets, ResourceQuota, PodSecurityPolicy, NetworkSecurityPolicy, ConfigMap, LimitRange
Storage	PersistentVolumes, PersistentVolumeClaims, StorageClasses

Deploy a workload to a Kubernetes cluster¶

MKE supports using both the web UI and the CLI to deploy your Kubernetes YAML files.

Deploy a workload using the MKE web UI¶

This example defines a Kubernetes deployment object for an NGINX server.

Deploy an NGINX server¶

Log in to the MKE web UI.
In the left-side navigation menu, navigate to Kubernetes and click Create.
In the Namespace drop-down, select default.

Paste the following configuration details in the Object YAML editor:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

This YAML file specifies an earlier version of NGINX, which you will update in a later section of this topic.

Click Create.
Navigate to Kubernetes > Namespaces, hover over the default namespace, and select Set Context.

Inspect the deployment¶

You can review the status of your deployment in the Kubernetes section of the left-side navigation panel.

In the left-side navigation panel, navigate to Kubernetes > Controllers to review the resource controllers created for the NGINX server.
Click the nginx-deployment controller.
To review the values used to create the deployment, click the slider icon in the upper right corner.
In the left-side navigation panel, navigate to Kubernetes > Pods to review the Pods that are provisioned for the NGINX server.
Click one of the Pods.
In the Overview tab, review the Pod phase, IP address, and other properties.

Expose the server¶

The NGINX server is operational, but it is not accessible from outside of the cluster. Create a YAML file to add a NodePort service, which exposes the server on a specified port.

In the left-side navigation menu, navigate to Kubernetes and click Create.
In the Namespace drop-down, select default.

Paste the following configuration details in the Object YAML editor:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32768
  selector:
    app: nginx

The service connects internal port 80 of the cluster to the external port 32768.

Click Create, and the Services page opens.
Select the nginx service and in the Overview tab, scroll to the Ports section.
To review the default NGINX page, navigate to <node-ip>:<nodeport> in your browser.

Note

To display the NGINX page, you may need to add a rule in your cloud provider firewall settings to allow inbound traffic on the port specified in the YAML file.

The YAML definition connects the service to the NGINX server using the app label nginx and a corresponding label selector.
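
The matching rule between the Service selector and the Pod labels can be sketched in a few lines of Python. This covers only equality-based selectors such as the one in the example; the function name and the pod-template-hash value are hypothetical.

```python
def selector_matches(selector: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """True when every key/value pair in the selector appears in the Pod labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


service_selector = {"app": "nginx"}
print(selector_matches(service_selector, {"app": "nginx", "pod-template-hash": "abc123"}))  # True
print(selector_matches(service_selector, {"app": "other"}))                                 # False
```

Extra labels on the Pod do not prevent a match, which is why the Pods created by nginx-deployment are selected even though the Deployment controller adds labels of its own.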

Update the deployment¶

MKE supports updating an existing deployment by applying an updated YAML file. In this example, you will scale the server up to four replicas and update NGINX to a later version.

In the left-side navigation panel, navigate to Kubernetes > Controllers and select nginx-deployment.
To edit the deployment, click the gear icon in the upper right corner.
Update the number of replicas from 2 to 4.
Update the value of image from nginx:1.7.9 to nginx:1.8.
Click Save to update the deployment with the new configuration settings.
To review the newly-created replicas, in the left-side navigation panel, navigate to Kubernetes > Pods.

The content of the updated YAML file is as follows:

...
spec:
  progressDeadlineSeconds: 600
  replicas: 4
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.8
...
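
The maxSurge and maxUnavailable percentages in the strategy section are converted to Pod counts during a rolling update: Kubernetes rounds maxSurge up and maxUnavailable down. The following sketch works through that arithmetic for percentage values only; the function name is hypothetical.

```python
def rolling_update_bounds(replicas: int, max_surge_pct: int, max_unavailable_pct: int):
    """Return (maximum total Pods, minimum available Pods) during a rolling update."""
    surge = -(-replicas * max_surge_pct // 100)          # percentage rounded up
    unavailable = replicas * max_unavailable_pct // 100  # percentage rounded down
    return replicas + surge, replicas - unavailable


print(rolling_update_bounds(4, 25, 25))  # (5, 3)
```

For the four-replica deployment above, at most five Pods exist at any moment during the update, and at least three remain available.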

Deploy a workload using the CLI¶

MKE supports deploying your Kubernetes objects on the command line using kubectl.

Deploy an NGINX server¶

Download and configure the client bundle.

Create a file called deployment.yaml that contains the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
      nodeSelector:
        kubernetes.io/os: linux
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 32768
  selector:
    app: nginx

Deploy the NGINX server:
```
kubectl apply -f deployment.yaml
```
Use the describe deployment option to review the deployment:
```
kubectl describe deployment nginx-deployment
```

Update the deployment¶

Update an existing deployment by applying an updated YAML file.

Increase the number of replicas to 4:

kubectl scale --replicas=4 deployment/nginx-deployment

Update the NGINX version to 1.8:

kubectl set image deployment/nginx-deployment nginx=nginx:1.8

Alternatively, make the same two changes in a copy of deployment.yaml saved as update.yaml, and apply the updated file:
```
kubectl apply -f update.yaml
```

Verify that the deployment was scaled up successfully by listing the deployments in the cluster:

kubectl get deployments

Expected output:

NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment       4         4         4            4           2d

Verify that the pods are running the updated image:

kubectl describe deployment nginx-deployment | grep -i image

Expected output:

Image:        nginx:1.8

Policy enforcement¶

For purposes of policy enforcement, Mirantis currently supports the use of either OPA Gatekeeper or Pod Security Policies (PSPs). Mirantis recommends, however, that you use OPA Gatekeeper, as PSPs are deprecated in MKE and will eventually be removed from the product.

Deploy OPA Gatekeeper¶

Open Policy Agent (OPA) is an open source policy engine that facilitates policy-based control for cloud native environments. OPA introduces a high-level declarative language called Rego that decouples policy decisions from enforcement.

The OPA Constraint Framework introduces two primary resources: constraint templates and constraints.

Constraint templates: OPA policy definitions, written in Rego
Constraints: The application of a constraint template to a given set of objects

Gatekeeper uses the Kubernetes API to integrate OPA into Kubernetes. Policies are defined in the form of Kubernetes CustomResourceDefinitions (CRDs) and are enforced with custom admission controller webhooks. These CRDs define constraint templates and constraints on the API server. Any time a request to create, delete, or update a resource is sent to the Kubernetes cluster API server, Gatekeeper validates that resource against the predefined policies. Gatekeeper also audits preexisting resource constraint violations against newly defined policies.

Using OPA Gatekeeper, you can enforce a wide range of policies against your Kubernetes cluster. Policy examples include:

Container images can only be pulled from a set of whitelisted repositories.
New resources must be appropriately labeled.
Deployments must specify a minimum number of replicas.
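
To make the three example policies concrete, the following Python sketch evaluates them against a Deployment manifest. In Gatekeeper, such checks are written in Rego inside constraint templates; Python is used here only to show the logic. The allowlist, the required label, and the minimum replica count are hypothetical values.

```python
ALLOWED_REPOS = ("registry.example.com/",)   # hypothetical allowlist of repositories
REQUIRED_LABELS = {"owner"}                  # hypothetical required label
MIN_REPLICAS = 2                             # hypothetical minimum


def violations(deployment: dict) -> list[str]:
    """Return a message for each example policy that the Deployment breaks."""
    found = []
    labels = deployment.get("metadata", {}).get("labels", {})
    missing = REQUIRED_LABELS - set(labels)
    if missing:
        found.append(f"missing labels: {sorted(missing)}")
    spec = deployment.get("spec", {})
    if spec.get("replicas", 1) < MIN_REPLICAS:
        found.append(f"fewer than {MIN_REPLICAS} replicas")
    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for container in containers:
        image = container.get("image", "")
        if not image.startswith(ALLOWED_REPOS):
            found.append(f"image not from an allowed repository: {image}")
    return found


nginx = {
    "metadata": {"name": "nginx-deployment"},
    "spec": {
        "replicas": 2,
        "template": {"spec": {"containers": [{"name": "nginx", "image": "nginx:1.7.9"}]}},
    },
}
for message in violations(nginx):
    print(message)
```

Under these assumed policies, the example Deployment is rejected for a missing owner label and for an image that does not come from the allowlisted repository, while its replica count is acceptable.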

Note

By design, when OPA Gatekeeper is disabled using the configuration file, the policies are not cleaned up. Thus, when OPA Gatekeeper is re-enabled, the cluster can immediately adopt the existing policies.

The retention of the policies poses no risk, as they are merely data on the API server and have no value outside of an OPA Gatekeeper deployment.

The following topics offer installation instructions and an example use case.

Install OPA Gatekeeper¶

Important

If you are currently using Pod Security Policies for policy enforcement, Mirantis recommends that you disable PSPs in MKE prior to installing OPA Gatekeeper.

To install OPA Gatekeeper, update the MKE configuration file.

Obtain the current MKE configuration file for your cluster.
Set the cluster_config.policy_enforcement.gatekeeper.enabled configuration parameter to "true". For more information on Gatekeeper configuration options, refer to cluster_config.policy_enforcement.gatekeeper.
Optional. Exclude resources that are contained in a specified set of namespaces by assigning a comma-separated list of namespaces to the cluster_config.policy_enforcement.gatekeeper.excluded_namespaces configuration parameter.

Caution

Do not add namespaces that do not yet exist in the cluster to the excluded_namespaces list.

Upload the newly modified MKE configuration file. Be aware that the upload requires a wait time of approximately five minutes.

Verify the successful installation of Gatekeeper by running the following commands in sequence:

Verify that the gatekeeper-system namespace was created:

kubectl get ns gatekeeper-system

Expected output:

NAME                STATUS   AGE
gatekeeper-system   Active   1m

Verify the deployments in the gatekeeper-system namespace:

kubectl get deployment -n gatekeeper-system

Expected output:

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
gatekeeper-audit                1/1     1            1           1m
gatekeeper-controller-manager   3/3     3            3           1m

Verify that gatekeeper-webhook-service was created:

kubectl get service -n gatekeeper-system

Expected output:

NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
gatekeeper-webhook-service   ClusterIP   10.96.143.125   <none>        443/TCP   1m

Verify that the correct CustomResourceDefinitions were created:

kubectl get crd

Expected output:

NAME                                                 CREATED AT
assign.mutations.gatekeeper.sh                       2022-08-01T06:25:12Z
assignmetadata.mutations.gatekeeper.sh               2022-08-01T06:25:12Z
configs.config.gatekeeper.sh                         2022-08-01T06:25:12Z
constraintpodstatuses.status.gatekeeper.sh           2022-08-01T06:25:12Z
constrainttemplatepodstatuses.status.gatekeeper.sh   2022-08-01T06:25:12Z
constrainttemplates.templates.gatekeeper.sh          2022-08-01T06:25:12Z
modifyset.mutations.gatekeeper.sh                    2022-08-01T06:25:12Z
mutatorpodstatuses.status.gatekeeper.sh              2022-08-01T06:25:12Z
providers.externaldata.gatekeeper.sh                 2022-08-01T06:25:12Z

Verify exempted namespaces, if applicable:

kubectl describe ns kube-system gatekeeper-system

Expected output:

Name:         kube-system
Labels:       admission.gatekeeper.sh/ignore=exempted-by-mke
              kubernetes.io/metadata.name=kube-system
Annotations:  <none>
Status:       Active

No resource quota.

No LimitRange resource.


Name:         gatekeeper-system
Labels:       admission.gatekeeper.sh/ignore=no-self-managing
              control-plane=controller-manager
              gatekeeper.sh/system=yes
              kubernetes.io/metadata.name=gatekeeper-system
Annotations:  <none>
Status:       Active

Resource Quotas
  Name:     gatekeeper-critical-pods
  Resource  Used  Hard
  --------  ---   ---
  pods      4     100

No LimitRange resource.

Use OPA Gatekeeper¶

To illustrate how to create OPA Gatekeeper policies, this topic walks through an example policy that restricts escalation to root privileges.

Note

Gatekeeper provides a library of commonly used policies, including replacements for familiar PodSecurityPolicies.

Important

For users who are new to Gatekeeper, Mirantis recommends performing a dry run on potential policies prior to production deployment. Such an approach, by only auditing violations, will prevent potential cluster disruption. To perform a dry run, set spec.enforcementAction to dryrun in the constraint.yaml detailed herein.

Create a YAML file called template.yaml and place the following code in that file:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8spspallowprivilegeescalationcontainer
  annotations:
    description: >-
      Controls restricting escalation to root privileges. Corresponds to the
      `allowPrivilegeEscalation` field in a PodSecurityPolicy. For more
      information, see
      https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation
spec:
  crd:
    spec:
      names:
        kind: K8sPSPAllowPrivilegeEscalationContainer
      validation:
        openAPIV3Schema:
          type: object
          description: >-
            Controls restricting escalation to root privileges. Corresponds to the
            `allowPrivilegeEscalation` field in a PodSecurityPolicy. For more
            information, see
            https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation
          properties:
            exemptImages:
              description: >-
                Any container that uses an image that matches an entry in this list will be excluded
                from enforcement. Prefix-matching can be signified with `*`. For example: `my-image-*`.

                It is recommended that users use the fully-qualified Docker image name (e.g. start with a domain name)
                in order to avoid unexpectedly exempting images from an untrusted repository.
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spspallowprivilegeescalationcontainer

        import data.lib.exempt_container.is_exempt

        violation[{"msg": msg, "details": {}}] {
            c := input_containers[_]
            not is_exempt(c)
            input_allow_privilege_escalation(c)
            msg := sprintf("Privilege escalation container is not allowed: %v", [c.name])
        }

        input_allow_privilege_escalation(c) {
            not has_field(c, "securityContext")
        }
        input_allow_privilege_escalation(c) {
            not c.securityContext.allowPrivilegeEscalation == false
        }
        input_containers[c] {
            c := input.review.object.spec.containers[_]
        }
        input_containers[c] {
            c := input.review.object.spec.initContainers[_]
        }
        input_containers[c] {
            c := input.review.object.spec.ephemeralContainers[_]
        }
        # has_field returns whether an object has a field
        has_field(object, field) = true {
            object[field]
        }
      libs:
        - |
          package lib.exempt_container

          is_exempt(container) {
              exempt_images := object.get(object.get(input, "parameters", {}), "exemptImages", [])
              img := container.image
              exemption := exempt_images[_]
              _matches_exemption(img, exemption)
          }

          _matches_exemption(img, exemption) {
              not endswith(exemption, "*")
              exemption == img
          }

          _matches_exemption(img, exemption) {
              endswith(exemption, "*")
              prefix := trim_suffix(exemption, "*")
              startswith(img, prefix)
          }
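
To help you read the Rego in template.yaml, the following Python sketch restates the same decision logic: a container is reported unless it explicitly sets allowPrivilegeEscalation to false or its image matches an exemption. This is a reading aid under that interpretation, not code that Gatekeeper runs, and the function names are hypothetical.

```python
from typing import Sequence


def matches_exemption(image: str, exemption: str) -> bool:
    """Exact match, or prefix match when the exemption ends with '*'."""
    if exemption.endswith("*"):
        return image.startswith(exemption[:-1])
    return image == exemption


def is_exempt(container: dict, exempt_images: Sequence[str]) -> bool:
    image = container.get("image")
    return image is not None and any(matches_exemption(image, e) for e in exempt_images)


def allows_privilege_escalation(container: dict) -> bool:
    """Anything other than an explicit 'allowPrivilegeEscalation: false' counts."""
    security_context = container.get("securityContext") or {}
    return security_context.get("allowPrivilegeEscalation") is not False


def privilege_escalation_violations(pod: dict, exempt_images: Sequence[str] = ()) -> list[str]:
    spec = pod["spec"]
    containers = (
        spec.get("containers", [])
        + spec.get("initContainers", [])
        + spec.get("ephemeralContainers", [])
    )
    return [
        f"Privilege escalation container is not allowed: {c['name']}"
        for c in containers
        if not is_exempt(c, exempt_images) and allows_privilege_escalation(c)
    ]


pod = {"spec": {"containers": [{"name": "nginx", "image": "nginx"}]}}
print(privilege_escalation_violations(pod))
# ['Privilege escalation container is not allowed: nginx']
```

Note that a container with no securityContext at all is reported, because the policy requires the field to be set to false explicitly.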

Create the constraint template:

kubectl create -f template.yaml

Expected output:

constrainttemplate.templates.gatekeeper.sh/k8spspallowprivilegeescalationcontainer created

Create a YAML file called constraint.yaml and place the following code in that file:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPAllowPrivilegeEscalationContainer
metadata:
  name: psp-allow-privilege-escalation-container
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]

Create the constraint:

kubectl create -f constraint.yaml

Expected output:

k8spspallowprivilegeescalationcontainer.constraints.gatekeeper.sh/psp-allow-privilege-escalation-container created

Create a YAML file called disallowed-pod.yaml and place the following code in that file:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-privilege-escalation-disallowed
  labels:
    app: nginx-privilege-escalation
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      allowPrivilegeEscalation: true

Create the Pod:

kubectl create -f disallowed-pod.yaml

Expected output:

Error from server (Forbidden): error when creating "disallowed-pod.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [psp-allow-privilege-escalation-container] Privilege escalation container is not allowed: nginx

Create a YAML file called allowed-pod.yaml and place the following code in that file:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-privilege-escalation-allowed
  labels:
    app: nginx-privilege-escalation
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      allowPrivilegeEscalation: false

Create the Pod:

kubectl create -f allowed-pod.yaml

Expected output:

pod/nginx-privilege-escalation-allowed created

Use Pod Security Policies¶

Pod Security Policies (PSPs) are default-enabled cluster-level resources. There are two default PSPs in MKE: a privileged policy and an unprivileged policy. Administrators of the cluster can enforce additional policies and apply them to users and teams for further control of what runs in the Kubernetes cluster. This topic describes the two default policies and provides two example use cases for custom policies.

Configure Kubernetes access for PSPs¶

To interact with PSPs, a user must have access to the PodSecurityPolicy object in Kubernetes role-based access control (RBAC). MKE admins automatically have access to this object.

To grant regular users access to the PodSecurityPolicy object, an MKE admin must create the following ClusterRole and ClusterRoleBinding and assign them to the required users:

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp-admin
rules:
- apiGroups:
  - extensions
  resources:
  - podsecuritypolicies
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
EOF

USER=<user-name>

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp-admin:$USER
roleRef:
  kind: ClusterRole
  name: psp-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: User
  name: $USER
EOF
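
If you need to grant access to several users, you can generate one ClusterRoleBinding per user instead of editing the manifest by hand. The following Python sketch builds the same object as the heredoc above and prints it as JSON, which kubectl also accepts. The user names are hypothetical.

```python
import json


def psp_admin_binding(user: str) -> dict:
    """Build the psp-admin ClusterRoleBinding for one user."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": f"psp-admin:{user}"},
        "roleRef": {
            "kind": "ClusterRole",
            "name": "psp-admin",
            "apiGroup": "rbac.authorization.k8s.io",
        },
        "subjects": [{"kind": "User", "name": user}],
    }


for user in ["alice", "bob"]:  # hypothetical user names
    print(json.dumps(psp_admin_binding(user)))
```

Each printed line is a complete manifest that can be saved to a file and passed to kubectl create -f.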

Default Pod security policies in MKE¶

By default, the two Pod security policies defined within MKE are privileged and unprivileged. Additionally, to ensure backward compatibility after an upgrade, there is a ClusterRoleBinding that gives every user access to the privileged policy. By default, any user can create any Pod.

Note

PSPs do not override security defaults built into the MKE RBAC engine for Kubernetes Pods. These security defaults prevent non-admin users from mounting host paths into Pods or starting privileged Pods.

To review the default PSPs:

kubectl get podsecuritypolicies

Expected output:

NAME           PRIV    CAPS   SELINUX    RUNASUSER   FSGROUP    SUPGROUP   READONLYROOTFS   VOLUMES
privileged     true    *      RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            *
unprivileged   false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            *

The following specification is for the privileged policy:

allowPrivilegeEscalation: true
allowedCapabilities:
- '*'
fsGroup:
  rule: RunAsAny
hostIPC: true
hostNetwork: true
hostPID: true
hostPorts:
- max: 65535
  min: 0
privileged: true
runAsUser:
  rule: RunAsAny
seLinux:
  rule: RunAsAny
supplementalGroups:
  rule: RunAsAny
volumes:
- '*'

The following specification is for the unprivileged policy:

allowPrivilegeEscalation: false
allowedHostPaths:
- pathPrefix: /dev/null
  readOnly: true
fsGroup:
  rule: RunAsAny
hostPorts:
- max: 65535
  min: 0
runAsUser:
  rule: RunAsAny
seLinux:
  rule: RunAsAny
supplementalGroups:
  rule: RunAsAny
volumes:
- '*'

The privileged options include pods with any of the following defined in the PodTemplate:

Privileged option	Description
`PodSpec.hostIPC`	Prevents users from deploying a pod in the host IPC namespace.
`PodSpec.hostNetwork`	Prevents users from deploying a pod in the host network namespace.
`PodSpec.hostPID`	Prevents users from deploying a pod in the host PID namespace.
`SecurityContext.allowPrivilegeEscalation`	Prevents a child process of a container from gaining more privileges than its parent.
`SecurityContext.capabilities`	Prevents users from adding Linux capabilities to a pod.
`SecurityContext.privileged`	Prevents users from deploying a privileged container.
`Volume.hostPath`	Prevents users from mounting a path from the host into the container. This can be a file, directory, or the Docker socket.

The privileged options also include persistent volumes that use the following storage class:

StorageClass	Description
`Local`	Prevents users from creating a persistent volume with the `Local` StorageClass. The `Local` `StorageClass` allows users to mount directories from the host into a pod. This could be a file, directory, or the Docker socket.

Note

If an administrator creates a persistent volume with the Local StorageClass, a non-administrator can consume this with a persistent volume claim.

If a user without a cluster-admin role tries to deploy a pod with any of these privileged options, an error similar to the following example displays:

Error from server (Forbidden): error when creating "pod.yaml":
pods "mypod" is forbidden: user "<user-id>" is not an admin
and does not have permissions to use privileged mode for
resource

Granting the cluster-admin ClusterRole to normal users does not allow them to deploy privileged pods.
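
Before you deploy as a non-admin user, it can help to check which of the privileged options listed above a Pod specification sets. The following Python sketch is a simplified illustration that flags only options set explicitly in the manifest. It does not reproduce the checks of the MKE RBAC engine, and the function name is hypothetical.

```python
def privileged_options(pod_spec: dict) -> list[str]:
    """List the privileged options from the table above that a Pod spec sets."""
    used = [
        f"PodSpec.{key}"
        for key in ("hostIPC", "hostNetwork", "hostPID")
        if pod_spec.get(key)
    ]
    for container in pod_spec.get("containers", []):
        ctx = container.get("securityContext", {})
        if ctx.get("allowPrivilegeEscalation"):
            used.append("SecurityContext.allowPrivilegeEscalation")
        if ctx.get("capabilities", {}).get("add"):
            used.append("SecurityContext.capabilities")
        if ctx.get("privileged"):
            used.append("SecurityContext.privileged")
    if any("hostPath" in volume for volume in pod_spec.get("volumes", [])):
        used.append("Volume.hostPath")
    return used


spec = {
    "hostNetwork": True,
    "containers": [{"name": "app", "securityContext": {"privileged": True}}],
    "volumes": [{"name": "sock", "hostPath": {"path": "/var/run/docker.sock"}}],
}
print(privileged_options(spec))
# ['PodSpec.hostNetwork', 'SecurityContext.privileged', 'Volume.hostPath']
```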

Use the unprivileged policy¶

To switch users from the privileged policy to the unprivileged policy, a cluster admin must remove the ClusterRoleBinding that links all users and service accounts to the privileged policy and then create a RoleBinding to link users to the alternate policy. A ClusterRole is already defined but must be assigned to the required users or teams. The procedures in this section apply both to the unprivileged policy and to any custom policy.

Note

When the ClusterRoleBinding is removed, cluster admins can still deploy Pods, and these Pods are deployed with the privileged policy. However, users or service accounts will be unable to deploy pods until the RoleBinding is created, because Kubernetes cannot determine which pod security policy to apply.

To apply the unprivileged policy:

The following steps must be performed by a cluster admin.

Remove the ClusterRoleBinding:

kubectl delete clusterrolebindings ucp:all:privileged-psp-role

List the existing ClusterRoles:

kubectl get clusterrole | grep psp

Expected output:

privileged-psp-role                                                    3h47m
unprivileged-psp-role                                                  3h47m

Define the user or team to whom you want to apply the ClusterRole:
```
USER=<user-name>
```

Create a RoleBinding that links the ClusterRole to the user or team:

cat <<EOF | kubectl create -f -
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: unprivileged-psp-role:$USER
  namespace: default
roleRef:
  kind: ClusterRole
  name: unprivileged-psp-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: User
  name: $USER
  namespace: default
EOF
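
Because a RoleBinding is namespace-scoped, the binding above only covers the default namespace. The following is a minimal sketch of a reusable helper that prints the same kind of manifest for any user and namespace. The function name `psp_rolebinding`, its argument order, and the example user and namespace are illustrative assumptions, not part of MKE. The User subject uses the standard `apiGroup` field.

```shell
# Print a RoleBinding manifest that lets one user use a PSP ClusterRole
# in one namespace. Name, argument order, and defaults are illustrative.
psp_rolebinding() {
  user="$1"
  namespace="${2:-default}"
  role="${3:-unprivileged-psp-role}"
  cat <<EOF
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ${role}:${user}
  namespace: ${namespace}
roleRef:
  kind: ClusterRole
  name: ${role}
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: User
  name: ${user}
  apiGroup: rbac.authorization.k8s.io
EOF
}

# Example (hypothetical user and namespace; requires cluster admin rights):
#   psp_rolebinding jane staging | kubectl create -f -
```

Printing the manifest rather than applying it directly makes it possible to review the output before piping it to kubectl.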

To verify the application of the unprivileged policy:

The following example must be performed by a regular user who is assigned to the unprivileged policy.

Deploy a basic nginx Pod:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: demopod
spec:
  containers:
    - name:  demopod
      image: nginx
EOF

Review the status of the Pod:

kubectl get pods

Example output:

NAME      READY   STATUS    RESTARTS   AGE
demopod   1/1     Running   0          10m

Review which policy is applied to the Pod. Retrieve the Pod with the -o json option of kubectl and extract the kubernetes.io/psp annotation from the JSON output with jq:
```
kubectl get pods demopod -o json | jq -r '.metadata.annotations."kubernetes.io/psp"'
```
Expected output:
```
unprivileged
```

Use the unprivileged policy in a Deployment¶

If you have disabled the privileged policy and created a RoleBinding to map a user to a new PSP, Kubernetes objects such as Deployments and DaemonSets will not be able to deploy Pods. This is because these objects create Pods using a ServiceAccount rather than the identity of the user who created the Deployment.

For this Deployment to be able to schedule Pods, the service account defined within the Deployment specification needs to be associated with a policy.
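
The walkthrough that follows inspects a Deployment named nginx. As a minimal sketch, assuming the non-admin user works in the default namespace, such a Deployment could be created as follows. The function name `create_demo_deployment` is illustrative only.

```shell
# Create a single-replica NGINX Deployment as the non-admin user.
# Until a PSP is bound to the service account that the Deployment uses,
# the Deployment is accepted but its ReplicaSet cannot create Pods.
create_demo_deployment() {
  kubectl create deployment nginx --image=nginx --namespace "${1:-default}"
}

# Example:
#   create_demo_deployment
```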

To review the status of an NGINX Deployment (the example output assumes a Deployment named nginx that a non-admin user created in the default namespace):

List your Deployments:

kubectl get deployments

Example output:

NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   0/1     0            0           88s

List your ReplicaSets:

kubectl get replicasets

Example output:

NAME              DESIRED   CURRENT   READY   AGE
nginx-cdcdd9f5c   1         0         0       92s

Describe your ReplicaSets:

kubectl describe replicasets nginx-cdcdd9f5c

Example output:

Warning  FailedCreate  48s (x15 over 2m10s)  \
replicaset-controller  Error creating: \
pods "nginx-cdcdd9f5c-" is forbidden: \
unable to validate against any pod security policy: []

If a service account is not defined within a Deployment specification, the default service account in the namespace is used. This is the case in the Deployment output above. Because there is no service account defined, a RoleBinding is needed to allow the default service account in the default namespace to use the PSP.

To associate the unprivileged policy with the default service account:

The following must be performed by a cluster admin.

Create a RoleBinding to associate the unprivileged policy with the default service account:

cat <<EOF | kubectl create -f -
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: unprivileged-psp-role:defaultsa
  namespace: default
roleRef:
  kind: ClusterRole
  name: unprivileged-psp-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
EOF

To verify the application of the unprivileged policy to the default service account:

List your Deployments:

kubectl get deployments

Example output:

NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   1/1     1            1           6m11s

List your ReplicaSets:

kubectl get replicasets

Example output:

NAME              DESIRED   CURRENT   READY   AGE
nginx-cdcdd9f5c   1         1         1       6m16s

List your Pods:

kubectl get pods

Example output:

NAME                    READY   STATUS    RESTARTS   AGE
nginx-cdcdd9f5c-9kknc   1/1     Running   0          6m17s

Review which policy is applied to the Pod that was created through the default service account. For example:

kubectl get pod nginx-cdcdd9f5c-9kknc  -o json | jq -r '.metadata.annotations."kubernetes.io/psp"'

Expected output:

unprivileged
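
In addition to reading the Pod annotation, a cluster admin can ask the API server directly whether a service account is authorized to use a given PSP. The following sketch relies on `kubectl auth can-i` with impersonation. The function name `sa_can_use_psp` is illustrative, and the PSP name `unprivileged` is taken from the annotation value shown above.

```shell
# Ask the API server whether a service account may "use" a given PSP.
# kubectl prints "yes" or "no"; impersonation (--as) requires admin rights.
sa_can_use_psp() {
  psp="$1"
  namespace="${2:-default}"
  sa="${3:-default}"
  kubectl auth can-i use "podsecuritypolicy/${psp}" \
    --as="system:serviceaccount:${namespace}:${sa}" \
    --namespace "${namespace}"
}

# Example:
#   sa_can_use_psp unprivileged default default
```

A `no` answer here, combined with a ReplicaSet that reports `unable to validate against any pod security policy`, suggests that the RoleBinding for the service account is missing or is in the wrong namespace.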

Apply the unprivileged PSP to a namespace¶

A common PSP use case is to apply a particular policy to one or more namespaces. For example, an admin might want to use the privileged policy for all of the infrastructure namespaces and use the unprivileged policy for the end user namespaces.

In the following example, infrastructure workloads are deployed in the kube-system and monitoring namespaces, while end user workloads are deployed in the default namespace.

To apply the privileged and unprivileged PSPs to different namespaces:

The following steps must be performed by a cluster admin.

Delete the ClusterRoleBinding that is applied by default in MKE:

kubectl delete clusterrolebindings ucp:all:privileged-psp-role

Create a new ClusterRoleBinding that will enforce the privileged PSP for all users and service accounts in the kube-system and monitoring namespaces:

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ucp:infrastructure:privileged-psp-role
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged-psp-role
subjects:
- kind: Group
  name: system:authenticated:kube-system
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:authenticated:monitoring
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:serviceaccounts:kube-system
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:serviceaccounts:monitoring
  apiGroup: rbac.authorization.k8s.io
EOF

Create a ClusterRoleBinding to allow all users who deploy Pods and Deployments in the default namespace to use the unprivileged policy:

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ucp:default:unprivileged-psp-role
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: unprivileged-psp-role
subjects:
- kind: Group
  name: system:authenticated:default
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:serviceaccounts:default
  apiGroup: rbac.authorization.k8s.io
EOF

To verify the application of the privileged and unprivileged policies:

Create two Pods, one in the monitoring namespace and the other in the default namespace:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: demopod
  namespace: monitoring
spec:
  containers:
    - name:  demopod
      image: nginx
---
apiVersion: v1
kind: Pod
metadata:
  name: demopod
  namespace: default
spec:
  containers:
    - name:  demopod
      image: nginx
EOF

Review which policy is applied to the monitoring namespace:

kubectl get pods demopod -n monitoring -o json | jq -r '.metadata.annotations."kubernetes.io/psp"'

Expected output:

privileged

Review which policy is applied to the default namespace:

kubectl get pods demopod -n default -o json | jq -r '.metadata.annotations."kubernetes.io/psp"'

Expected output:

unprivileged
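
The two checks above can be combined into one loop. The following is a minimal sketch that uses the kubectl `jsonpath` output format in place of jq, with the dots in the annotation key escaped. The function name `show_demopod_psp` is illustrative only.

```shell
# Print "<namespace>/demopod: <psp>" for the demopod in each given namespace.
show_demopod_psp() {
  for ns in "$@"; do
    psp="$(kubectl get pod demopod --namespace "$ns" \
      -o jsonpath='{.metadata.annotations.kubernetes\.io/psp}')"
    echo "${ns}/demopod: ${psp}"
  done
}

# Example:
#   show_demopod_psp monitoring default
```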

Reenable the privileged PSP for all users¶

To revert to the default MKE configuration in which all MKE users and service accounts use the privileged PSP, while signed in as a cluster admin, recreate the default ClusterRoleBinding:

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ucp:all:privileged-psp-role
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged-psp-role
subjects:
- kind: Group
  name: system:authenticated
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io
EOF

PSP examples¶

MKE admins or users with the appropriate permissions can create their own custom policies and attach them to MKE users or teams. This section highlights two use cases for custom PSPs. Because only one PSP can be applied to a Pod at a given time, combine the two use cases into a single policy if you need both. Note that there are more PSP use cases than those covered in this topic.

Use a PSP to exclude root users¶

You can create a PSP to prevent a user from deploying containers that run as the root user.

To create a policy that excludes root users:

Create a PSP that sets the runAsUser rule to MustRunAsNonRoot:

cat <<EOF | kubectl create -f -
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: norootcontainers
spec:
  allowPrivilegeEscalation: false
  allowedHostPaths:
  - pathPrefix: /dev/null
    readOnly: true
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - max: 65535
    min: 0
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'
EOF

If not done previously, remove the ClusterRoleBinding for the privileged policy:
```
kubectl delete clusterrolebindings ucp:all:privileged-psp-role
```

Create a ClusterRole that grants access to the new policy:

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: norootcontainers-psp-role
rules:
- apiGroups:
  - policy
  resourceNames:
  - norootcontainers
  resources:
  - podsecuritypolicies
  verbs:
  - use
EOF

Define a user to attach to the new policy:
```
USER=<user-name>
```

Create a RoleBinding that attaches the user to the ClusterRole:

cat <<EOF | kubectl create -f -
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: norootcontainers-psp-role:$USER
  namespace: default
roleRef:
  kind: ClusterRole
  name: norootcontainers-psp-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: User
  name: $USER
  namespace: default
EOF

To verify the application of the new policy:

Deploy a Pod that runs as a root user:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: demopod
spec:
  containers:
    - name:  demopod
      image: nginx
EOF

Review the status of the Pod:

kubectl get pods

Example output:

NAME      READY   STATUS                       RESTARTS   AGE
demopod   0/1     CreateContainerConfigError   0          37s

Describe the Pod:

kubectl describe pods demopod

Expected output:

<..>
 Error: container has runAsNonRoot and image will run as root
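
For contrast, a Pod that declares a non-root user should pass the same policy. The following is a sketch under stated assumptions: the Pod name `demopod-nonroot`, the busybox image, the UID 1000, and the function name `nonroot_pod_manifest` are all illustrative choices and do not come from the MKE documentation.

```shell
# Print a Pod manifest that runs as a non-root UID, which is intended to
# satisfy a PSP whose runAsUser rule is MustRunAsNonRoot.
nonroot_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demopod-nonroot
spec:
  securityContext:
    runAsUser: 1000   # any non-zero UID avoids running as root
  containers:
    - name: demopod-nonroot
      image: busybox
      command: ["sleep", "3600"]
EOF
}

# Example:
#   nonroot_pod_manifest | kubectl create -f -
```

The busybox image is used here because its sleep command does not need root privileges, whereas the stock nginx image starts as root.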

Use a PSP to apply seccomp policies¶

You can create a PSP to prevent a user from deploying containers that do not have a seccomp policy.

To create a PSP that enforces the use of a seccomp policy:

Create the new PSP:

cat <<EOF | kubectl create -f -
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: seccomppolicy
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'docker/default'
spec:
  allowPrivilegeEscalation: false
  allowedHostPaths:
  - pathPrefix: /dev/null
    readOnly: true
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - max: 65535
    min: 0
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'
EOF

If not done previously, remove the ClusterRoleBinding for the privileged policy:
```
kubectl delete clusterrolebindings ucp:all:privileged-psp-role
```

Create a ClusterRole that grants access to the new policy:

cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: applyseccompprofile-psp-role
rules:
- apiGroups:
  - policy
  resourceNames:
  - seccomppolicy
  resources:
  - podsecuritypolicies
  verbs:
  - use
EOF

Define a user to attach to the new policy:
```
USER=<user-name>
```

Create a RoleBinding that attaches the user to the ClusterRole:

cat <<EOF | kubectl create -f -
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: applyseccompprofile-psp-role:$USER
  namespace: default
roleRef:
  kind: ClusterRole
  name: applyseccompprofile-psp-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: User
  name: $USER
  namespace: default
EOF

To verify the application of the new policy:

If a user deploys, for example, an nginx Pod without specifying a seccomp policy in the Pod metadata, Kubernetes automatically applies the default policy that is defined in the PSP.

Deploy a Pod without applying a seccomp policy in the Pod metadata:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: demopod
spec:
  containers:
    - name:  demopod
      image: nginx
EOF

Review the status of the Pod:

kubectl get pods

Example output:

NAME      READY   STATUS    RESTARTS   AGE
demopod   1/1     Running   0          16s

Verify that the seccomp policy is applied automatically:

kubectl get pods demopod -o json | jq '.metadata.annotations."seccomp.security.alpha.kubernetes.io/pod"'

Expected output:

"docker/default"

Disable PSPs¶

You can disable the function of Pod Security Policies (PSPs) in MKE by making an update to the MKE configuration file.

Caution

Disabling PSPs will cause Pods to run without a seccomp policy, which enables the Pods to make system calls that were formerly blocked.

Obtain the current MKE configuration file for your cluster.
Set the cluster_config.policy_enforcement.pod_security_policy configuration parameter to "false". For more information, refer to cluster_config.policy_enforcement.
Optional but recommended. Enable the default seccomp policy for MKE Pods by setting the cluster_config.custom_kubelet_flags parameter to ["--feature-gates=SeccompDefault=true","--seccomp-default"].
Upload the new MKE configuration file. Be aware that the upload requires a wait time of at least five minutes.

Use admission controllers for access¶

MKE supports using a selective grant to allow a set of user and service accounts to use privileged attributes on Kubernetes Pods. This enables administrators to delegate scenarios that would ordinarily require an administrator or cluster-admin to execute. Such selective grants can be used to temporarily bypass restrictions on non-administrator accounts, as the changes can be reverted at any time.

The privileged attributes associated with user and service accounts are specified separately. It is only possible to specify one list of privileged attributes for user accounts and one list for service accounts.

The user accounts specified for access must be non-administrator users and the service accounts specified for access must not be bound to the cluster-admin role.

The following privileged attributes can be assigned using a selective grant:

Attribute	Description
`hostIPC`	Allows the Pod containers to share the host IPC namespace
`hostNetwork`	Allows the Pod to use the network namespace and network resources of the host node
`hostPID`	Allows the Pod containers to share the host process ID namespace
`hostBindMounts`	Allows the Pod containers to use directories and volumes mounted on the container host
`privileged`	Allows one or more Pod containers to run privileged, escalate privileges, or both
`kernelCapabilities`	Allows the addition of kernel capabilities to one or more Pod containers

The following Pod manifest demonstrates the use of several of the privileged attributes in a Pod:

Example Pod manifest¶

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: ubuntu
    command:
      - sleep
      - "36000"
    imagePullPolicy: IfNotPresent
    name: busybox
    securityContext:
      capabilities:
        add:
          - NET_ADMIN
        drop:
          - CHOWN
      privileged: false
      allowPrivilegeEscalation: true

  restartPolicy: Always

To configure privileged attributes for user and service account access:

Obtain the current MKE configuration file for your cluster.
In the [cluster_config] section of the MKE configuration file, specify the required privileged attributes for user accounts using the priv_attributes_allowed_for_user_accounts parameter.
Specify the associated user accounts with the priv_attributes_user_accounts parameter.
Specify the required privileged attributes for service accounts using the priv_attributes_allowed_for_service_accounts parameter.
Specify the associated service accounts with the priv_attributes_service_accounts parameter.
Upload the new MKE configuration file.

Example privileged attribute specification in the MKE configuration file:

priv_attributes_allowed_for_user_accounts = ["privileged"]
priv_attributes_user_accounts = ["Abby"]
priv_attributes_allowed_for_service_accounts = ["hostBindMounts", "hostIPC"]
priv_attributes_service_accounts = ["default:sa1"]

Create a service account for a Kubernetes app¶

Kubernetes uses service accounts to enable workload access control. A service account is an identity for processes that run in a Pod. When a process is authenticated through a service account, it can contact the API server and access cluster resources. Each namespace has a default service account, which is named default.

You provide a service account with access to cluster resources by creating a role binding, just as you do for users and teams.

This example illustrates how to create a service account and role binding used with an NGINX server.

To create a Kubernetes namespace:

Unlike user accounts, service accounts are scoped to a particular namespace, so you must first create a namespace for use with your service account.

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Kubernetes > Namespaces and click Create.
Leave the Namespace drop-down blank.

Paste the following in the Object YAML editor:

apiVersion: v1
kind: Namespace
metadata:
  name: nginx

Click Create.
Navigate to the nginx namespace.
Click the vertical ellipsis in the upper-right corner and click Set Context.

To create a service account:

In the left-side navigation panel, navigate to Kubernetes > Service Accounts and click Create.
In the Namespace drop-down, select nginx.

Paste the following in the Object YAML editor:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-service-account

Click Create.

There are now two service accounts associated with the nginx namespace: default and nginx-service-account.

To create a role binding:

To give the service account access to cluster resources, create a role binding with view permissions.

From the left-side navigation panel, navigate to Access Control > Grants.

Note

If Hide Swarm Navigation is selected on the <username> > Admin Settings > Tuning page, Grants will display as Role Bindings under the Access Control menu item.

In the Grants pane, select the Kubernetes tab and click Create Role Binding.
In the Subject pane, under SELECT SUBJECT TYPE, select Service Account.
In the Namespace drop-down, select nginx.
In the Service Account drop-down, select nginx-service-account and then click Next.
In the Resource Set pane, select the nginx namespace.
In the Role pane, under ROLE TYPE, select Cluster Role and then select view.
Click Create.

The nginx-service-account service account can now view resources in the nginx namespace.
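
For readers who prefer the command line, the UI procedure above roughly corresponds to the following kubectl commands. This is a sketch that assumes an MKE Kubernetes grant behaves like a standard RoleBinding. The function name `create_view_service_account` and the RoleBinding name `<service-account>-view` are illustrative and not defined by MKE.

```shell
# Create a namespace, a service account in it, and a RoleBinding that
# grants the built-in "view" ClusterRole to that service account.
create_view_service_account() {
  namespace="${1:-nginx}"
  sa="${2:-nginx-service-account}"
  kubectl create namespace "$namespace" &&
  kubectl create serviceaccount "$sa" --namespace "$namespace" &&
  kubectl create rolebinding "${sa}-view" \
    --clusterrole=view \
    --serviceaccount="${namespace}:${sa}" \
    --namespace "$namespace"
}

# Example:
#   create_view_service_account nginx nginx-service-account
```

Because the binding is a RoleBinding rather than a ClusterRoleBinding, the view permissions apply only inside the named namespace.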

Install an unmanaged CNI plugin¶

Calico provides MKE with secure networking functionality for container-to-container communication within Kubernetes. MKE manages the Calico lifecycle, packaging it at both the time of installation and upgrade, and fully supports its use with MKE.

MKE also supports the use of alternative, unmanaged CNI plugins available on Docker Hub. Mirantis can provide limited instruction on basic configuration, but for detailed guidance on third-party CNI components, you must refer to the external product documentation or support.

Consider the following limitations before implementing an unmanaged CNI plugin:

MKE only supports implementation of an unmanaged CNI plugin at install time.
MKE does not manage the version or configuration of alternative CNI plugins.
MKE does not upgrade or reconfigure alternative CNI plugins. To switch from the managed CNI to an unmanaged CNI plugin, or vice versa, you must uninstall and then reinstall MKE.

Install an unmanaged CNI plugin on MKE¶

Verify that your system meets all MKE requirements and third-party CNI plugin requirements.
Install MKE with the --unmanaged-cni flag:
```
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.6.16 install \
  --host-address <node-ip-address> \
  --unmanaged-cni \
  --interactive
```
MKE components that require Kubernetes networking will remain in the Pending or ContainerCreating state in Kubernetes until a CNI is installed. Once the installation is complete, you can access MKE from a web browser. Note that the manager node will be unhealthy, as the kubelet will report NetworkPluginNotReady. The metrics in the MKE dashboard will also be unavailable, as the metrics component runs in a Kubernetes Pod.
Download and configure the client bundle.

Review the status of the MKE components that run on Kubernetes:

kubectl get nodes

Example output:

NAME         STATUS     ROLES     AGE       VERSION
manager-01   NotReady   master    10m       v1.11.9-docker-1

kubectl get pods -n kube-system -o wide

Example output:

NAME                           READY     STATUS              RESTARTS   AGE       IP        NODE         NOMINATED NODE
compose-565f7cf9ff-gq2gv       0/1       Pending             0          10m       <none>    <none>       <none>
compose-api-574d64f46f-r4c5g   0/1       Pending             0          10m       <none>    <none>       <none>
kube-dns-6d96c4d9c6-8jzv7      0/3       Pending             0          10m       <none>    <none>       <none>
ucp-metrics-nwt2z              0/3       ContainerCreating   0          10m       <none>    manager-01   <none>

Install the unmanaged CNI plugin. Follow the CNI plugin documentation for specific installation instructions. The unmanaged CNI plugin install steps typically include:
1. Download the relevant upstream CNI binaries.
2. Place the CNI binaries in /opt/cni/bin.
3. Download the relevant CNI plugin Kubernetes Manifest YAML file.
4. Run kubectl apply -f <your-custom-cni-plugin>.yaml.

Caution

You must install the unmanaged CNI immediately after installing MKE and before joining any manager or worker nodes to the cluster.

Note

While troubleshooting a custom CNI plugin, you may want to review the kubelet logs. Connect to an MKE manager node and run docker logs ucp-kubelet.

Verify the MKE installation¶

Upon successful installation of the CNI plugin, the relevant MKE components will have a Running status once the pods have become available.

To review the status of the Kubernetes components:

kubectl get pods -n kube-system -o wide

Example output:

NAME                           READY     STATUS    RESTARTS   AGE       IP            NODE         NOMINATED NODE
compose-565f7cf9ff-gq2gv       1/1       Running   0          21m       10.32.0.2     manager-01   <none>
compose-api-574d64f46f-r4c5g   1/1       Running   0          21m       10.32.0.3     manager-01   <none>
kube-dns-6d96c4d9c6-8jzv7      3/3       Running   0          22m       10.32.0.5     manager-01   <none>
ucp-metrics-nwt2z              3/3       Running   0          22m       10.32.0.4     manager-01   <none>
weave-net-wgvcd                2/2       Running   0          8m        172.31.6.95   manager-01   <none>

Note

Weave Net serves as the CNI plugin for the above example. If you are using an alternative CNI plugin, verify its status in the output.
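
Rather than polling the node status by hand, you can block until the nodes report Ready after the CNI plugin is installed. The following sketch uses `kubectl wait`. The function name `wait_for_nodes_ready` and the default timeout of 300 seconds are illustrative assumptions.

```shell
# Block until every node reports the Ready condition, or until the
# timeout expires (kubectl then exits with a non-zero status).
wait_for_nodes_ready() {
  timeout="${1:-300s}"
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}

# Example:
#   wait_for_nodes_ready 600s && kubectl get pods -n kube-system -o wide
```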

Enable an unmanaged CNI for Windows Server nodes¶

When MKE is installed with --unmanaged-cni, the ucp-kube-proxy-win container on Windows nodes will not fully start, but will instead log the following suggestion in a loop:

example : [System.Environment]::SetEnvironmentVariable("CNINetworkName", "ElangoNet", [System.EnvironmentVariableTarget]::Machine)
example : [System.Environment]::SetEnvironmentVariable("CNISourceVip", "192.32.31.1", [System.EnvironmentVariableTarget]::Machine)

This occurs because kube-proxy requires more information to program routes for Kubernetes services.

To enable an unmanaged CNI for Windows Server nodes:

There are two options for supplying kube-proxy with the required information.

Deploy your own kube-proxy along with the CNI, as implemented by the kube-proxy manifest and documented in the Kubernetes 1.21 Windows Install Guide.
If using a VXLAN-based CNI, define the following variables:
- CNINetworkName must match the name of the Windows Kubernetes HNS network, which you can find either in the installation documentation for the third-party CNI or by using hnsdiag list networks.
- CNISourceVip must use the value of the source VIP for this node, which should be available in the installation documentation for the third-party CNI. Because the source VIP will be different for each node and can change across host reboots, Mirantis recommends setting this variable using a utility script.

The following is an example of how to define these variables using PowerShell:

[System.Environment]::SetEnvironmentVariable("CNINetworkName", "vxlan0", [System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable("CNISourceVip", "192.32.31.1", [System.EnvironmentVariableTarget]::Machine)

Kubernetes network encryption¶

MKE provides data-plane level IPSec network encryption to securely encrypt application traffic in a Kubernetes cluster. This secures application traffic within a cluster when running in untrusted infrastructure or environments. It is an optional feature of MKE that is enabled by deploying the SecureOverlay components on Kubernetes when using the default Calico driver for networking with the default IPIP tunneling configuration.

Kubernetes network encryption is enabled by two components in MKE:

SecureOverlay Agent
SecureOverlay Master

The SecureOverlay Agent is deployed as a per-node service that manages the encryption state of the data plane. The Agent controls the IPSec encryption on Calico IPIP tunnel traffic between different nodes in the Kubernetes cluster. The Master is deployed on an MKE manager node and acts as the key management process that configures and periodically rotates the encryption keys.

Kubernetes network encryption uses AES Galois Counter Mode (AES-GCM) with 128-bit keys by default.

You must deploy the SecureOverlay Agent and Master on MKE to enable encryption, as it is not enabled by default. You can enable or disable encryption at any time during the cluster lifecycle. However, be aware that enabling or disabling encryption can cause temporary traffic outages between Pods, lasting up to a few minutes. When enabled, Kubernetes Pod traffic between hosts is encrypted at the IPIP tunnel interface in the MKE host.

Kubernetes network encryption is supported on the following platforms:

Platform	Encryption support
MKE 3.1 and later	Yes
Kubernetes 1.11 and later	Yes
On-premises	Yes
AWS	Yes
GCE	Yes
All MKE-supported Linux OSes	Yes
Azure	No
Unmanaged CNI plugins	No

Configure maximum transmission units¶

The maximum transmission unit (MTU) is the largest packet length that a container will allow. Before deploying the SecureOverlay components, verify that Calico is configured so that the IPIP tunnel MTU leaves sufficient room for the encryption overhead. Encryption adds 26 bytes of overhead, and IPIP tunnels require a further 20 bytes of encapsulation overhead, for a total of 46 bytes. In addition, every IPSec packet size must be a multiple of 4 bytes. The IPIP tunnel interface MTU must therefore be no more than EXTMTU - 46 - ((EXTMTU - 46) modulo 4), where EXTMTU is the minimum MTU of the external interfaces. For example, with an EXTMTU of 1500, subtracting 46 gives 1454, and 1454 modulo 4 is 2, so the limit is 1452. An IPIP MTU of 1452 should generally be safe for most deployments.
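
The bound described above can be computed for any external MTU. The following is a minimal shell sketch of that formula. The function name `ipip_mtu` is illustrative and is not an MKE command.

```shell
# Largest safe IPIP tunnel MTU for a given external-interface MTU:
# subtract 46 bytes (26 for encryption plus 20 for IPIP encapsulation),
# then round down to a multiple of 4.
ipip_mtu() {
  extmtu="$1"
  room=$((extmtu - 46))
  echo $((room - room % 4))
}

ipip_mtu 1500   # prints 1452
```

On a Linux node, the MTU of an interface can be read from /sys/class/net/<interface>/mtu. Use the smallest value across the external interfaces of all nodes as the input.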

In the MKE configuration file, update the ipip_mtu parameter with the new MTU:

[cluster_config]
 ...
 ipip_mtu = "1452"
 ...

Configure SecureOverlay¶

Once the cluster node MTUs are properly configured, deploy the SecureOverlay components to MKE using either the MKE configuration file or the SecureOverlay YAML file.

To configure SecureOverlay using the MKE configuration file:

In the cluster_config table of the MKE configuration file, set the value of secure_overlay to true.

To configure SecureOverlay using the SecureOverlay YAML file:

Run the following procedure at the time of cluster installation, prior to starting any workloads.

Copy the contents of the SecureOverlay YAML file into a YAML file called ucp-secureoverlay.yml.

SecureOverlay YAML

# Cluster role for key management jobs
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ucp-secureoverlay-mgr
rules:
  - apiGroups: [""]
    resources:
      - secrets
    verbs:
      - get
      - update
---
# Cluster role binding for key management jobs
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ucp-secureoverlay-mgr
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ucp-secureoverlay-mgr
subjects:
- kind: ServiceAccount
  name: ucp-secureoverlay-mgr
  namespace: kube-system
---
# Service account for key management jobs
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ucp-secureoverlay-mgr
  namespace: kube-system
---
# Cluster role for secure overlay per-node agent
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ucp-secureoverlay-agent
rules:
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
---
# Cluster role binding for secure overlay per-node agent
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ucp-secureoverlay-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ucp-secureoverlay-agent
subjects:
- kind: ServiceAccount
  name: ucp-secureoverlay-agent
  namespace: kube-system
---
# Service account for secure overlay per-node agent
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ucp-secureoverlay-agent
  namespace: kube-system
---
# K8s secret of current key configuration
apiVersion: v1
kind: Secret
metadata:
  name: ucp-secureoverlay
  namespace: kube-system
type: Opaque
data:
  keys: ""
---
# DaemonSet for secure overlay per-node agent
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ucp-secureoverlay-agent
  namespace: kube-system
  labels:
    k8s-app: ucp-secureoverlay-agent
spec:
  selector:
    matchLabels:
      k8s-app: ucp-secureoverlay-agent
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: ucp-secureoverlay-agent
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      priorityClassName: system-node-critical
      terminationGracePeriodSeconds: 10
      serviceAccountName: ucp-secureoverlay-agent
      containers:
      - name: ucp-secureoverlay-agent
        image: docker/ucp-secureoverlay-agent:3.1.0
        securityContext:
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: ucp-secureoverlay
          mountPath: /etc/secureoverlay/
          readOnly: true
      volumes:
      - name: ucp-secureoverlay
        secret:
          secretName: ucp-secureoverlay
---
# Deployment for manager of the whole cluster (primarily to rotate keys)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ucp-secureoverlay-mgr
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ucp-secureoverlay-mgr
  replicas: 1
  template:
    metadata:
      name: ucp-secureoverlay-mgr
      namespace: kube-system
      labels:
        app: ucp-secureoverlay-mgr
    spec:
      serviceAccountName: ucp-secureoverlay-mgr
      restartPolicy: Always
      containers:
      - name: ucp-secureoverlay-mgr
        image: docker/ucp-secureoverlay-mgr:3.1.0

Download and configure the client bundle.
Enable network encryption:
```
kubectl apply -f ucp-secureoverlay.yml
```

Note

To remove network encryption from the system, issue the following command:

kubectl delete -f ucp-secureoverlay.yml

Persistent Kubernetes Storage¶

Use NFS Storage¶

You can provide persistent storage for MKE workloads by using NFS storage. When mounted into the running container, NFS shares provide state to the application, managing data external to the container lifecycle.

Note

The following subjects are out of the scope of this topic:

Provisioning an NFS server
Exporting an NFS share
Using external Kubernetes plugins to dynamically provision NFS shares

There are two different ways to mount existing NFS shares within Kubernetes Pods:

Define NFS shares within the Pod definitions. NFS shares are defined manually by each tenant when creating a workload.
Define NFS shares as a cluster object through PersistentVolumes, with the cluster object lifecycle handled separately from the workload. This is common for operators who want to define a range of NFS shares for tenants to request and consume.

Define NFS shares in the Pod definition¶

While defining workloads in Kubernetes manifest files, users can reference the NFS shares that they want to mount within the Pod specification for each Pod. This can be a standalone Pod or it can be wrapped in a higher-level object like a Deployment, DaemonSet, or StatefulSet.

The following example assumes a running MKE cluster and a downloaded client bundle with permission to schedule Pods in a namespace.

Create nfs-in-a-pod.yaml with the following content:

kind: Pod
apiVersion: v1
metadata:
  name: nfs-in-a-pod
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
        - name: nfs-volume
          mountPath: /var/nfs
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
    - name: nfs-volume
      nfs:
        server: nfs.example.com
        path: /share1

Change the value of mountPath to the location where you want the share to be mounted.
Change the value of server to your NFS server.
Change the value of path to the relevant share.
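
When several workloads reuse this manifest with different shares, the three values can be parameterized rather than edited by hand. The following is a minimal sketch, not an MKE feature: render_nfs_pod is a hypothetical helper name, and the script assumes a POSIX shell.

```shell
# Hypothetical helper (not part of MKE or kubectl).
# Usage: render_nfs_pod <nfs-server> <export-path> <mount-path>
# Prints the Pod manifest from this topic with the three values filled in.
render_nfs_pod() {
  cat <<EOF
kind: Pod
apiVersion: v1
metadata:
  name: nfs-in-a-pod
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
        - name: nfs-volume
          mountPath: $3
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
    - name: nfs-volume
      nfs:
        server: $1
        path: $2
EOF
}
```

To save the manifest with the example values, run `render_nfs_pod nfs.example.com /share1 /var/nfs > nfs-in-a-pod.yaml`.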

Create the Pod specification:
```
kubectl create -f nfs-in-a-pod.yaml
```

Verify that the Pod is created successfully:

kubectl get pods

Example output:

NAME                     READY     STATUS      RESTARTS   AGE
nfs-in-a-pod             1/1       Running     0          6m

To verify that the share is mounted correctly:
Access a shell prompt within the container:
```
kubectl exec -it nfs-in-a-pod -- sh
```
Verify that everything is correctly mounted by searching for your mount:
```
mount | grep nfs.example.com
```
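
A plain grep also matches lines that merely mention the server name. For a scripted check, the following minimal sketch filters on the source column of the mount output instead. The function name nfs_mount_points is hypothetical, and the sketch assumes the usual Linux `mount` output format of `<source> on <mount-point> type <fstype> (<options>)`.

```shell
# nfs_mount_points <server>
# Reads `mount` output on stdin and prints the mount point of every
# file system whose source starts with "<server>:".
nfs_mount_points() {
  awk -v srv="$1" 'index($1, srv ":") == 1 { print $3 }'
}
```

For example, `kubectl exec nfs-in-a-pod -- mount | nfs_mount_points nfs.example.com` prints the mount path when the share is mounted and nothing otherwise.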

Note

MKE and Kubernetes are unaware of the NFS share because it is defined as part of the Pod specification. As such, when you delete the Pod, the NFS share detaches from the cluster, though the data remains in the NFS share.

Expose NFS shares as a cluster object¶

This method uses the Kubernetes PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects to manage NFS share lifecycle and access.

You can define multiple shares for a tenant to use within the cluster. The PV is a cluster-wide object, so it can be pre-provisioned. A PVC is a claim by a tenant for using a PV within the tenant namespace.

To create PV objects at the cluster level, you will need a ClusterRoleBinding grant.

Note

The “NFS share lifecycle” refers to granting and removing the end user ability to consume NFS storage, rather than the lifecycle of the NFS server.

To define the PersistentVolume at the cluster level:

Create pvwithnfs.yaml with the following content:
```
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-nfs-share
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: nfs.example.com
    path: /share1
```
- The 5Gi storage size is used to match the volume to the tenant claim.
- The valid accessModes values for an NFS PV are:
  - ReadOnlyMany: the volume can be mounted as read-only by many nodes.
  - ReadWriteOnce: the volume can be mounted as read-write by a single node.
  - ReadWriteMany: the volume can be mounted as read-write by many nodes.
  The access mode in the PV definition is used to match a PV to a Claim. When a PV is defined and created inside of Kubernetes, a volume is not mounted. Refer to Access Modes for more information, including any changes to the valid accessModes.
- The valid persistentVolumeReclaimPolicy values are:
  - Retain
  - Recycle
  - Delete
  MKE uses the reclaim policy to define what the cluster does after a PV is released from a claim. Refer to Reclaiming in the official Kubernetes documentation for more information, including any changes to the valid persistentVolumeReclaimPolicy values.
- Change the value of server to your NFS server.
- Change the value of path to the relevant share.
Create the volume:
```
kubectl create -f pvwithnfs.yaml
```

Verify that the volume is created successfully:

kubectl get pv

Example output:

NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                       STORAGECLASS   REASON    AGE
my-nfs-share   5Gi        RWO            Recycle          Available                               slow                     7s
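
The ACCESS MODES column in the kubectl output uses abbreviations of the accessModes values. As a small illustration, the following sketch maps each abbreviation back to the full name used in the manifest. The function name access_mode_name is hypothetical, and only the three modes listed in this topic are covered.

```shell
# access_mode_name <abbreviation>
# Prints the accessModes value that corresponds to the abbreviation
# shown by `kubectl get pv`, or fails for anything else.
access_mode_name() {
  case "$1" in
    RWO) echo ReadWriteOnce ;;
    ROX) echo ReadOnlyMany ;;
    RWX) echo ReadWriteMany ;;
    *) echo "unknown access mode: $1" >&2; return 1 ;;
  esac
}
```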

To define a PersistentVolumeClaim:

A tenant can now “claim” a PV for use within their workloads by using a Kubernetes PVC. A PVC exists within a namespace and it attempts to match available PVs to the tenant request.

Create myapp-claim.yaml with the following content:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-nfs
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
To deploy this PVC, the tenant must have a RoleBinding that permits the creation of PVCs. If there is a PV that meets the tenant criteria, Kubernetes binds the PV to the claim. This does not, however, mount the share.

Create the PVC:

kubectl create -f myapp-claim.yaml

Expected output:

persistentvolumeclaim "myapp-nfs" created

Verify that the claim is created successfully:

kubectl get pvc

Example output:

NAME        STATUS    VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS   AGE
myapp-nfs   Bound     my-nfs-share   5Gi        RWO            slow           2s

Verify that the claim is associated with the PV:

kubectl get pv

Example output:

NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM              STORAGECLASS   REASON    AGE
my-nfs-share   5Gi        RWO            Recycle          Bound     default/myapp-nfs  slow                     4m
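
In a script, the same check can be made without reading the table by eye. The following minimal sketch extracts the STATUS column for one claim from the `kubectl get pvc` table output. The function name pvc_status is hypothetical, and the sketch assumes the default column layout shown above.

```shell
# pvc_status <claim-name>
# Reads `kubectl get pvc` table output on stdin and prints the STATUS
# column for the named claim. Prints nothing if the claim is not listed.
pvc_status() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}
```

For example, `kubectl get pvc | pvc_status myapp-nfs` prints Bound once the claim is bound. Querying the field directly with `kubectl get pvc myapp-nfs -o jsonpath='{.status.phase}'` avoids parsing the table at all.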

To define a workload:

The final task is to deploy a workload to consume the PVC. The PVC is defined within the Pod specification, which can be a standalone Pod or wrapped in a higher-level object such as a Deployment, DaemonSet, or StatefulSet.

Create myapp-pod.yaml with the following content:

kind: Pod
apiVersion: v1
metadata:
  name: pod-using-nfs
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
        - name: data
          mountPath: /var/nfs
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: myapp-nfs

Change the value of mountPath to the location where you want the share mounted.

Deploy the Pod:
```
kubectl create -f myapp-pod.yaml
```

Verify that the Pod is created successfully:

kubectl get pod

Example output:

NAME                     READY     STATUS      RESTARTS   AGE
pod-using-nfs            1/1       Running     0          1m

Access a shell prompt within the container:
```
kubectl exec -it pod-using-nfs -- sh
```
Verify that everything is correctly mounted by searching for your mount:
```
mount | grep nfs.example.com
```

See also

Kubernetes Pods in the official Kubernetes documentation

Use Azure Disk Storage¶

You can provide persistent storage for MKE workloads on Microsoft Azure by using Azure Disk Storage. You can either pre-provision Azure Disk Storage to be consumed by Kubernetes Pods, or you can use the Azure Kubernetes integration to dynamically provision Azure Disks as needed.

This guide assumes that you have already provisioned an MKE environment on Microsoft Azure and that you have provisioned a cluster after meeting all of the prerequisites listed in Install MKE on Azure.

To complete the steps in this topic, you must download and configure the client bundle.

Manually provision Azure Disks¶

You can use existing Azure Disks or manually provision new ones to provide persistent storage for Kubernetes Pods. You can manually provision Azure Disks in the Azure Portal, using ARM Templates, or using the Azure CLI. The following example uses the Azure CLI to manually provision an Azure Disk.

Create an environment variable for your resource group, replacing myresourcegroup with its name:
```
RG=myresourcegroup
```

Provision an Azure Disk:

az disk create \
--resource-group $RG \
--name k8s_volume_1  \
--size-gb 20 \
--query id \
--output tsv

This command returns the Azure ID of the Azure Disk Object.

Example output:

/subscriptions/<subscriptionID>/resourceGroups/<resourcegroup>/providers/Microsoft.Compute/disks/<diskname>

Make note of the Azure ID of the Azure Disk Object returned by the previous step.
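
The disk name and resource group are both embedded in the Azure ID, so they can be recovered from it rather than tracked separately. The following is a minimal sketch using POSIX parameter expansion. The function names are hypothetical, and the sketch assumes the ID follows the layout shown in the example output.

```shell
# disk_name_from_id <azure-id>
# Prints the last path segment of the ID, which is the disk name.
disk_name_from_id() {
  printf '%s\n' "${1##*/}"
}

# resource_group_from_id <azure-id>
# Prints the path segment that follows "/resourceGroups/".
resource_group_from_id() {
  rest="${1#*/resourceGroups/}"
  printf '%s\n' "${rest%%/*}"
}
```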

You can now create Kubernetes Objects that refer to this Azure Disk. The following example uses a Kubernetes Pod, though the same Azure Disk syntax can be used for DaemonSets, Deployments, and StatefulSets. In the example, the Azure diskName and diskURI refer to the manually created Azure Disk:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: mypod-azuredisk
spec:
  containers:
  - image: nginx
    name: mypod
    volumeMounts:
      - name: mystorage
        mountPath: /data
  volumes:
      - name: mystorage
        azureDisk:
          kind: Managed
          diskName: k8s_volume_1
          diskURI: /subscriptions/<subscriptionID>/resourceGroups/<resourcegroup>/providers/Microsoft.Compute/disks/<diskname>
EOF

Dynamically provision Azure Disks¶

Kubernetes can dynamically provision Azure Disks using the Azure Kubernetes integration, configured at the time of your MKE installation. For Kubernetes to determine which APIs to use when provisioning storage, you must create Kubernetes StorageClass objects specific to each storage backend.

There are two different Azure Disk types that can be consumed by Kubernetes: Azure Disk Standard Volumes and Azure Disk Premium Volumes.

Depending on your use case, you can deploy one or both of the Azure Disk storage classes.

To define the Azure Disk storage class:

Create the storage class:

cat <<EOF | kubectl create -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: <disk-type>
  kind: Managed
EOF

For storageaccounttype, enter Standard_LRS for the standard storage class or Premium_LRS for the premium storage class. For the premium storage class, also change the value of name to premium.
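
To create both storage classes from one template, the name and the disk type can be passed as parameters. The following is a minimal sketch: render_azure_disk_sc is a hypothetical helper, and it accepts only the two storageaccounttype values named in this topic.

```shell
# render_azure_disk_sc <class-name> <storageaccounttype>
# Prints a StorageClass manifest, or fails if the disk type is not one
# of the two values described in this topic.
render_azure_disk_sc() {
  case "$2" in
    Standard_LRS|Premium_LRS) ;;
    *) echo "unsupported storageaccounttype: $2" >&2; return 1 ;;
  esac
  cat <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: $1
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: $2
  kind: Managed
EOF
}
```

For example, `render_azure_disk_sc premium Premium_LRS | kubectl create -f -` creates the premium storage class.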

Verify which storage classes have been provisioned:

kubectl get storageclasses

Example output:

NAME       PROVISIONER                AGE
premium    kubernetes.io/azure-disk   1m
standard   kubernetes.io/azure-disk   1m

To create an Azure Disk with a PersistentVolumeClaim:

After you create a storage class, you can use Kubernetes Objects to dynamically provision Azure Disks. This is done using Kubernetes PersistentVolumeClaims.

The following example uses the standard storage class and creates a 5 GiB Azure Disk. Alter these values to fit your use case.

Create a PersistentVolumeClaim:

cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: azure-disk-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF

Verify the creation of the PersistentVolumeClaim:

kubectl get persistentvolumeclaim

Example output:

NAME              STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
azure-disk-pvc    Bound     pvc-587deeb6-6ad6-11e9-9509-0242ac11000b   5Gi        RWO            standard       1m

Verify the creation of the PersistentVolume:

kubectl get persistentvolume

Example output:

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                     STORAGECLASS   REASON    AGE
pvc-587deeb6-6ad6-11e9-9509-0242ac11000b   5Gi        RWO            Delete           Bound     default/azure-disk-pvc    standard                 3m

Verify the creation of a new Azure Disk in the Azure Portal.

To attach the new Azure Disk to a Kubernetes Pod:

You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The disk can be consumed by any Kubernetes object type, including a Deployment, DaemonSet, or StatefulSet. However, the following example simply mounts the PersistentVolume into a standalone Pod.

Attach the new Azure Disk to a Kubernetes pod:

cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
  name: mypod-dynamic-azuredisk
spec:
  containers:
    - name: mypod
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: azure-disk-pvc
EOF

Data disk capacity of an Azure Virtual Machine¶

Azure limits the number of data disks that can be attached to each Virtual Machine. Refer to Azure Virtual Machine Sizes for this information. Kubernetes prevents Pods from deploying on Nodes that have reached their maximum Azure Disk Capacity. In such cases, Pods remain stuck in the ContainerCreating status, as demonstrated in the following example:

Review Pods:

kubectl get pods

Example output:

NAME                  READY     STATUS              RESTARTS   AGE
mypod-azure-disk      0/1       ContainerCreating   0          4m

Describe the Pod to display troubleshooting logs, which indicate the node has reached its capacity:

kubectl describe pods mypod-azure-disk

Example output:

Warning  FailedAttachVolume  7s (x11 over 6m)  attachdetach-controller  \
AttachVolume.Attach failed for volume "pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" : \
Attach volume "kubernetes-dynamic-pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" to instance \
"/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachines/worker-03" \
failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: \
StatusCode=409 -- Original Error: failed request: autorest/azure: \
Service returned an error. Status=<nil> Code="OperationNotAllowed" \
Message="The maximum number of data disks allowed to be attached to a VM of this size is 4." \
Target="dataDisks"

See also

Kubernetes Pods in the official Kubernetes documentation
Azure Kubernetes in the official Microsoft Azure documentation

Use Azure Files Storage¶

You can provide persistent storage for MKE workloads on Microsoft Azure by using Azure Files. You can either pre-provision Azure Files shares to be consumed by Kubernetes Pods, or you can use the Azure Kubernetes integration to dynamically provision Azure Files shares as needed.

To complete the steps in this topic, you must download and configure the client bundle.

Manually provision Azure Files shares¶

You can use existing Azure Files shares or manually provision new ones to provide persistent storage for Kubernetes Pods. You can manually provision Azure Files shares in the Azure Portal, using ARM Templates, or using the Azure CLI. The following example uses the Azure CLI to manually provision an Azure Files share.

To manually provision an Azure Files share:

Note

The Azure Kubernetes driver does not support Azure Storage accounts created using Azure Premium Storage.

Create an Azure Storage account:

Create the following environment variables, replacing <region> with the required region:
```
REGION=<region>
SA=mystorageaccount
RG=myresourcegroup
```

Create the Azure Storage account:

az storage account create \
--name $SA \
--resource-group $RG \
--location $REGION \
--sku Standard_LRS

Provision an Azure Files share:

Create the following environment variables, adjusting the size of the share (in GiB) to meet your requirements:
```
FS=myfileshare
SIZE=5
```

Obtain the Azure connection string, which you can also obtain from the Azure Portal:

export AZURE_STORAGE_CONNECTION_STRING=$(az storage account show-connection-string --name $SA --resource-group $RG -o tsv)

Provision the Azure Files share:

az storage share create \
--name $FS \
--quota $SIZE \
--connection-string $AZURE_STORAGE_CONNECTION_STRING

To configure a Kubernetes Secret:

After creating an Azure Files share, you must load the Azure Storage account access key into MKE as a Kubernetes Secret. This provides access to the file share when Kubernetes attempts to mount the share into a Pod. You can find this key either in the Azure Portal or by using the Azure CLI, as in the following example.

Create the following environment variables, if you have not done so already:
```
SA=mystorageaccount
RG=myresourcegroup
FS=myfileshare
```

Obtain the Azure Storage account access key, which you can also obtain from the Azure Portal:

STORAGE_KEY=$(az storage account keys list --resource-group $RG --account-name $SA --query "[0].value" -o tsv)

Load the Azure Storage account access key into MKE as a Kubernetes Secret:

kubectl create secret generic azure-secret \
--from-literal=azurestorageaccountname=$SA \
--from-literal=azurestorageaccountkey=$STORAGE_KEY
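
If you manage cluster objects as manifests, the same Secret can be expressed declaratively. The following is a minimal sketch rather than a required step: render_azure_secret is a hypothetical helper, and it relies on the stringData field of the Kubernetes Secret API, which accepts plain-text values so that no manual base64 encoding is needed.

```shell
# render_azure_secret <storage-account-name> <storage-account-key>
# Prints a Secret manifest equivalent to the kubectl command above.
# Values are double-quoted so that YAML treats them as strings.
render_azure_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: azure-secret
type: Opaque
stringData:
  azurestorageaccountname: "$1"
  azurestorageaccountkey: "$2"
EOF
}
```

For example, `render_azure_secret "$SA" "$STORAGE_KEY" | kubectl create -f -`. Because the output contains the access key in plain text, avoid writing it to a file that is stored in version control.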

To mount the Azure Files share into a Kubernetes Pod:

The following example creates a standalone Kubernetes Pod, though you can use the same syntax to create DaemonSets, Deployments, and StatefulSets.

Create the following environment variable:
```
FS=myfileshare
```

Mount the Azure Files share into a Kubernetes Pod:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: mypod-azurefile
spec:
  containers:
  - image: nginx
    name: mypod
    volumeMounts:
      - name: mystorage
        mountPath: /data
  volumes:
  - name: mystorage
    azureFile:
      secretName: azure-secret
      shareName: $FS
      readOnly: false
EOF

Dynamically provision Azure Files shares¶

Kubernetes can dynamically provision Azure Files shares using the Azure Kubernetes integration, configured at the time of your MKE installation. For Kubernetes to determine which APIs to use when provisioning storage, you must create Kubernetes StorageClass objects specific to each storage backend.

Note

The Azure Kubernetes plugin only supports using the Standard StorageClass. File shares that use the Premium StorageClass will fail to mount.

To define the Azure Files StorageClass:

Create the storage class:

cat <<EOF | kubectl create -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/azure-file
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=1000
  - gid=1000
parameters:
  skuName: Standard_LRS
  storageAccount: <existingstorageaccount> # Optional
  location: <existingstorageaccountlocation> # Optional
EOF

Verify which storage classes have been provisioned:

kubectl get storageclasses

Example output:

NAME       PROVISIONER                AGE
standard   kubernetes.io/azure-file   1m

To create an Azure Files share using a PersistentVolumeClaim:

After you create a storage class, you can use Kubernetes Objects to dynamically provision Azure Files shares. This is done using Kubernetes PersistentVolumeClaims.

Kubernetes uses an existing Azure Storage account, if one exists inside of the Azure Resource Group. If an Azure Storage account does not exist, Kubernetes creates one.

The following example uses the standard storage class and creates a 5 GiB Azure Files share. Alter these values to fit your use case.

Create a PersistentVolumeClaim:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-file-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: standard
  resources:
    requests:
      storage: 5Gi
EOF

Verify the creation of the PersistentVolumeClaim:

kubectl get pvc

Example output:

NAME             STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
azure-file-pvc   Bound     pvc-f7ccebf0-70e0-11e9-8d0a-0242ac110007   5Gi        RWX            standard       22s

Verify the creation of the PersistentVolume:

kubectl get pv

Example output:

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                    STORAGECLASS   REASON    AGE
pvc-f7ccebf0-70e0-11e9-8d0a-0242ac110007   5Gi        RWX            Delete           Bound     default/azure-file-pvc   standard                 2m

To attach the new Azure Files share to a Kubernetes Pod:

You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The file share can be consumed by any Kubernetes object type, including a Deployment, DaemonSet, or StatefulSet. However, the following example simply mounts the PersistentVolume into a standalone Pod.

Attach the new Azure Files share to a Kubernetes Pod:

cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: azure-file-pvc
EOF

Troubleshoot Azure Files shares¶

When creating a PersistentVolumeClaim, the volume can get stuck in a Pending state if the persistent-volume-binder service account does not have the relevant Kubernetes RBAC permissions.

To resolve this issue:

Review the status of the PVC:

kubectl get pvc

Example output:

NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
azure-file-pvc   Pending                                      standard       32s

Describe the PVC:

kubectl describe pvc azure-file-pvc

The storage account creates a Kubernetes Secret to store the Azure Files storage account key. If the persistent-volume-binder service account does not have the correct permissions, a warning such as the following will display:

Warning    ProvisioningFailed  7s (x3 over 37s)  persistentvolume-controller
Failed to provision volume with StorageClass "standard": Couldn't create secret
secrets is forbidden: User "system:serviceaccount:kube-system:persistent-volume-binder"
cannot create resource "secrets" in API group "" in the namespace "default": access denied

Grant the persistent-volume-binder service account the relevant RBAC permissions by creating the following RBAC ClusterRoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    subjectName: kube-system-persistent-volume-binder
  name: kube-system-persistent-volume-binder:cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: persistent-volume-binder
  namespace: kube-system

See also

Kubernetes Pods in the official Kubernetes documentation
Azure Kubernetes in the official Microsoft Azure documentation

Use AWS EBS Storage¶

You can use AWS volumes as the persistent storage for your application by using Kubernetes to deploy AWS Elastic Block Store (EBS). Before you can use EBS volumes, you must configure MKE to use the AWS infrastructure.

Configure AWS infrastructure for Kubernetes¶

To configure the AWS infrastructure:

Configure the following AWS Identity and Access Management (IAM) master and worker node permissions, as doing so is required to provision EBS volumes using Kubernetes PersistentVolumeClaims:

IAM permission	Master	Worker
ec2:DescribeInstances	Yes	Yes
ec2:AttachVolume	Yes	Yes
ec2:DetachVolume	Yes	Yes
ec2:DescribeVolumes	Yes	Yes
ec2:DescribeSecurityGroups	Yes	Yes
ec2:CreateVolume	Yes	No
ec2:DeleteVolume	Yes	No
ec2:CreateTags	Yes	No
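
The table above can be turned into one IAM policy document per node role. The following is a minimal sketch and makes two assumptions that are not stated in this topic: the policy allows the actions on all resources (`"Resource":"*"`), and the helper names iam_actions and render_iam_policy are hypothetical. Narrow the resource scope to suit your own security requirements.

```shell
# iam_actions <master|worker>
# Prints one IAM action per line, following the permission table.
iam_actions() {
  # Actions required on both master and worker nodes.
  printf '%s\n' ec2:DescribeInstances ec2:AttachVolume ec2:DetachVolume \
    ec2:DescribeVolumes ec2:DescribeSecurityGroups
  # Actions required on master nodes only.
  if [ "$1" = master ]; then
    printf '%s\n' ec2:CreateVolume ec2:DeleteVolume ec2:CreateTags
  fi
}

# render_iam_policy <master|worker>
# Prints a single-statement IAM policy document as one line of JSON.
render_iam_policy() {
  actions=$(iam_actions "$1" | sed 's/.*/"&"/' | paste -s -d, -)
  printf '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":[%s],"Resource":"*"}]}\n' "$actions"
}
```

For example, `render_iam_policy worker > worker-policy.json` writes the worker policy to a file that you can attach to the worker instance role.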

Set the host name of the EC2 instances to the private DNS host name of the instance.
Change the system host name so that it does not use a public DNS name.
Label the EC2 instances using the key KubernetesCluster and assign the same value across all nodes, for example, MKEKubernetesCluster.
Configure your cluster for use with AWS volumes. Select from the following options:
- In a new cluster during installation, issue the following cloud provider flag: --cloud-provider=aws.
- In an existing cluster:
  1. Update the MKE configuration file as follows:
    [cluster_config]
      ...
      cloud_provider = "aws"
  2. Update ucp-agent to propagate the new configuration.

Deploy AWS EBS volumes¶

You can now create PersistentVolumes (PVs) that deploy EBS volumes that are attached to hosts and mounted inside Pods. The EBS volumes are provisioned dynamically such that they are created, attached, and destroyed according to the life cycle of the PVs. Users do not need direct access to AWS, as they request the required resources directly using Kubernetes primitives.

Mirantis recommends that you use the StorageClass and PersistentVolumeClaim resources, as these abstraction layers provide more portability and control over the storage layer across environments.

To deploy an AWS EBS volume:

Create a StorageClass to map a standard class of storage to the gp2 storage type in AWS EBS:

cat <<EOF | kubectl create -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
EOF

Create a PersistentVolumeClaim (PVC) that makes a request for 1Gi of storage from the standard storage class:

cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: task-pv-claim
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

Deploy a Pod that mounts the PVC, using the following Pod specification:

cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
EOF

Verify that the PV is created and bound to the PVC:

kubectl get pv

Example output:

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                   STORAGECLASS   REASON    AGE
pvc-751c006e-a00b-11e8-8007-0242ac110012   1Gi        RWO            Retain           Bound     default/task-pv-claim   standard                 3h

Verify that the AWS console indicates that a volume has been provisioned with a matching name, a type of gp2, and a size of 1Gi.

Configure iSCSI¶

Internet Small Computer System Interface (iSCSI) is an IP-based standard that provides block-level access to storage devices. iSCSI receives requests from clients and fulfills them on remote SCSI devices. iSCSI support in MKE enables Kubernetes workloads to consume persistent storage from iSCSI targets.

Note

MKE does not support using iSCSI with Windows clusters.

Note

Challenge-Handshake Authentication Protocol (CHAP) secrets are supported for both iSCSI discovery and session management.

iSCSI components¶

The iSCSI initiator is any client that consumes storage and sends iSCSI commands. In an MKE cluster, the iSCSI initiator must be installed and running on any node where Pods can be scheduled. Configuration, target discovery, logging in, and logging out of a target are performed primarily by two software components: iscsid (service) and iscsiadm (CLI tool).

These two components are typically packaged as part of open-iscsi on Debian systems and iscsi-initiator-utils on RHEL, CentOS, and Fedora systems.

iscsid is the iSCSI initiator daemon and implements the control path of the iSCSI protocol. It communicates with iscsiadm and kernel modules.
iscsiadm is a CLI tool that allows discovery, login to iSCSI targets, session management, and access and management of the open-iscsi database.

The iSCSI target is any server that shares storage and receives iSCSI commands from an initiator.

Note

iSCSI kernel modules implement the data path. The most common modules used across Linux distributions are scsi_transport_iscsi.ko, libiscsi.ko, and iscsi_tcp.ko. These modules need to be loaded on the host for proper functioning of the iSCSI initiator.
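
Before configuring the initiator, it can be useful to confirm which of these modules are loaded on a node. The following is a minimal sketch: missing_iscsi_modules is a hypothetical helper that reads `lsmod` output, and it assumes the modules are built as loadable modules, since drivers compiled into the kernel are not listed by lsmod.

```shell
# missing_iscsi_modules
# Reads `lsmod` output on stdin and prints each of the three iSCSI
# modules named in this topic that is not currently loaded.
missing_iscsi_modules() {
  # The first line of lsmod output is a header, so skip it.
  loaded=$(awk 'NR > 1 { print $1 }')
  for mod in scsi_transport_iscsi libiscsi iscsi_tcp; do
    printf '%s\n' "$loaded" | grep -qxF "$mod" || printf '%s\n' "$mod"
  done
}
```

For example, `lsmod | missing_iscsi_modules` prints nothing when all three modules are loaded.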

Prerequisites¶

Complete hardware and software configuration of the iSCSI storage provider. There is no significant demand for RAM and disk when running external provisioners in MKE clusters. For setup information specific to a storage vendor, refer to the vendor documentation.
Configure kubectl on your clients.
Make sure that the iSCSI server is accessible to MKE worker nodes.

Configure an iSCSI target¶

An iSCSI target can run on dedicated, stand-alone hardware, or can be configured in a hyper-converged manner to run alongside container workloads on MKE nodes. To provide access to the storage device, configure each target with one or more logical unit numbers (LUNs).

iSCSI targets are specific to the storage vendor. Refer to the vendor documentation for setup instructions, including applicable RAM and disk space requirements, and then expose the targets to the MKE cluster.

To expose iSCSI targets to the MKE cluster:

If necessary for access control, configure the target with client iSCSI qualified names (IQNs).
Configure CHAP secrets for authentication.
Make sure that each iSCSI LUN is accessible by all nodes in the cluster. Configure the iSCSI service to expose storage as an iSCSI LUN to all nodes in the cluster. You can do this by allowing all MKE nodes, and along with them the IQNs, to join the target ACL list.

Configure a generic iSCSI initiator¶

Every Linux distribution packages the iSCSI initiator software in a particular way. Follow the instructions specific to the storage provider, using the following steps as a guideline.

Prepare all MKE nodes by installing OS-specific iSCSI packages and loading the necessary iSCSI kernel modules. In the following example, scsi_transport_iscsi.ko and libiscsi.ko are pre-loaded by the Linux distribution. The iscsi_tcp kernel module must be loaded with a separate command.
- For CentOS or Red Hat:
```
sudo yum install -y iscsi-initiator-utils
sudo modprobe iscsi_tcp
```
- For Ubuntu:
```
sudo apt install open-iscsi
sudo modprobe iscsi_tcp
```
Set up MKE nodes as iSCSI initiators. Configure initiator names for each node, using the format InitiatorName=iqn.<YYYY-MM.reverse.domain.name:OptionalIdentifier>:
```
sudo sh -c 'echo "InitiatorName=iqn.<YYYY-MM.reverse.domain.name:OptionalIdentifier>" > /etc/iscsi/<initiatorname>.iscsi'
sudo systemctl restart iscsid
```
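
A malformed initiator name is easy to miss until login to the target fails, so it can help to check the name before writing it to the node. The following is a minimal sketch of such a check: is_valid_iqn is a hypothetical helper, and the regular expression is an approximation of the iqn.YYYY-MM.reverse.domain.name:OptionalIdentifier format, not a full implementation of the iSCSI naming rules.

```shell
# is_valid_iqn <name>
# Succeeds if the name looks like iqn.YYYY-MM.reverse.domain[:identifier].
# The month must be 01 through 12, and the domain part is limited to
# lowercase letters, digits, dots, and hyphens.
is_valid_iqn() {
  printf '%s\n' "$1" |
    grep -Eq '^iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:.+)?$'
}
```

For example, `is_valid_iqn iqn.2017-10.local.example.server:disk1 && echo ok` prints ok.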

Configure MKE¶

Update the MKE configuration file with the following options:

Configure --storage-iscsi=true to enable iSCSI-based PersistentVolumes (PVs) in Kubernetes.
Configure --iscsiadm-path=<path> to specify the absolute path of the iscsiadm binary on the host. The default value is /usr/sbin/iscsiadm.
Configure --iscsidb-path=<path> to specify the path of the iSCSI database on the host. The default value is /etc/iscsi.

Configure in-tree iSCSI volumes¶

The Kubernetes in-tree iSCSI plugin only supports static provisioning, for which you must:

Verify that the desired iSCSI LUNs are pre-provisioned in the iSCSI targets.
Create iSCSI PV objects, which correspond to the pre-provisioned LUNs with the appropriate iSCSI configuration. As PersistentVolumeClaims (PVCs) are created to consume storage, the iSCSI PVs bind to the PVCs and satisfy the request for persistent storage.

To configure in-tree iSCSI volumes:

Create a YAML file for the PersistentVolume object based on the following example:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi-pv
spec:
  capacity:
    storage: 12Gi
  accessModes:
    - ReadWriteOnce
  iscsi:
     targetPortal: 192.0.2.100:3260
     iqn: iqn.2017-10.local.example.server:disk1
     lun: 0
     fsType: 'ext4'
     readOnly: false

Make the following changes using information appropriate for your environment:
- Replace 12Gi with the size of the storage available.
- Replace 192.0.2.100:3260 with the IP address and port number of the iSCSI target in your environment. Refer to the storage provider documentation for port information.
- Replace iqn.2017-10.local.example.server:disk1 with the IQN of the iSCSI target that exposes the LUN, using the format iqn.YYYY-MM.reverse.domain.name:OptionalIdentifier. In the Kubernetes iscsi volume specification, the iqn field identifies the target rather than the initiator. The initiator, which in this case is the MKE worker node, has its own IQN, and each MKE worker must have a unique IQN.

Create the PersistentVolume:

kubectl create -f pv-iscsi.yml

Expected output:

persistentvolume/iscsi-pv created

External provisioner and Kubernetes objects¶

An external provisioner is a piece of software running out of process from Kubernetes that is responsible for creating and deleting PVs. External provisioners monitor the Kubernetes API server for PV claims and create PVs accordingly.

When using an external provisioner, you must perform the following additional steps:

Configure external provisioning based on your storage provider. Refer to your storage provider documentation for deployment information.
Define storage classes. Refer to your storage provider dynamic provisioning documentation for configuration information.
Define a PVC and a Pod. When you define a PVC to use the storage class, a PV is created and bound.
Start a Pod using the PVC that you defined.

Note

In some cases, on-premises storage providers use external provisioners to connect PV provisioning to the backend storage.

Troubleshooting¶

The following issues occur frequently in iSCSI integrations:

The host might not have iSCSI kernel modules loaded. To avoid this, always prepare your MKE worker nodes by installing the iSCSI packages and loading the iSCSI kernel modules prior to installing MKE. If worker nodes are not prepared correctly prior to an MKE installation:
1. Prepare the nodes.
2. Restart the ucp-kubelet container for changes to take effect.
The kernel module dependency list might be out of date. On some Linux distributions, the kernel modules cannot be loaded until the kernel sources are installed and depmod is run. If you experience problems with loading kernel modules, verify that you have run depmod after performing the kernel module installation.
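
To confirm whether a node has the required module loaded, you can inspect the output of lsmod. The following is a minimal sketch, assuming the usual lsmod layout of one header line followed by one module per line. The function name module_loaded is illustrative.

```shell
# Succeed if the module named in $1 appears in lsmod-style output on stdin.
# The first line of lsmod output is a header and is skipped.
module_loaded() {
  awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on a worker node (commented out because it changes host state):
# lsmod | module_loaded iscsi_tcp || sudo modprobe iscsi_tcp
```

If the module still fails to load after the packages are installed, run depmod and retry before restarting the ucp-kubelet container.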

Example¶

Download and configure the client bundle.

Create a YAML file with the following StorageClass object:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: iscsi-targetd-vg-targetd
provisioner: iscsi-targetd
parameters:
  targetPortal: 172.31.8.88
  iqn: iqn.2019-01.org.iscsi.docker:targetd
  iscsiInterface: default
  volumeGroup: vg-targetd
  initiators: iqn.2019-01.com.example:node1, iqn.2019-01.com.example:node2
  chapAuthDiscovery: "false"
  chapAuthSession: "false"

Apply the StorageClass YAML file:

kubectl apply -f iscsi-storageclass.yaml

Expected output:

storageclass "iscsi-targetd-vg-targetd" created

Verify the successful creation of the StorageClass object:

kubectl get sc

Example output:

NAME                       PROVISIONER     AGE
iscsi-targetd-vg-targetd   iscsi-targetd   30s

Create a YAML file with the following PersistentVolumeClaim object:
```
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: iscsi-claim
spec:
  storageClassName: "iscsi-targetd-vg-targetd"
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
```
- The valid accessModes values for iSCSI are ReadWriteOnce and ReadOnlyMany.
- Change the value of storage as required.
Note

The scheduler automatically ensures that Pods with the same PVC run on the same worker node.

Apply the PersistentVolumeClaim YAML file:

kubectl apply -f pvc-iscsi.yml

Expected output:

persistentvolumeclaim "iscsi-claim" created

Verify the successful creation of the PersistentVolume and PersistentVolumeClaim and that the PersistentVolumeClaim is bound to the correct volume:

kubectl get pv,pvc

Example output:

NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
iscsi-claim   Bound    pvc-b9560992-24df-11e9-9f09-0242ac11000e   100Mi      RWO            iscsi-targetd-vg-targetd   1m

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS               REASON   AGE
pvc-b9560992-24df-11e9-9f09-0242ac11000e   100Mi      RWO            Delete           Bound    default/iscsi-claim   iscsi-targetd-vg-targetd            36s

Configure Pods to use the PersistentVolumeClaim when binding to the PersistentVolume.

Create a YAML file with the following ReplicationController object. The ReplicationController is used to set up two replica Pods running web servers that use the PersistentVolumeClaim to mount the PersistentVolume onto a mountpath containing shared resources.

apiVersion: v1
kind: ReplicationController
metadata:
  name: rc-iscsi-test
spec:
  replicas: 2
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - name: nginx
          containerPort: 80
        volumeMounts:
        - name: iscsi
          mountPath: "/usr/share/nginx/html"
      volumes:
      - name: iscsi
        persistentVolumeClaim:
          claimName: iscsi-claim

Create the ReplicationController object:

kubectl create -f rc-iscsi.yml

Expected output:

replicationcontroller "rc-iscsi-test" created

Verify successful creation of the Pods:

kubectl get pods

Example output:

NAME                  READY     STATUS    RESTARTS   AGE
rc-iscsi-test-05kdr   1/1       Running   0          9m
rc-iscsi-test-wv4p5   1/1       Running   0          9m

See also

Refer to iSCSI-targetd provisioner for detailed information on an external provisioner implementation that is based on targetd.

Use CSI drivers¶

The Container Storage Interface (CSI) is a specification for container orchestrators to manage block- and file-based volumes for storing data. Storage vendors can each create a single CSI driver that works with multiple container orchestrators. The Kubernetes community maintains sidecar containers that a containerized CSI driver can use to interface with Kubernetes controllers in charge of the following:

Managing persistent volumes
Attaching volumes to nodes, if applicable
Mounting volumes to Pods
Taking snapshots

These sidecar containers include a driver registrar, external attacher, external provisioner, and external snapshotter.

Mirantis supports version 1.0 and later of the CSI specification, and thus MKE can manage storage backends that ship with an associated CSI driver.

Note

Enterprise storage vendors provide CSI drivers, whereas Mirantis does not. Kubernetes does not enforce a specific procedure for how storage providers (SPs) should bundle and distribute CSI drivers.

Review the Kubernetes CSI Developer Documentation for CSI architecture, security, and deployment information.

Prerequisites¶

Select a CSI driver to use with Kubernetes from the following MKE-certified CSI drivers:

Partner name	Kubernetes on MKE
Dell EMC	Certified (CSI)
HPE	Certified (CSI)
NetApp	Certified (Trident - CSI)

Optional. Set the --storage-expt-enabled flag in the MKE install configuration to enable experimental Kubernetes storage features.
Install the CSI plugin from your storage provider.
Apply RBAC for sidecars and the CSI driver.
Perform static or dynamic provisioning of PersistentVolumes (PVs) using the CSI plugin as the provisioner.

CSI driver deployment¶

The simplest way to deploy CSI drivers is for storage vendors to package them in containers. In the context of Kubernetes clusters, containerized CSI drivers typically deploy as StatefulSets for managing the cluster-wide logic and DaemonSets for managing node-specific logic.

Note the following considerations:

You can deploy multiple CSI drivers for different storage backends in the same cluster.
To avoid leaking credentials to user processes, Kubernetes recommends running CSI Controllers on master nodes and the CSI node plugin on worker nodes.
MKE allows running privileged Pods, which is required to run CSI drivers.
The Docker daemon on the hosts must be configured with shared mount propagation for CSI. This allows the sharing of volumes mounted by one container into other containers in the same Pod or to other Pods on the same node. By default, MKE enables bidirectional mount propagation in the Docker daemon.

Refer to Kubernetes CSI documentation for more information.

Role-based access control (RBAC)¶

Pods that contain CSI plugins must have the appropriate permissions to access and manipulate Kubernetes objects.

Using YAML files that the storage vendor provides, you can configure the cluster roles and bindings for service accounts associated with CSI driver Pods. MKE administrators must apply those YAML files to properly configure RBAC for the service accounts associated with CSI Pods.

Usage¶

The dynamic provisioning of persistent storage depends on the capabilities of the CSI driver and of the underlying storage backend. Review the CSI driver provider documentation for the available parameters. Refer to CSI HostPath Driver for a generic CSI plugin example.

You can access the following CSI deployment information in the MKE web UI:

Persistent storage objects: In the MKE web UI left-side navigation panel, navigate to Kubernetes > Storage for information on persistent storage objects such as StorageClass, PersistentVolumeClaim, and PersistentVolume.
Volumes: In the MKE web UI left-side navigation panel, navigate to Kubernetes > Pods, select a Pod, and scroll to Volumes to view the Pod volume information.

GPU support for Kubernetes workloads¶

MKE provides graphics processing unit (GPU) support for Kubernetes workloads that run on Linux worker nodes. This topic describes how to configure your system to use and deploy NVIDIA GPUs.

Install the GPU drivers¶

GPU support requires that you install GPU drivers, which you can do either prior to or after installing MKE. The following procedure installs the NVIDIA driver on your Linux host using a runfile.

Note

This procedure describes how to manually install the GPU drivers. However, Mirantis recommends that you use a pre-existing automation system to automate the installation and patching of the drivers, along with the kernel and other host software.

Enable the NVIDIA GPU device plugin by setting nvidia_device_plugin to true in the MKE configuration file.
Verify that your system supports NVIDIA GPU:
```
lspci | grep -i nvidia
```
Verify that your GPU is a supported NVIDIA GPU Product.
Install all the dependencies listed in the NVIDIA Minimum Requirements.
Verify that your system is up to date and that you are running the latest kernel version.

Install the following packages:

Ubuntu:

sudo apt-get install -y gcc make curl linux-headers-$(uname -r)

RHEL:

sudo yum install -y kernel-devel-$(uname -r) \
kernel-headers-$(uname -r) gcc make curl elfutils-libelf-devel

Load the i2c_core and ipmi_msghandler kernel modules:
```
sudo modprobe -a i2c_core ipmi_msghandler
```

Persist the change across reboots:

echo -e "i2c_core\nipmi_msghandler" | sudo tee /etc/modules-load.d/nvidia.conf

Create the directory on the host under which the NVIDIA libraries will be located, and register it with the dynamic linker:

NVIDIA_OPENGL_PREFIX=/opt/kubernetes/nvidia
sudo mkdir -p $NVIDIA_OPENGL_PREFIX/lib
echo "${NVIDIA_OPENGL_PREFIX}/lib" | sudo tee /etc/ld.so.conf.d/nvidia.conf
sudo ldconfig

Install the NVIDIA GPU driver:

NVIDIA_DRIVER_VERSION=<version-number>
curl -LSf https://us.download.nvidia.com/XFree86/Linux-x86_64/${NVIDIA_DRIVER_VERSION}/NVIDIA-Linux-x86_64-${NVIDIA_DRIVER_VERSION}.run -o nvidia.run
sudo sh nvidia.run --opengl-prefix="${NVIDIA_OPENGL_PREFIX}"

Set <version-number> to the NVIDIA driver version of your choice.

Load the NVIDIA Unified Memory kernel module and create device files for the module on startup:

sudo tee /etc/systemd/system/nvidia-modprobe.service << END
[Unit]
Description=NVIDIA modprobe

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/nvidia-modprobe -c0 -u

[Install]
WantedBy=multi-user.target
END

sudo systemctl enable nvidia-modprobe
sudo systemctl start nvidia-modprobe

Enable the NVIDIA persistence daemon to initialize GPUs and keep them initialized:

sudo tee /etc/systemd/system/nvidia-persistenced.service << END
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target

[Service]
Type=forking
PIDFile=/var/run/nvidia-persistenced/nvidia-persistenced.pid
Restart=always
ExecStart=/usr/bin/nvidia-persistenced --verbose
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

[Install]
WantedBy=multi-user.target
END

sudo systemctl enable nvidia-persistenced
sudo systemctl start nvidia-persistenced

Test the device plugin and review its description:

kubectl describe node <node-name>

Example output:

Capacity:
cpu:                8
ephemeral-storage:  40593612Ki
hugepages-1Gi:      0
hugepages-2Mi:      0
memory:             62872884Ki
nvidia.com/gpu:     1
pods:               110
Allocatable:
cpu:                7750m
ephemeral-storage:  36399308Ki
hugepages-1Gi:      0
hugepages-2Mi:      0
memory:             60775732Ki
nvidia.com/gpu:     1
pods:               110
...
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource        Requests    Limits
--------        --------    ------
cpu             500m (6%)   200m (2%)
memory          150Mi (0%)  440Mi (0%)
nvidia.com/gpu  0           0

Schedule GPU workloads¶

The following example describes how to deploy a simple workload that reports detected NVIDIA CUDA devices.

Create a practice Deployment that requests nvidia.com/gpu in the limits section. The Pod will be scheduled on any node in your system that has an available GPU.

kubectl apply -f- <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    run: gpu-test
  name: gpu-test
spec:
  replicas: 1
  selector:
    matchLabels:
      run: gpu-test
  template:
    metadata:
      labels:
        run: gpu-test
    spec:
      containers:
      - command:
        - sh
        - -c
        - "deviceQuery && sleep infinity"
        image: kshatrix/gpu-example:cuda-10.2
        name: gpu-test
        resources:
          limits:
            nvidia.com/gpu: "1"
EOF

Verify that it is in the Running state:

kubectl get pods | grep "gpu-test"

Example output:

NAME                        READY   STATUS    RESTARTS   AGE
gpu-test-747d746885-hpv74   1/1     Running   0          14m

Review the logs. The presence of Result = PASS indicates a successful deployment:

kubectl logs <name of the pod>

Example output:

deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla V100-SXM2-16GB"
...

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS

Determine the overall GPU capacity of your cluster by inspecting its nodes:

echo $(kubectl get nodes -l com.docker.ucp.gpu.nvidia="true" \
-o jsonpath="0{range .items[*]}+{.status.allocatable['nvidia\.com/gpu']}{end}") | bc
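
The command above relies on bc to evaluate the generated 0+N+N expression. Where bc is not installed, the same sum can be computed with awk. The following is a minimal sketch, and the function name sum_gpu_expr is illustrative.

```shell
# Sum a "0+1+2"-style expression such as the one produced by the jsonpath
# template above. Empty terms (nodes that report no value) count as zero.
sum_gpu_expr() {
  printf '%s\n' "$1" |
    awk -F'[+]' '{ t = 0; for (i = 1; i <= NF; i++) t += $i; print t }'
}

# Usage against a cluster (commented out because it requires kubectl access):
# expr=$(kubectl get nodes -l com.docker.ucp.gpu.nvidia="true" \
#   -o jsonpath="0{range .items[*]}+{.status.allocatable['nvidia\.com/gpu']}{end}")
# total=$(sum_gpu_expr "$expr")
# echo "Total allocatable GPUs: $total"
```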

Set the replica count to N, the total GPU capacity determined in the previous step, to acquire all available GPUs:
```
kubectl scale deployment/gpu-test --replicas N
```

Verify that all of the replicas are scheduled:

kubectl get pods | grep "gpu-test"

Example output:

NAME                        READY   STATUS    RESTARTS   AGE
gpu-test-747d746885-hpv74   1/1     Running   0          12m
gpu-test-747d746885-swrrx   1/1     Running   0          11m

Remove the Deployment and corresponding Pods:
```
kubectl delete deployment gpu-test
```

Troubleshooting¶

If you attempt to add an additional replica to the previous example Deployment, it will result in a FailedScheduling error with the Insufficient nvidia.com/gpu message.

Add an additional replica:

kubectl scale deployment/gpu-test --replicas N+1
kubectl get pods | grep "gpu-test"

Example output:

NAME                        READY   STATUS    RESTARTS   AGE
gpu-test-747d746885-hpv74   1/1     Running   0          14m
gpu-test-747d746885-swrrx   1/1     Running   0          13m
gpu-test-747d746885-zgwfh   0/1     Pending   0          3m26s

Review the status of the Pod that remains in the Pending state:

kubectl describe po gpu-test-747d746885-zgwfh

Example output:

Events:
Type     Reason            Age        From               Message
----     ------            ----       ----               -------
Warning  FailedScheduling  <unknown>  default-scheduler  0/2 nodes are available: 2 Insufficient nvidia.com/gpu.

See also

NVIDIA GPU documentation

NGINX Ingress Controller¶

NGINX Ingress Controller for Kubernetes manages traffic that originates outside your cluster (ingress traffic) using the Kubernetes Ingress rules. You can use the host name, the path, or both to route incoming requests to the appropriate service.

Only administrators can enable and disable NGINX Ingress Controller. Both administrators and regular users with the appropriate roles and permissions can create Ingress resources.

Configuration of the NGINX Ingress Controller is managed by way of cluster_config.ingress_controller option parameters in the MKE configuration file.

Configure NGINX Ingress Controller¶

Use the MKE web UI to enable and configure the NGINX Ingress Controller.

Log in to the MKE web UI as an administrator.
Using the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.
In the Kubernetes tab, toggle the HTTP Ingress Controller for Kubernetes slider to the right.
Under Configure proxy, specify the NGINX Ingress Controller service node ports through which external traffic can enter the cluster.
Verify that the specified node ports are open.

Note

For production applications, it is typical to expose services using the load balancer that your cloud provider offers.

Optional. Create a layer 7 load balancer in front of multiple nodes by toggling the External IP slider to the right and adding a list of external IP addresses to the NGINX Ingress Controller service.
Specify how to scale load balancing by setting the number of replicas.
Specify placement rules and load balancer configurations.
Specify any additional NGINX configuration options you require. Refer to the NGINX documentation for the complete list of configuration options.
Click Save.

Note

The NGINX Ingress Controller implements all Kubernetes Ingress resources with the IngressClassName of nginx-default, regardless of which namespace they are created in.

Note

The Ingress Controller implements any new Kubernetes Ingress resource that is created without IngressClassName.

Create a Kubernetes Ingress¶

A Kubernetes Ingress specifies a set of rules that route requests that match a particular <domain>/<path> to a given application. Ingresses are scoped to a single namespace and thus can route requests only to the applications inside that namespace.

Log in to the MKE web UI.
Navigate to Kubernetes > Ingresses and click Create.
In the Create Ingress Object page, enter an ingress name and the following rule details:
- Host (optional)
- Path
- Path type
- Service name
- Port number
- Port name
Generate the configuration file by clicking Generate YAML.
Select a namespace using the Namespace dropdown.
Click Create.

Example Kubernetes Ingress configuration file¶

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
  name: echo
spec:
  ingressClassName: nginx-default
  rules:
    - host: example.com
      http:
        paths:
          - path: /echo
            pathType: Exact
            backend:
              service:
                name: echo-service
                port:
                  number: 80

See also

Kubernetes Ingress in the Kubernetes documentation
NGINX Ingress Controller official documentation

Configure a canary deployment¶

Canary deployments release applications incrementally to a subset of users, which allows for the gradual deployment of new application versions without any downtime.

NGINX Ingress Controller supports traffic-splitting policies based on header, cookie, and weight. Whereas header- and cookie-based policies serve to provide a new service version to a subset of users, weight-based policies serve to divert a percentage of traffic to a new service version.

NGINX Ingress Controller uses the following annotations to enable canary deployments:

nginx.ingress.kubernetes.io/canary-by-header
nginx.ingress.kubernetes.io/canary-by-header-value
nginx.ingress.kubernetes.io/canary-by-header-pattern
nginx.ingress.kubernetes.io/canary-by-cookie
nginx.ingress.kubernetes.io/canary-weight

Canary rules are evaluated in the following order:

canary-by-header
canary-by-cookie
canary-weight

Canary deployments require that you create two ingresses: one for regular traffic and one for alternative traffic. Be aware that you can apply only one canary ingress per Ingress rule.

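You enable canary routing by setting the nginx.ingress.kubernetes.io/canary annotation to "true" in the canary Ingress resource, and you select a traffic-splitting policy by adding the associated annotation, as in the following example:

nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-by-header: "x-region"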

Refer to Ingress Canary Annotations in the NGINX Ingress Controller documentation for more information.

Example canary setup¶

Deploy two services, echo-v1 and echo-v2, using either the MKE web UI or kubectl.

To deploy echo-v1:¶

apiVersion: v1
kind: Service
metadata:
  name: echo-v1
spec:
  type: ClusterIP
  ports:
    - port: 80
      protocol: TCP
      name: http
  selector:
    app: echo
    version: v1

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
      version: v1
  template:
    metadata:
      labels:
        app: echo
        version: v1
    spec:
      containers:
        - name: echo
          image: "docker.io/hashicorp/http-echo"
          args:
            - -listen=:80
            - --text="echo-v1"
          ports:
            - name: http
              protocol: TCP
              containerPort: 80

To deploy echo-v2:¶

apiVersion: v1
kind: Service
metadata:
  name: echo-v2
spec:
  type: ClusterIP
  ports:
    - port: 80
      protocol: TCP
      name: http
  selector:
    app: echo
    version: v2

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
      version: v2
  template:
    metadata:
      labels:
        app: echo
        version: v2
    spec:
      containers:
        - name: echo
          image: "docker.io/hashicorp/http-echo"
          args:
            - -listen=:80
            - --text="echo-v2"
          ports:
            - name: http
              protocol: TCP
              containerPort: 80

Create an Ingress to route the traffic for the regular service:

Example Ingress¶

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
  name: ingress-echo
spec:
  ingressClassName: nginx-default
  rules:
    - host: canary.example.com
      http:
        paths:
          - path: /echo
            pathType: Exact
            backend:
              service:
                name: echo-v1
                port:
                  number: 80

Verify that traffic is successfully routed:

curl -H "Host: canary.example.com" http://<IP_ADDRESS>:<NODE_PORT>/echo

Expected output:

echo-v1

Canary deployment use cases¶

To provide a subset of users with a new service version using a request header:

Create a canary ingress that routes traffic to the echo-v2 service using the request header x-region: us-east:

Header-based policy¶

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "x-region"
    nginx.ingress.kubernetes.io/canary-by-header-value: "us-east"
  name: ingress-echo-canary
spec:
  ingressClassName: nginx-default
  rules:
    - host: canary.example.com
      http:
        paths:
          - path: /echo
            pathType: Exact
            backend:
              service:
                name: echo-v2
                port:
                  number: 80

Verify that traffic is properly routed:

curl -H "Host: canary.example.com" -H "x-region: us-east" \
http://<IP_ADDRESS>:<NODE_PORT>/echo
curl -H "Host: canary.example.com" -H "x-region: us-west" \
http://<IP_ADDRESS>:<NODE_PORT>/echo
curl -H "Host: canary.example.com" \
http://<IP_ADDRESS>:<NODE_PORT>/echo

Expected output:

echo-v2
echo-v1
echo-v1

To provide a subset of users with a new service version using a cookie:

Create a canary ingress that routes traffic to the echo-v2 service using a cookie:
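
A minimal sketch of such a canary Ingress follows, assuming that it mirrors the header-based policy and that the cookie is named my_cookie, as in the verification commands in the next step. The file name ingress-echo-canary-cookie.yml is hypothetical. With the canary-by-cookie annotation, a cookie value of always routes the request to the canary, and a value of never keeps it on the regular service.

```shell
# Write the cookie-based canary Ingress manifest to a file.
# The file name and the cookie name (my_cookie) are assumptions.
cat > ingress-echo-canary-cookie.yml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-cookie: "my_cookie"
  name: ingress-echo-canary
spec:
  ingressClassName: nginx-default
  rules:
    - host: canary.example.com
      http:
        paths:
          - path: /echo
            pathType: Exact
            backend:
              service:
                name: echo-v2
                port:
                  number: 80
EOF

# Apply the manifest once the client bundle is configured:
# kubectl apply -f ingress-echo-canary-cookie.yml
```

Because the sketch reuses the name ingress-echo-canary, applying it replaces the header-based canary ingress rather than adding a second one.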

Verify that traffic is properly routed:

curl -s -H "Host: canary.example.com" --cookie "my_cookie=always" \
http://<IP_ADDRESS>:<NODE_PORT>/echo
curl -s -H "Host: canary.example.com" --cookie "other_cookie=always" \
http://<IP_ADDRESS>:<NODE_PORT>/echo
curl -s -H "Host: canary.example.com" \
http://<IP_ADDRESS>:<NODE_PORT>/echo

Expected output:

echo-v2
echo-v1
echo-v1

To route a segment of traffic to a new service version:

Create a canary ingress that routes 20% of traffic to the echo-v2 service using the nginx.ingress.kubernetes.io/canary-weight annotation:

Weight-based policy¶

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "20"
  name: ingress-echo-canary
spec:
  ingressClassName: nginx-default
  rules:
    - host: canary.example.com
      http:
        paths:
          - path: /echo
            pathType: Exact
            backend:
              service:
                name: echo-v2
                port:
                  number: 80

Verify that traffic is properly routed:

for i in {1..10}; do curl -H "Host: canary.example.com" \
http://<IP_ADDRESS>:<NODE_PORT>/echo; done

Example output:

"echo-v1"
"echo-v2"
"echo-v2"
"echo-v1"
"echo-v1"
"echo-v1"
"echo-v1"
"echo-v1"
"echo-v1"
"echo-v1"
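
Because weight-based routing is probabilistic, a sample of ten requests will not always split exactly 80/20. Tallying a larger sample gives a clearer picture. The following is a minimal sketch, assuming that each response is a single line with no spaces. The function name count_responses is illustrative.

```shell
# Read one response per line on stdin and print "<response> <count>" pairs,
# sorted by response text.
count_responses() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage against the cluster (commented out because it requires a live
# ingress; replace the placeholders with your node address and port):
# for i in $(seq 1 100); do
#   curl -s -H "Host: canary.example.com" "http://<IP_ADDRESS>:<NODE_PORT>/echo"
# done | count_responses
```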

Configure a sticky session¶

Sticky sessions enable users who participate in split testing to consistently see a particular feature. Adding sticky sessions to the initial request forces NGINX Ingress Controller to route follow-up requests to the same Pod.

Enable the sticky session in the Kubernetes Ingress resource:
```
nginx.ingress.kubernetes.io/affinity: "cookie"
```

Specify the name of the required cookie (default: INGRESSCOOKIE):

nginx.ingress.kubernetes.io/session-cookie-name: "<cookie-name>"

Specify the time before the cookie expires (in seconds):

nginx.ingress.kubernetes.io/session-cookie-max-age: "<cookie-duration>"

The following is an example of a Kubernetes Ingress configuration file with a sticky session enabled:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sticky-session-test
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"

spec:
  rules:
  - host: stickyingress.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: http-svc
            port:
              number: 80

Note

NGINX Ingress Controller only supports cookie-based sticky sessions.

See also

Sticky sessions in the NGINX Ingress Controller documentation.

Monitor an MKE cluster¶

You can monitor the health of your MKE cluster using the MKE web UI, the CLI, and the _ping endpoint. This topic describes how to monitor your cluster health, vulnerability counts, and disk usage.

For those running MSR in addition to MKE, MKE displays image vulnerability scanning count data obtained from MSR for containers, Swarm services, Pods, and images. This feature requires that you run MSR 2.6.x or later and enable MKE single sign-on.

The MKE web UI only displays the disk usage metrics, including space availability, for the /var/lib/docker part of the filesystem. Monitoring the total space available on each filesystem of an MKE worker or manager node requires that you deploy a third-party operating system-monitoring solution.

Monitor with the MKE web UI¶

Log in to the MKE web UI.
From the left-side navigation panel, navigate to the Dashboard page.

Cluster health-related warnings that require your immediate attention display on the cluster dashboard. MKE administrators are likely to see more of these warnings than regular users.
Navigate to Shared Resources > Nodes to inspect the health of the nodes that MKE manages. To read the node health status, hover over the colored indicator.
Click a particular node to learn more about its health.
Click on the vertical ellipsis in the top right corner and select Tasks.
From the left-side navigation panel, click Agent Logs to examine log entries.

Monitor with the CLI¶

Download and configure the client bundle.
Examine the health of the nodes in your cluster:
```
docker node ls
```
Status messages that begin with [Pending] indicate a transient state that is expected to resolve itself and return to a healthy state.

Automate the monitoring process¶

Automate the MKE cluster monitoring process by using the https://<mke-manager-url>/_ping endpoint to evaluate the health of a single manager node. The MKE manager evaluates whether its internal components are functioning properly, and returns one of the following HTTP codes:

200 - all components are healthy
500 - one or more components are not healthy

Using an administrator client certificate as a TLS client certificate for the _ping endpoint returns a detailed error message if any component is unhealthy.

Do not access the _ping endpoint through a load balancer, as this method does not allow you to determine which manager node is not healthy. Instead, connect directly to the URL of a manager node. Use GET to ping the endpoint instead of HEAD, as HEAD returns a 404 error code.
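
A monitoring script can map the status code returned by the endpoint to a health verdict. The following is a minimal sketch. The function name classify_ping_status and the MKE_MANAGER variable are illustrative, and the exit codes 0, 1, and 2 are a convention chosen for this example rather than anything MKE defines.

```shell
# Map the HTTP status code returned by the _ping endpoint to a verdict.
# Exit status: 0 = healthy, 1 = unhealthy, 2 = unexpected code.
classify_ping_status() {
  case "$1" in
    200) echo "healthy"; return 0 ;;
    500) echo "unhealthy"; return 1 ;;
    *)   echo "unknown"; return 2 ;;
  esac
}

# Usage against a single manager node (commented out because it requires
# network access; MKE_MANAGER is a hypothetical variable holding the address
# of one manager, and -k skips TLS verification, so use it only for testing):
# code=$(curl -sk -o /dev/null -w '%{http_code}' "https://${MKE_MANAGER}/_ping")
# classify_ping_status "$code"
```

curl issues a GET request by default, which matches the guidance above. Run the check once per manager node so that an unhealthy verdict identifies the affected node.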

Troubleshoot an MKE cluster¶

Troubleshooting is a necessary part of cluster maintenance. This section provides you with the tools you need to diagnose and resolve the problems you are likely to encounter in the course of operating your cluster.

Troubleshoot MKE node states¶

Nodes enter a variety of states in the course of their lifecycle, including transitional states such as when a node joins a cluster and when a node is promoted or demoted. MKE reports the steps of the transition process as they occur in both the ucp-controller logs and in the MKE web UI.

To view transitional node states in the MKE web UI:

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes. The transitional node state displays in the DETAILS column for each node.
Optional. Click the required node. The transitional node state displays in the Overview tab under Cluster Message.

The following table includes all the node states as they are reported by MKE, along with their description and expected duration:

Message	Description	Expected duration
Completing node registration	The node is undergoing the registration process and does not yet appear in the KV node inventory. This is expected to occur when a node first joins the MKE swarm.	5 - 30 seconds
heartbeat failure	The node has not contacted any swarm managers in the last 10 seconds. Verify the swarm state using docker info on the node. `inactive` indicates that the node has been removed from the swarm with docker swarm leave. `pending` indicates dockerd has been attempting to contact a manager since dockerd started on the node. Confirm that the network security policy allows TCP port 2377 from the node to the managers. `error` indicates an error prevented Swarm from starting on the node. Verify the docker daemon logs on the node.	Until resolved
Node is being reconfigured	The `ucp-reconcile` container is converging the current state of the node to the desired state. Depending on which state the node is currently in, this process can involve issuing certificates, pulling missing images, or starting containers.	1 - 60 seconds
Reconfiguration pending	The node is expected to be a manager but the `ucp-reconcile` container has not yet been started.	1 - 10 seconds
The `ucp-agent` task is `state`	The `ucp-agent` task on the node is not yet in a running state. This message is expected when the configuration has been updated or when a node first joins the MKE cluster. This step may take longer than expected if the MKE images need to be pulled from Docker Hub on the affected node.	1 - 10 seconds
Unable to determine node state	The `ucp-reconcile` container on the target node has just begun running and its state is not yet evident.	1 - 10 seconds
Unhealthy MKE Controller: node is unreachable	Other manager nodes in the cluster have not received a heartbeat message from the affected node within a predetermined timeout period. This usually indicates that there is either a temporary or permanent interruption in the network link to that manager node. Ensure that the underlying networking infrastructure is operational, and contact support if the symptom persists.	Until resolved
Unhealthy MKE Controller: unable to reach controller	The controller that the node is currently communicating with is not reachable within a predetermined timeout. Refresh the node listing to determine whether the symptom persists. The symptom appearing intermittently can indicate latency spikes between manager nodes, which can lead to temporary loss in the availability of MKE. Ensure the underlying networking infrastructure is operational and contact support if the symptom persists.	Until resolved
Unhealthy MKE Controller: Docker Swarm Cluster: Local node <ip> has status `Pending`	The MCR Engine ID is not unique in the swarm. When a node first joins the cluster, it is added to the node inventory and discovered as `Pending` by Swarm. MCR is considered validated if a `ucp-swarm-manager` container can connect to MCR through TLS and its Engine ID is unique in the swarm. If you see this issue repeatedly, make sure that MCR does not have duplicate IDs. Use docker info to view the Engine ID. To refresh the ID, remove the `/etc/docker/key.json` file and restart the daemon.	Until resolved

Troubleshoot using logs¶

You can troubleshoot your MKE cluster by using the MKE web UI, the CLI, and the support bundle to review the logs of the individual MKE components. You must have administrator privileges to view information about MKE system containers.

Review logs using the MKE web UI¶

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to Shared Resources > Containers. By default, the system containers are hidden.
Click the slider icon and select Show system resources.
Click the required container to view details, which include configurations and logs.

Review logs using the CLI¶

Download and configure the client bundle.

Using the Docker CLI requires that you authenticate using client certificates. Client certificate bundles generated for users without administrator privileges do not permit viewing MKE system container logs.

Review the logs of MKE system containers. Use the -a flag to display system containers, as they are not displayed by default.

docker ps -a

Example output:

CONTAINER ID        IMAGE                                     COMMAND                  CREATED             STATUS                     PORTS                                                                             NAMES
8b77cfa87889        mirantis/ucp-agent:latest             "/bin/ucp-agent re..."   3 hours ago         Exited (0) 3 hours ago                                                                                       ucp-reconcile
b844cf76a7a5        mirantis/ucp-agent:latest             "/bin/ucp-agent agent"   3 hours ago         Up 3 hours                 2376/tcp                                                                          ucp-agent.tahzo3m4xjwhtsn6l3n8oc2bf.xx2hf6dg4zrphgvy2eohtpns9
de5b45871acb        mirantis/ucp-controller:latest        "/bin/controller s..."   3 hours ago         Up 3 hours (unhealthy)     0.0.0.0:443->8080/tcp                                                             ucp-controller
...

Optional. Review the log of a particular MKE container by using the docker logs <mke container ID> command. For example, the following command produces the log for the ucp-controller container listed in the previous step:

docker logs de5b45871acb

Example output:

{"level":"info","license_key":"PUagrRqOXhMH02UgxWYiKtg0kErLY8oLZf1GO4Pw8M6B","msg":"/v1.22/containers/ucp/ucp-controller/json",
"remote_addr":"192.168.10.1:59546","tags":["api","v1.22","get"],"time":"2016-04-25T23:49:27Z","type":"api","username":"dave.lauper"}
{"level":"info","license_key":"PUagrRqOXhMH02UgxWYiKtg0kErLY8oLZf1GO4Pw8M6B","msg":"/v1.22/containers/ucp/ucp-controller/logs",
"remote_addr":"192.168.10.1:59546","tags":["api","v1.22","get"],"time":"2016-04-25T23:49:27Z","type":"api","username":"dave.lauper"}

Review logs using a support bundle¶

With the logs contained in a support bundle you can troubleshoot problems that existed before you changed your MKE configuration. Do not alter your MKE configuration until after you have performed the following steps.

Log in to the MKE web UI.
In the left-side navigation panel, navigate to <username> > Admin Settings > Log & Audit Logs.
Select DEBUG and click Save.

Increasing the MKE log level to DEBUG produces more descriptive logs, making it easier to understand the status of the MKE cluster.

Note

Changing the MKE log level restarts all MKE system components and introduces a small amount of downtime to MKE. Your applications will not be affected by this downtime.
Generate a support bundle.

Each of the following container types reports a different variety of problems in its logs:

Review the ucp-reconcile container logs for problems that occur after a node was added or removed.

Note

It is normal for the ucp-reconcile container to be stopped. This container starts only when the ucp-agent detects that a node needs to transition to a different state. The ucp-reconcile container is responsible for creating and removing containers, issuing certificates, and pulling missing images.
Review the ucp-controller container logs for problems that occur in the normal state of the system.
Review the ucp-auth-api and ucp-auth-store container logs for problems that occur when you are able to visit the MKE web UI but unable to log in.

Review logs using the API¶

Store the IP address for use in the shell:
```
IP=<ip-address>
```

Obtain a temporary access token:

curl -k -X POST -H 'Content-Type: application/json' https://$IP/auth/login --data-binary '
{
  "username": "<username>",
  "password": "<password>"
}
'

Example output:

{"auth_token":"88d790ab-5cc0-4284-b3c6-986272af50b6"}

Store the temporary access token for use in the shell:
```
AUTHTOKEN="88d790ab-5cc0-4284-b3c6-986272af50b6"
```

Determine which containers are present:

curl -k -X GET "https://$IP/containers/json?all=true&size=false" -H  "accept: application/json" -H  "Authorization: Bearer ${AUTHTOKEN}"

Truncated first line of example output:

{"Id":"2cebeb898636ce519ec68fadbad4abe499f2fdebb057eb534bb64ad5bbf7925f", ...}

Store the container ID for use in the shell:

ID=2cebeb898636ce519ec68fadbad4abe499f2fdebb057eb534bb64ad5bbf7925f

Obtain log files associated with the container ID:

curl -k -X GET "https://$IP/containers/$ID/logs?follow=false&stdout=true&stderr=true&since=0&until=0&timestamps=false&tail=all" \
-H "accept: application/json" -H  "Authorization: Bearer ${AUTHTOKEN}" --output output.txt

View log content:
```
cat output.txt
```
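
The token-handling steps above can also be scripted so that the token does not need to be copied by hand. This is a sketch under assumptions: the helper names `extract_auth_token` and `mke_login` are hypothetical, and the `sed` expression assumes the single-line response format shown in the example output.

```shell
# Extract the auth_token value from the JSON body returned by /auth/login.
extract_auth_token() {
  sed -n 's/.*"auth_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Log in and print the token. Arguments: <ip> <username> <password>.
# -k skips TLS verification, as in the commands above.
# Values containing double quotes or backslashes must be JSON-escaped first.
mke_login() {
  curl -k -s -X POST -H 'Content-Type: application/json' \
    "https://$1/auth/login" \
    --data-binary "{\"username\": \"$2\", \"password\": \"$3\"}" | extract_auth_token
}

# Usage:
#   AUTHTOKEN=$(mke_login "$IP" "<username>" "<password>")
```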

Troubleshoot cluster configurations¶

MKE regularly monitors its internal components, attempting to resolve issues as it discovers them.

In most cases where a single MKE component remains in a persistently failed state, removing and rejoining the unhealthy node restores the cluster to a healthy state.

MKE persists configuration data on an etcd key-value store and RethinkDB database that are replicated on all MKE manager nodes. These data stores are for internal use only and should not be used by other applications.

Troubleshoot the etcd key-value store with the HTTP API¶

This example uses curl to make requests to the key-value store REST API and jq to process the responses.

Install curl and jq on an Ubuntu distribution:

sudo apt-get update && sudo apt-get install curl jq

Use a client bundle to authenticate your requests. Download and configure the client bundle if you have not done so already.

Use the REST API to access the cluster configurations. The $DOCKER_HOST and $DOCKER_CERT_PATH environment variables are set when using the client bundle.

export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379"

curl -s \
     --cert ${DOCKER_CERT_PATH}/cert.pem \
     --key ${DOCKER_CERT_PATH}/key.pem \
     --cacert ${DOCKER_CERT_PATH}/ca.pem \
     ${KV_URL}/v2/keys | jq "."
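
To see what the `cut` pipeline in the `KV_URL` assignment does, the following sketch applies the same pipeline to a sample value. The function name and the `tcp://192.0.2.10:443` address are hypothetical; the client bundle sets the real `DOCKER_HOST` value.

```shell
# Derive the key-value store URL from a DOCKER_HOST-style value.
# Field 3 when splitting on "/" is "host:port"; field 1 when splitting
# that on ":" is the host alone.
kv_url_from_docker_host() {
  host=$(echo "$1" | cut -f3 -d/ | cut -f1 -d:)
  echo "https://${host}:12379"
}

kv_url_from_docker_host "tcp://192.0.2.10:443"
# prints: https://192.0.2.10:12379
```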

Troubleshoot the etcd key-value store with the CLI¶

The MKE etcd key-value store runs in containers named ucp-kv. To check the health of the etcd cluster, execute commands inside these containers by using docker exec with etcdctl.

Log in to a manager node using SSH.
Troubleshoot an etcd key-value store:
```
docker exec -it ucp-kv sh -c \
'etcdctl --cluster=true endpoint health -w table 2>/dev/null'
```
If the command fails, an error code is the only output that displays.

Troubleshoot your cluster configuration using the RethinkDB database¶

User and organization data for MKE is stored in a RethinkDB database, which is replicated across all manager nodes in the MKE cluster.

Database replication and failover are typically handled automatically by the MKE configuration management processes. However, you can use the CLI to review the status of the database and manually reconfigure database replication.

Produce a detailed status of all servers and database tables in the RethinkDB cluster:

NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
VERSION=$(docker image ls --format '{{.Tag}}' mirantis/ucp-auth | head -n 1)
docker container run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 db-status

NODE_ADDRESS is the IP address of this Docker Swarm manager node.
VERSION is the most recent version of the mirantis/ucp-auth image.

Expected output:

Server Status: [
  {
    "ID": "ffa9cd5a-3370-4ccd-a21f-d7437c90e900",
    "Name": "ucp_auth_store_192_168_1_25",
    "Network": {
      "CanonicalAddresses": [
        {
          "Host": "192.168.1.25",
          "Port": 12384
        }
      ],
      "TimeConnected": "2017-07-14T17:21:44.198Z"
    }
  }
]
...

Repair the RethinkDB cluster so that the number of replicas it has is equal to the number of manager nodes in the cluster.

NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
NUM_MANAGERS=$(docker node ls --filter role=manager -q | wc -l)
VERSION=$(docker image ls --format '{{.Tag}}' mirantis/ucp-auth | head -n 1)
docker container run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db --num-replicas ${NUM_MANAGERS}

NODE_ADDRESS is the IP address of this Docker Swarm manager node.
NUM_MANAGERS is the current number of manager nodes in the cluster.
VERSION is the most recent version of the mirantis/ucp-auth image.

Example output:

time="2017-07-14T20:46:09Z" level=debug msg="Connecting to db ..."
time="2017-07-14T20:46:09Z" level=debug msg="connecting to DB Addrs: [192.168.1.25:12383]"
time="2017-07-14T20:46:09Z" level=debug msg="Reconfiguring number of replicas to 1"
time="2017-07-14T20:46:09Z" level=debug msg="(00/16) Reconfiguring Table Replication..."
time="2017-07-14T20:46:09Z" level=debug msg="(01/16) Reconfigured Replication of Table \"grant_objects\""
...

Note

If the quorum in any of the RethinkDB tables is lost, run the reconfigure-db command with the --emergency-repair flag.

Disaster recovery¶

Perform disaster recovery procedures first for Swarm and then for MKE, with any required MSR disaster recovery procedures performed last.

Swarm disaster recovery¶

This section describes how to recover after losing quorum and how to force your swarm to rebalance.

Note

Perform the procedures in this section prior to those described in MKE disaster recovery.

Recover from losing the quorum¶

Swarms are resilient to failures and can recover from temporary node failures, such as machine reboots and crashes followed by a restart, and from other transient errors. However, if a swarm loses quorum, it cannot automatically recover. In such cases, tasks on existing worker nodes continue to run, but it is not possible to perform administrative tasks, such as scaling or updating services and joining or removing nodes from the swarm. The best way to recover after losing quorum is to bring the missing manager nodes back online. If that is not possible, follow the instructions below.

In a swarm of N managers, a majority (quorum) of manager nodes must always be available. For example, in a swarm with 5 managers, a minimum of 3 managers must be operational and in communication with each other. In other words, the swarm can tolerate up to (N-1)/2 permanent failures, and beyond that, requests involving swarm management cannot be processed. Such permanent failures include data corruption and hardware failure.
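
The quorum arithmetic above can be tabulated with integer division. This is an illustrative sketch; the function names are hypothetical.

```shell
# Minimum number of managers that must be available: floor(N/2) + 1.
quorum_size() { echo $(( $1 / 2 + 1 )); }

# Number of permanent manager failures tolerated: floor((N-1)/2).
fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 3 5 7; do
  echo "managers=$n quorum=$(quorum_size "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

For five managers this prints a quorum of 3 and 2 tolerated failures, matching the example in the text. An even manager count adds no tolerance: four managers tolerate one failure, the same as three.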

If you lose a quorum of managers, you cannot administer the swarm. If you have lost the quorum and you attempt to perform any management operation on the swarm, MKE issues the following error:

Error response from daemon: rpc error: code = 4 desc = context deadline exceeded

To recover from losing quorum:

If you cannot recover from losing quorum by bringing the failed nodes back online, you must run the docker swarm init command with the --force-new-cluster flag from a manager node. Using this flag removes all managers except the manager from which the command was run.

Run --force-new-cluster from the manager node you want to recover:

docker swarm init --force-new-cluster --advertise-addr node01:2377

Promote nodes to become managers until you have the required number of manager nodes.

The Mirantis Container Runtime where you run the command becomes the manager node of a single-node swarm, which is capable of managing and running services. The manager has all the previous information about services and tasks, worker nodes continue to be part of the swarm, and services continue running. You need to add or re-add manager nodes to achieve your previous task distribution and ensure that you have enough managers to maintain high availability and prevent losing the quorum.

Force the swarm to rebalance¶

You do not usually need to force your swarm to rebalance its tasks. However, when you add a new node to a swarm or a node reconnects to the swarm after a period of unavailability, the swarm does not automatically give a workload to the idle node. This is a design decision; if the swarm periodically shifts tasks to different nodes for the sake of balance, the clients using those tasks would be disrupted. The goal is to avoid disrupting running services for the sake of balance across the swarm. When new tasks start, or when a node with running tasks becomes unavailable, those tasks are given to less busy nodes.

To force the swarm to rebalance its tasks:

Use the docker service update command with the --force or -f flag to force the service to redistribute its tasks across the available worker nodes. This causes the service tasks to restart. Client applications may be disrupted. If configured, your service will use a rolling update.
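
As a sketch of the command described above, the following helper forces an update on each named service. The function name, the `DOCKER` override variable, and the service names in the usage comment are hypothetical; only `docker service update --force` comes from the text.

```shell
# DOCKER can be overridden (for example, DOCKER=echo) to preview the
# commands without running them.
DOCKER="${DOCKER:-docker}"

# Force each named service to redistribute its tasks.
# This restarts the service tasks, so clients may be disrupted.
rebalance_services() {
  for svc in "$@"; do
    $DOCKER service update --force "$svc"
  done
}

# Usage (hypothetical service names):
#   rebalance_services web api
```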

MKE disaster recovery¶

If you cannot recover half or more manager nodes to a healthy state, you have lost quorum and must restore your system using the following procedure.

Note

Perform Swarm disaster recovery procedures prior to those described here.

Recover an MKE cluster from an existing backup¶

If MKE is still installed on the swarm, uninstall MKE:

Note

Skip this step when restoring MKE on new machines.
```
docker container run -it --rm -v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:<mke-version> uninstall-ucp -i
```
Substitute <mke-version> with the MKE version of your backup.

Confirm that you want to uninstall MKE.

Example output:

INFO[0000] Detected UCP instance tgokpm55qcx4s2dsu1ssdga92
INFO[0000] We're about to uninstall UCP from this Swarm cluster
Do you want to proceed with the uninstall? (y/n):

Restore MKE from the existing backup as described in Restore MKE.

If the swarm exists, restore MKE on a manager node. Otherwise, restore MKE on any node, and the swarm will be created automatically during the restore procedure.

Recreate Kubernetes and Swarm objects¶

For Kubernetes, MKE backs up the declarative state of Kubernetes objects in etcd.

For Swarm, it is not possible to take the state and export it to a declarative format, as the objects that are embedded within the Swarm raft logs are not easily transferable to other nodes or clusters.

To recreate swarm-related workloads, you must refer to the original scripts used for deployment. Alternatively, you can recreate the workloads manually from the output of docker inspect commands.

Back up Swarm¶

MKE manager nodes store the swarm state and manager logs in the /var/lib/docker/swarm/ directory. Swarm raft logs contain crucial information for recreating Swarm-specific resources, including services, secrets, configurations, and node cryptographic identity. This data includes the keys used to encrypt the raft logs. You must have these keys to restore the swarm.

Because logs contain node IP address information and are not transferable to other nodes, you must perform a manual backup on each manager node. If you do not back up the raft logs, you cannot verify workloads or Swarm resource provisioning after restoring the cluster.

Note

You can avoid performing a Swarm backup by storing stacks, service definitions, secrets, and network definitions in a source code management or config management tool.

Swarm backup contents¶
Data	Backed up	Description
Raft keys	Yes	Keys used to encrypt communication between Swarm nodes and to encrypt and decrypt raft logs
Membership	Yes	List of the nodes in the cluster
Services	Yes	Stacks and services stored in Swarm mode
Overlay networks	Yes	Overlay networks created on the cluster
Configs	Yes	Configs created in the cluster
Secrets	Yes	Secrets saved in the cluster
Swarm unlock key	No	Secret key needed to unlock a manager after its Docker daemon restarts

To back up Swarm:

Note

All commands that follow must be prefixed with sudo or executed from a superuser shell by first running sudo sh.

If auto-lock is enabled, retrieve your Swarm unlock key. Refer to Rotate the unlock key in the Docker documentation for more information.
Optional. Mirantis recommends that you run at least three manager nodes to achieve high availability, as you must stop the engine of the manager node before performing the backup. A majority of managers must be online for a cluster to be operational. If you have fewer than three managers, the cluster will be unavailable during the backup.

Note

While a manager is shut down, your swarm is more likely to lose quorum if further nodes are lost. A loss of quorum renders the swarm unavailable until quorum is recovered. Quorum is only recovered when more than 50% of the manager nodes become available. If you regularly take down managers when performing backups, consider running a 5-manager swarm, as this will enable you to lose an additional manager while the backup is running, without disrupting services.
Select a manager node other than the leader to avoid a new election inside the cluster:
```
docker node ls -f "role=manager" | tail -n+2 | grep -vi leader
```
Optional. Store the Mirantis Container Runtime (MCR) version in a variable to easily add it to your backup name.
```
ENGINE=$(docker version -f '{{.Server.Version}}')
```
Stop MCR on the manager node before backing up the data, so that no data is changed during the backup:
```
systemctl stop docker
```

Back up the /var/lib/docker/swarm directory:

tar cvzf "/tmp/swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz" /var/lib/docker/swarm/

You can decode the Unix epoch in the file name by typing date -d @timestamp:

date -d @1531166143
Mon Jul  9 19:55:43 UTC 2018

Restart MCR on the manager node:
```
systemctl start docker
```
If auto-lock is enabled, unlock the swarm:
```
docker swarm unlock
```
Repeat the above steps for each manager node.
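
The archive name produced in the tar step ends with the Unix epoch followed by a numeric time zone offset (the `%s%z` format). The following sketch recovers the epoch from such a name so that it can be passed to `date -d`. The function name and the sample file name are hypothetical.

```shell
# Print the Unix epoch embedded in a Swarm backup file name of the form
# swarm-<engine>-<host>-<epoch><+|-><hhmm>.tgz
backup_epoch() {
  printf '%s\n' "$1" | sed -n 's/.*-\([0-9]\{9,\}\)[+-][0-9]\{4\}\.tgz$/\1/p'
}

backup_epoch "/tmp/swarm-20.10.17-node01-1531166143+0000.tgz"
# prints: 1531166143

# Usage with GNU date, as in the step above:
#   date -d "@$(backup_epoch "$file")"
```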

See also

Release Compatibility Matrix

Back up MKE¶

All manager nodes store the same data, so you only need to back up one of them.

Backing up MKE does not require that you pause the reconciler and delete MKE containers, nor does it affect manager node activities and user resources, such as services, containers, and stacks.

Backup considerations¶

Observe the following considerations prior to performing an MKE backup.

Limitations¶

MKE does not support using a backup taken from an earlier version of MKE to restore a cluster that runs a later version of MKE.
MKE does not support performing two backups at the same time. If a backup is attempted while another backup is in progress, or if two backups are scheduled at the same time, a message will display indicating that the second backup failed because another backup is in progress.
MKE may not be able to back up a cluster that has crashed. Mirantis recommends that you perform regular backups to avoid encountering this scenario.
MKE backups do not include Swarm workloads.

MKE backup contents¶

The following backup contents are stored in a .tar file. Backups contain MKE configuration metadata for recreating configurations such as LDAP, SAML, and RBAC.

Data	Backed up	Description
Configurations	Yes	MKE configurations, including Mirantis Container Runtime license, Swarm, and client CAs.
Access control	Yes	Swarm resource permissions for teams, including collections, grants, and roles.
Certificates and keys	Yes	Certificates, public and private keys used for authentication and mutual TLS communication.
Metrics data	Yes	Monitoring data gathered by MKE.
Organizations	Yes	Users, teams, and organizations.
Volumes	Yes	All MKE-named volumes including all MKE component certificates and data.
Overlay networks	No	Swarm mode overlay network definitions, including port information.
Configs, secrets	No	MKE configurations and secrets. Create a Swarm backup to back up these data.
Services	No	MKE stacks and services are stored in Swarm mode or SCM/config management.
`ucp-metrics-data`	No	Metrics server data.
`ucp-node-certs`	No	Certs used to lock down MKE system components.
Routing mesh settings	No	Interlock layer 7 ingress configuration information. A manual backup and restore process is possible and should be performed.

Note

Because Kubernetes stores the state of resources on etcd, a backup of etcd is sufficient for stateless backups.

Kubernetes settings, data, and state¶

MKE backups include all Kubernetes declarative objects, including secrets, which are stored in the ucp-kv etcd database.

Note

You cannot back up Kubernetes volumes and node labels. When you restore MKE, Kubernetes objects and containers are recreated and IP addresses are resolved.

For more information, refer to Backing up an etcd cluster.

Backup procedure¶

You can create an MKE backup using the CLI, the MKE web UI, or the MKE API.

The backup process runs on one manager node.

Create an MKE backup using the CLI¶

The following example demonstrates how to:

Create an MKE manager node backup.
Encrypt the backup by using a passphrase.
Decrypt the backup.
Verify the backup contents.
Store the backup locally on the node at /tmp/mybackup.tar.

To create an MKE backup:

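Run the mirantis/ucp:3.6.16 backup command on a single MKE manager node, including the --file and --include-logs options. This creates a .tar archive with the contents of all volumes used by MKE and writes it to the file named by --file in the host directory mounted at /backup (/tmp in this example). If you omit --file, the archive is streamed to stdout instead. Replace 3.6.16 with the version you are currently running.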
```
docker container run \
  --rm \
  --log-driver none \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  --volume /tmp:/backup \
  mirantis/ucp:3.6.16 backup \
  --file mybackup.tar \
  --passphrase "secret12chars" \
  --include-logs=false
```
If you are running MKE with Security-Enhanced Linux (SELinux) enabled, which is typical for RHEL hosts, include --security-opt label=disable in the docker command, replacing 3.6.16 with the version you are currently running:
```
docker container run \
  --rm \
  --log-driver none \
  --security-opt label=disable \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.6.16 backup \
  --passphrase "secret12chars" > /tmp/mybackup.tar
```
Note

To determine whether SELinux is enabled in MCR, view the host /etc/docker/daemon.json file, and search for the string "selinux-enabled":"true".
You can access backup progress and error reporting in the stderr streams of the running backup container during the backup process. MKE updates progress after each backup step, for example, after volumes are backed up. The progress tracking is not preserved after the backup has completed.
A valid backup file contains at least 27 files, including ./ucp-controller-server-certs/key.pem. Verify that the backup is a valid .tar file by listing its contents, as in the following example:
```
gpg --decrypt /tmp/mybackup.tar | tar --list
```
A log file is also created, in the same directory as the backup file. The passphrase for the backup and log files is the same. Review the contents of the log file by using the following command:
```
gpg --decrypt '/tmp/mybackup.log'
```
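
The validity check described above can be automated by inspecting the archive listing. This is a sketch under assumptions: the function name is hypothetical, and it counts every non-empty line of the listing as an entry, which may include directories as well as files.

```shell
# Read a `tar --list` listing on stdin. Succeed only if it contains the
# controller key and at least 27 entries.
validate_backup_listing() {
  _listing=$(cat)
  _count=$(printf '%s\n' "$_listing" | grep -c . || true)
  printf '%s\n' "$_listing" \
    | grep -qxF -e './ucp-controller-server-certs/key.pem' || return 1
  [ "$_count" -ge 27 ]
}

# Usage:
#   gpg --decrypt /tmp/mybackup.tar | tar --list | validate_backup_listing \
#     && echo "backup looks valid"
```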

Create a backup using the MKE web UI¶

Log in to the MKE web UI.
In the left-side navigation panel, navigate to Admin Settings.
Click Backup.
Initiate an immediate backup by clicking Backup Now.

The MKE web UI also provides the following options:

Display the status of a running backup
Display backup history
Display backup outcome

Create, list, and retrieve backups using the MKE API¶

The MKE API provides three endpoints for managing MKE backups:

/api/ucp/backup
/api/ucp/backups
/api/ucp/backup/{backup_id}

You must be an MKE administrator to access these API endpoints.

To create a backup using the MKE API:

You can create a backup with the POST: /api/ucp/backup endpoint. This JSON endpoint accepts the following arguments:

Field name	JSON data type	Description
`passphrase`	String	Encryption passphrase
`noPassphrase`	Boolean	Sets whether a passphrase is used
`fileName`	String	Backup file name
`includeLogs`	Boolean	Sets whether to include a log file
`hostPath`	String	File system location

The request returns one of the following HTTP status codes, and if successful, a backup ID.

200: Success
500: Internal server error
400: Malformed request (payload fails validation)

Example API call:

curl -sk -H "Authorization: Bearer $AUTHTOKEN"  https://$UCP_HOSTNAME/api/ucp/backup \
  -X POST \
  -H "Content-Type: application/json" \
  --data  '{"passphrase": "secret12chars", "includeLogs": true, "fileName": "backup1.tar", "logFileName": "backup1.log", "hostPath": "/tmp"}'

$AUTHTOKEN is your authentication bearer token if using auth token identification.
$UCP_HOSTNAME is your MKE hostname.

Example output:

200 OK

To list all backups using the MKE API:

You can view all existing backups with the GET: /api/ucp/backups endpoint. This request does not expect a payload and returns a list of backups, each as a JSON object following the schema detailed in Backup schema.

The request returns one of the following HTTP status codes, and if successful, a list of existing backups:

200: Success
500: Internal server error

Example API call:

curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/backups

Example output:

[
  {
    "id": "0d0525dd-948a-41b4-9f25-c6b4cd6d9fe4",
    "encrypted": true,
    "fileName": "backup2.tar",
    "logFileName": "backup2.log",
    "backupPath": "/secure-location",
    "backupState": "SUCCESS",
    "nodeLocation": "ucp-node-ubuntu-0",
    "shortError": "",
    "created_at": "2019-04-10T21:55:53.775Z",
    "completed_at": "2019-04-10T21:56:01.184Z"
  },
  {
    "id": "2cf210df-d641-44ca-bc21-bda757c08d18",
    "encrypted": true,
    "fileName": "backup1.tar",
    "logFileName": "backup1.log",
    "backupPath": "/secure-location",
    "backupState": "IN_PROGRESS",
    "nodeLocation": "ucp-node-ubuntu-0",
    "shortError": "",
    "created_at": "2019-04-10T01:23:59.404Z",
    "completed_at": "0001-01-01T00:00:00Z"
  }
]

To retrieve backup details using the MKE API:

You can retrieve details for a specific backup using the GET: /api/ucp/backup/{backup_id} endpoint, where {backup_id} is the ID of an existing backup. This request returns the backup, if it exists, as a JSON object following the schema detailed in Backup schema.

The request returns one of the following HTTP status codes, and if successful, the backup for the specified ID:

200: Success
404: Backup not found for the given {backup_id}
500: Internal server error
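
When scripting against these endpoints, it can be useful to pull the `backupState` field out of a response. The following is a minimal sketch: the function name and the `BACKUP_ID` variable are hypothetical, and the `sed` expression assumes the field layout shown in the example output. It prints only the first state it finds.

```shell
# Print the first backupState value found in a JSON response on stdin.
backup_state() {
  sed -n 's/.*"backupState"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (BACKUP_ID is the ID of an existing backup):
#   curl -sk -H "Authorization: Bearer $AUTHTOKEN" \
#     "https://$UCP_HOSTNAME/api/ucp/backup/$BACKUP_ID" | backup_state
```

A script can poll this value until it is no longer `IN_PROGRESS` before copying the backup file elsewhere.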

Specify a backup file¶

To avoid directly managing backup files, you can specify a file name and host directory on a secure and configured storage backend, such as NFS or another networked file system. The file system location is the backup folder on the manager node file system. This location must be writable by the nobody user, which you ensure by changing the directory ownership to nobody. This operation requires administrator permissions to the manager node, and only needs to be run once for a given file system location.

To change the file system directory ownership to nobody:

sudo chown nobody:nogroup /path/to/folder

Caution

Specify a different name for each backup file. Otherwise, the existing backup file with the same name is overwritten.
Also specify a location that is mounted on a fault-tolerant file system, such as NFS, rather than the node local disk. Otherwise, it is important to regularly move backups from the manager node local disk to ensure adequate space for ongoing backups.

Backup schema¶

The following table describes the backup schema returned by the GET: /api/ucp/backups and GET: /api/ucp/backup/{backup_id} endpoints:

Field name	JSON data type	Description
`id`	String	Unique ID
`encrypted`	Boolean	Sets whether to encrypt with a passphrase
`filename`	String	Backup file name if backing up to a file, empty otherwise
`logFilename`	String	Backup log file name if saving backup logs, empty otherwise
`backupPath`	String	Host path where backup is located
`backupState`	String	Current state of the backup (`IN_PROGRESS`, `SUCCESS`, `FAILED`)
`nodeLocation`	String	Node on which the backup was taken
`shortError`	String	Empty unless `backupState` is set to `FAILED`
`created_at`	String	Time of backup creation
`completed_at`	String	Time of backup completion

See also

Release Compatibility Matrix

Restore Swarm¶

Prior to restoring Swarm, verify that you meet the following prerequisites:

The node you select for the restore must use the same IP address as the node from which you made the backup, as the command to force the new cluster does not reset the IP address in the swarm data.
The node you select for the restore must run the same version of Mirantis Container Runtime (MCR) as the node from which you made the backup.
You must have access to the list of manager node IP addresses located in state.json inside the zip file.
If auto-lock was enabled on the backed-up swarm, you must have access to the unlock key.

To perform the Swarm restore:

Caution

You must perform the Swarm restore on only one manager node in your cluster, and that node must be the same manager from which you made the backup.

Shut down MCR on the manager node that you have selected for your restore:
```
systemctl stop docker
```
On the new swarm, remove the contents of the /var/lib/docker/swarm directory. Create this directory if it does not exist.
Restore the /var/lib/docker/swarm directory with the contents of the backup:
```
tar -xvf <PATH_TO_TARBALL> -C /
```
Set <PATH_TO_TARBALL> to the location path where you saved the tarball during backup. If you are following the procedure in backup-swarm, the tarball will be in a /tmp/ folder with a unique name based on the engine version and timestamp: swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz.

Note

The new node uses the same encryption key for on-disk storage as the old one. It is not possible to change the on-disk storage encryption keys. For a swarm that has auto-lock enabled, the unlock key is the same as on the old swarm and is required to restore the swarm.
Start Docker on the new node:
```
systemctl start docker
```
Unlock the swarm, if necessary:
```
docker swarm unlock
```
Verify that the state of the swarm is as expected, including application-specific tests or checking the output of docker service ls to verify that all expected services are present.
If you use auto-lock, rotate the unlock key:
```
docker swarm unlock-key --rotate
```
Add the required manager and worker nodes to the new swarm.
Reinstate your previous backup process on the new swarm.
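
The step that empties /var/lib/docker/swarm has no accompanying command in the procedure above. The following is a minimal sketch of one way to do it; the function name is hypothetical, and it assumes that you run it with root privileges while MCR is stopped. Passing a different path lets you rehearse the step on a scratch directory first.

```shell
# Empty the swarm state directory, creating it first if it is missing.
# Defaults to the path named in the procedure.
reset_swarm_dir() {
  swarm_dir="${1:-/var/lib/docker/swarm}"
  mkdir -p "$swarm_dir" || return 1
  # Delete everything inside the directory, including hidden files,
  # but keep the directory itself.
  find "$swarm_dir" -mindepth 1 -delete
}

# Example call (commented out because it is destructive):
# reset_swarm_dir /var/lib/docker/swarm
```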

See also

Release Compatibility Matrix
Administer and maintain a swarm of Docker Engines in the Docker documentation

Restore MKE¶

MKE supports the following three approaches to performing a restore:

Run the restore on the machines from which the backup originated or on new machines. You can use the same swarm from which the backup originated or a new swarm.
Run the restore on a manager node of an existing swarm that does not have MKE installed. In this case, the MKE restore uses the existing swarm and runs in place of an MKE install.
Run the restore on an instance of MCR that is not included in a swarm. The restore performs docker swarm init just as the install operation would do. This creates a new swarm and restores MKE thereon.

Note

During the MKE restore operation, Kubernetes declarative objects and containers are recreated and IP addresses are resolved.

For more information, refer to Restoring an etcd cluster.

Prerequisites¶

Consider the following requirements prior to restoring MKE:

To restore an existing MKE installation from a backup, you must uninstall MKE from the swarm by using the uninstall-ucp command.
Restore operations must run using the same major and minor MKE version and mirantis/ucp image version as the backed-up cluster.
If you restore MKE using a different swarm than the one where the backed-up MKE was deployed, MKE will use new TLS certificates. In this case, you must download new client bundles, as the existing ones will no longer be operational.
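
The version requirement can be checked before you start a restore. The sketch below covers only the major and minor comparison and is not a substitute for the check that the restore itself performs; the function names are hypothetical, and you must supply the backed-up cluster version yourself.

```shell
# Reduce a version such as 3.6.16 to its major.minor part (3.6).
major_minor() {
  echo "$1" | cut -d. -f1,2
}

# Succeed only when both versions share the same major.minor part.
same_major_minor() {
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Example: compare the restore image tag with the backed-up cluster version.
if same_major_minor 3.6.16 3.6.2; then
  echo "major and minor versions match"
else
  echo "major or minor version differs" >&2
fi
```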

Restore MKE¶

Note

At the start of the restore operation, the script identifies the MKE version defined in the backup and performs one of the following actions:

The MKE restore fails if it runs using an image that does not match the MKE version from the backup. To override this in, for example, a testing scenario, use the --force flag.
MKE provides instructions on how to run the restore process for the MKE version in use.

Note

If SELinux is enabled, you must temporarily disable it prior to running the restore command. You can then reenable SELinux once the command has completed.

Volumes are placed onto the host where you run the MKE restore command.

Restore MKE from an existing backup file. The following example restores MKE from a backup file located at /tmp/backup.tar:
```
docker container run \
--rm \
--interactive \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock  \
mirantis/ucp:3.6.16 restore \
--san=${APISERVER_LB} < /tmp/backup.tar
```
- Replace mirantis/ucp:3.6.16 with the MKE version in your backup file.
- For the --san flag, assign the cluster API server IP address without the port number to the APISERVER_LB variable. For example, for https://172.16.243.2:443 use 172.16.243.2. For more information on the --san flag, refer to MKE CLI restore options.
If the backup file is encrypted with a passphrase, include the --passphrase flag in the restore command:
```
docker container run \
--rm \
--interactive \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock  \
mirantis/ucp:3.6.16 restore \
--san=${APISERVER_LB} \
--passphrase "secret" < /tmp/backup.tar
```
Alternatively, you can invoke the restore command in interactive mode by mounting the backup file to the container rather than streaming it through stdin:
```
docker container run \
--rm \
--interactive \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
-v /tmp/backup.tar:/config/backup.tar \
mirantis/ucp:3.6.16 restore -i
```
Regenerate certs. The current certs volume containing cluster-specific information, such as SANs, is invalid on new clusters with different IPs. For volumes that are not backed up, such as ucp-node-certs, the restore regenerates certs. For certs that are backed up, such as ucp-controller-server-certs, the restore does not perform a regeneration and you must correct those certs when the restore completes.
After you successfully restore MKE, add new managers and workers just as you would after a fresh installation.
For restore operations, review the output of the restore command.

Verify the MKE restore¶

Run the following command:
```
curl -s -k https://localhost/_ping
```
Log in to the MKE web UI.
In the left-side navigation panel, navigate to Shared Resources > Nodes.
Verify that all swarm manager nodes are healthy:
- Monitor all swarm managers for at least 15 minutes to ensure no degradation.
- Verify that no containers on swarm manager nodes are in an unhealthy state.
- Verify that no swarm nodes are running containers with the old version, except for Kubernetes Pods that use the ucp-pause image.
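
Because the manager can take some time to respond after a restore, the _ping check can be repeated until it succeeds. The following is a sketch of such a loop; the function name, the default attempt count, and the default delay are hypothetical choices rather than documented values.

```shell
# Poll the _ping endpoint until it answers successfully or the attempts run out.
# Arguments: URL, number of attempts, seconds to wait between attempts.
wait_for_ping() {
  ping_url="${1:-https://localhost/_ping}"
  attempts="${2:-30}"
  delay="${3:-10}"
  n=1
  while [ "$n" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -k accepts a self-signed certificate.
    if curl -sfk -o /dev/null "$ping_url"; then
      echo "healthy after ${n} attempt(s)"
      return 0
    fi
    # Wait before the next attempt, but not after the last one.
    [ "$n" -lt "$attempts" ] && sleep "$delay"
    n=$((n + 1))
  done
  echo "no healthy response after ${attempts} attempt(s)" >&2
  return 1
}

# Example call (commented out because it needs a running manager node):
# wait_for_ping https://localhost/_ping 30 10
```

A successful ping covers only the first step; the node and container checks in the list above remain manual.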

Customer feedback¶

You can submit feedback on MKE to Mirantis either by rating your experience or through a Jira ticket.

To rate your MKE experience:

Log in to the MKE web UI.
Click Give feedback at the bottom of the screen.
Rate your MKE experience from one to five stars, and add any additional comments in the provided field.
Click Send feedback.

To offer more detailed feedback:

Log in to the MKE web UI.
Click Give feedback at the bottom of the screen.
Click create a ticket in the 5-star review dialog to open a Jira feedback collector.
Fill in the Jira feedback collector fields and add attachments as necessary.
Click Submit.

Launchpad¶

Mirantis’s Launchpad CLI Tool (Launchpad) is a command-line deployment and lifecycle-management tool that runs on virtually any Linux, Mac, or Windows machine. It simplifies and automates MKE, MSR, and MCR installation and deployments on public clouds, private clouds, virtualization platforms, and bare metal.

In addition, Launchpad provides full cluster lifecycle management. Using Launchpad, multi-manager, high availability clusters (defined as having sufficient node capacity to move active workloads around while updating) can be upgraded with no downtime.

Note

Launchpad is distributed as a binary executable. The main integration point with cluster management is the launchpad apply command and the input launchpad.yaml configuration for the cluster. As the configuration is in YAML format, you can integrate other tooling with Launchpad.

System requirements¶

Mirantis Launchpad is a static binary that works on the following operating systems:

Linux (x64)
macOS (x64)
Windows (x64)

Important

The setup must meet MKE system requirements, in addition to the requirements for running Launchpad.

The following operating systems support MKE:

MKEx (Rocky&OSTree)
CentOS 7
Oracle Linux 7
Oracle Linux 8
Oracle Linux 9
Red Hat Enterprise Linux 7
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 9
Rocky Linux 8
Rocky Linux 9
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 15
Ubuntu 18.04
Ubuntu 20.04
Ubuntu 22.04
Windows Server 2022, 2019

Be aware that Launchpad does not support all OS platform patch levels. Refer to the Compatibility Matrix for your version of MCR for full OS platform support information.

Hardware requirements¶

	Manager nodes	Worker nodes
Minimum hardware requirements	16 GB of RAM; 2 vCPUs; 25 GB of free disk space for the `/var` partition	4 GB of RAM
Recommended hardware requirements	24 - 32 GB of RAM; 4 vCPUs; 25 - 100 GB of free disk space

Note

Windows container images are typically larger than Linux container images, and thus it is necessary to provision more local storage for Windows nodes.

Permissions and privilege levels¶

Launchpad remote management requires elevated privileges on your system, both to prepare the system for installation and to perform the installation. This level of access is necessary for package management, and also to allow remote users to execute MCR docker commands.

Note

For security reasons, Launchpad should not be executed with root/admin user authentication on any machine.

Package Management¶

Launchpad uses sudo commands to manage several packages through a system package manager, as detailed below:

Install the key components needed for installing Mirantis products:

curl
Used to retrieve the MCR installation script

iptables/iputils
MCR dependencies

socat
Enables Prometheus management in certain scenarios

RHEL rh-amazon-rhui-client
Used by AWS for various management tasks
Add remote users to the MCR group docker to allow docker commands.
Run the MCR installation script:
- Add package repositories for the MCR packages.
- Remove conflicting Docker-EE packages from the system.
- Install MCR, through the system package manager.
Optional. Uninstall MCR, by removing installed packages.
Optional. Prune MCR installations during uninstall, by deleting system folders created by MCR.

Remote management¶

Launchpad connects through the use of a cryptographic network protocol (SSH on Linux systems, SSH or WinRM on Windows systems), and as such these must be set up on all host instances.

Note

Only SSH key-based authentication is currently supported, and the SSH user must be able to run sudo without a password. On Windows, the user must have administrator privileges.

OpenSSH¶

OpenSSH is the open-source version of the Secure Shell (SSH) tools used by administrators of Linux and other non-Windows operating systems for cross-platform management of remote systems. It is included in Windows Server 2019.

To enable SSH on Windows, you can run the following PowerShell snippets, modified for your specific configuration, on each Windows host.

# Install OpenSSH
Add-WindowsCapability -Online -Name OpenSSH.Client~~~~0.0.1.0
Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
Start-Service sshd
Set-Service -Name sshd -StartupType 'Automatic'

# Configure ssh key authentication
mkdir c:\Users\Administrator\.ssh\
$sshdConf = 'c:\ProgramData\ssh\sshd_config'
(Get-Content $sshdConf).replace('#PubkeyAuthentication yes', 'PubkeyAuthentication yes') | Set-Content $sshdConf
(Get-Content $sshdConf).replace('Match Group administrators', '#Match Group administrators') | Set-Content $sshdConf
(Get-Content $sshdConf).replace('       AuthorizedKeysFile __PROGRAMDATA__/ssh/administrators_authorized_keys', '#       AuthorizedKeysFile __PROGRAMDATA__/ssh/administrators_authorized_keys') | Set-Content $sshdConf
restart-service sshd

Transfer your SSH public key from your local machine to the host, using the following example but with your own values.

# Transfer SSH Key to Server
scp ~/.ssh/id_rsa.pub Administrator@1.2.1.2:C:\Users\Administrator\.ssh\authorized_keys
ssh --% Administrator@1.2.1.2 powershell -c $ConfirmPreference = 'None'; Repair-AuthorizedKeyPermission C:\Users\Administrator\.ssh\authorized_keys

WinRM¶

As an alternative to SSH, WinRM can be used on Windows hosts.

Ports Used¶

When installing an MKE cluster, a series of ports must be opened to incoming traffic.

Get started with Launchpad¶

Launchpad is a command-line deployment and lifecycle-management tool that enables users on any Linux, Mac, or Windows machine to easily install, deploy, modify, and update MKE, MSR, and MCR.

Set up a deployment environment¶

To fully evaluate and use MKE, MSR, and MCR, Mirantis recommends installing Launchpad on a real machine (Linux, Mac, or Windows) or a virtual machine (VM) that is capable of running:

A graphic desktop and browser, for accessing or installing:
- The MKE web UI
- Lens, an open source, stand-alone GUI application from Mirantis (available for Linux, Mac, and Windows) for multi-cluster management and operations
- Metrics, observability, visualization, and other tools
kubectl (the Kubernetes command-line client)
curl, Postman and/or client libraries, for accessing the Kubernetes REST API
Docker and related tools for using the Docker Swarm CLI, and for containerizing workloads and accessing local and remote registries.

The machine can reside in different contexts from the hosts and connect with those hosts in several different ways, depending on the infrastructure and services in use. It must be able to communicate with the hosts via their IP addresses on several ports. Depending on the infrastructure and security requirements, this can be relatively simple to achieve for evaluation clusters (refer to Networking Considerations for more information).

Configure hosts¶

A cluster comprises at least one manager node and one or more worker nodes. At the start, Mirantis recommends deploying a small evaluation cluster, with one manager and at least one worker node. Such a setup will allow you to become familiar with Launchpad, with the procedures for provisioning nodes, and with the features of MKE, MSR, and MCR. In addition, if the deployment is on a public cloud, the setup will minimize costs.

Ultimately, Launchpad can deploy manager and worker nodes in any combination, creating many different cluster configurations, such as:

Small evaluation clusters, with one manager and one or more worker nodes.
Diverse clusters, with Linux and Windows workers.
High-availability clusters, with two, three, or more manager nodes.
Clusters that Launchpad can auto-update, non-disruptively, with multiple managers (allowing one-by-one update of MKE without loss of cluster cohesion) and sufficient worker nodes of each type to allow workloads to be drained to new homes as each node is updated.

The hosts must be able to communicate with one another (and potentially, with users in the outside world) by way of their IP addresses, using many ports. Depending on infrastructure and security requirements, this can be relatively simple to achieve for evaluation clusters (refer to Networking Considerations).

Install Launchpad¶

Note

Launchpad has built-in telemetry for tracking tool use. The telemetry data is used to improve the product and overall user experience. No sensitive data about the clusters is included in the telemetry payload.

Download Launchpad.
Rename the downloaded binary to launchpad, move it to a directory in the PATH variable, and give it permission to run (execute permission).

Tip

If macOS is in use it may be necessary to give Launchpad permissions in the Security & Privacy section in System Preferences.
Verify the installation by checking the installed tool version with the launchpad version command.
```
$ launchpad version
# console output:

version: 1.0.0
```

Complete the registration. Be aware that the registration information is used to assign evaluation licenses and to provide help with using Launchpad.

$ launchpad register

name: Anthony Stark
company: Stark Industries
email: astark@example.com
I agree to Mirantis Launchpad Software Evaluation License Agreement https://github.com/Mirantis/launchpad/blob/master/LICENSE [Y/n]: Yes
INFO[0022] Registration completed!
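
The rename, move, and permission step of the installation can be sketched as a single function. The function name, the downloaded file name, and the default target directory are hypothetical; the sketch assumes that the target directory is on your PATH and that you have write access to it.

```shell
# Copy a downloaded Launchpad binary into a directory on PATH under the
# name "launchpad" and mark it executable.
install_launchpad() {
  downloaded="$1"                 # path to the downloaded binary
  bin_dir="${2:-/usr/local/bin}"  # target directory, assumed to be on PATH
  mkdir -p "$bin_dir" || return 1
  cp "$downloaded" "$bin_dir/launchpad" || return 1
  chmod 755 "$bin_dir/launchpad"
}

# Example call (commented out; the downloaded file name is hypothetical):
# install_launchpad ./launchpad-download "$HOME/bin"
```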

Create a Launchpad configuration file¶

The cluster is configured using a YAML file.

In the example provided, a simple two-node MKE cluster is set up using Kubernetes: one manager node and one worker node.

In your editor, create a new file and copy-paste the following text as-is:

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
metadata:
  name: mke-kube
spec:
  mke:
    adminUsername: admin
    adminPassword: passw0rd!
    installFlags:
    - --default-node-orchestrator=kubernetes
  hosts:
  - role: manager
    ssh:
      address: 172.16.33.100
      keyPath: ~/.ssh/my_key
  - role: worker
    ssh:
      address: 172.16.33.101
      keyPath: ~/.ssh/my_key

Save the file as launchpad.yaml.
Adjust the text to meet your infrastructure requirements. The model should work to deploy hosts on most public clouds.

If you are deploying on VirtualBox or another desktop virtualization solution and are using bridged networking, you must make a few minor adjustments to launchpad.yaml:
- Deliberately set --pod-cidr to ensure that pod IP addresses do not overlap with node IP addresses (the latter are in the 192.168.x.x private IP network range on such a setup).
- Supply the name of the target nodes’ private network interface using the privateInterface parameter (this typically defaults to enp0s3 on Ubuntu 18.04; other Linux distributions use similar nomenclature).
In addition, it may be necessary to set the username for logging in to the host.
```
apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
metadata:
  name: my-mke
spec:
  mke:
    adminUsername: admin
    adminPassword: passw0rd!
    installFlags:
      - --default-node-orchestrator=kubernetes
      - --pod-cidr 10.0.0.0/16
  hosts:
  - role: manager
    ssh:
      address: 192.168.110.100
      keyPath: ~/.ssh/id_rsa
      user: theuser
    privateInterface: enp0s3
  - role: worker
    ssh:
      address: 192.168.110.101
      keyPath: ~/.ssh/id_rsa
      user: theuser
    privateInterface: enp0s3
```

For more complex setups, Launchpad offers a full set of configuration options.

Note

Users who are familiar with Terraform can automate the infrastructure creation using Mirantis Terraform examples as a baseline.

Bootstrap your cluster¶

You can start the cluster once the cluster configuration file is fully set up. In the same directory where you created the launchpad.yaml file, run:

$ launchpad apply

The launchpad tool uses a cryptographic network protocol (SSH on Linux systems, SSH or WinRM on Windows systems) to connect to the infrastructure specified in the launchpad.yaml and configures on the hosts everything that is required. Within a few minutes the cluster should be up and running.

Connect to the cluster¶

Launchpad will present the information needed to connect to the cluster at the end of the installation procedure. For example:

INFO[0021] ==> Running phase: MKE cluster info
INFO[0021] Cluster is now configured.  You can access your admin UIs at:
INFO[0021] MKE cluster admin UI: https://test-mke-cluster-master-lb-895b79a08e57c67b.elb.eu-north-1.example.com
INFO[0021] You can also download the admin client bundle with the following command: launchpad client-config

By default, the administrator username is admin. If you do not supply the password in launchpad.yaml (for example, through an installFlags entry such as --admin-password=supersecret), the generated admin password displays in the installation output.

INFO[0083] 127.0.0.1:  time="2020-05-26T05:25:12Z" level=info msg= "Generated random admin password: wJm-TzIzQrRNx7d1fWMdcscu_1pN5Xs0"

Important

The addition or removal of nodes in subsequent Launchpad runs will fail if the password is not provided in the launchpad.yaml file.

Networking considerations¶

Users will likely install Launchpad on a laptop or a VM with the intent of deploying MKE, MSR, or MCR onto VMs running on a public or private cloud that supports security groups for IP access control. Such an approach makes it fairly simple to configure networking in a way that provides adequate security and convenient access to the cluster for evaluation and experimentation.

The simplest way to configure the networking for a small, temporary cluster for evaluation:

Create a new virtual subnet (or VPC and subnet) for hosts.
Create a new security group called de_hosts (or another name of your choice) that permits inbound IPv4 traffic on all ports, either from the security group de_hosts, or from the new virtual subnet only.
Create another new security group (for example, admit_me) that permits inbound IPv4 traffic from your deployer machine’s public IP address only. To determine your public IP address, you can use a website such as whatismyip.com.
When launching hosts, attach them to the newly-created subnet and apply both new security groups.
(Optional) Once you know the IPv4 addresses (public, or VPN-accessible private) of your nodes, unless you are using local DNS it makes sense to assign names to your hosts (for example, manager, worker1, worker2, and so on). Then, insert the IP addresses and names in your hosts file, thus letting you (and Launchpad) refer to hosts by hostname instead of IP address.
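
The optional naming step can be scripted so that repeated runs do not duplicate entries. The following is a sketch only: the function name is hypothetical, the addresses are taken from the sample launchpad.yaml, and writing to /etc/hosts on the deployer machine typically requires administrator rights.

```shell
# Append "IP NAME" to a hosts file unless that name already has an entry.
add_host_entry() {
  hosts_file="$1"
  ip="$2"
  name="$3"
  # Look for the name as a whole field on any non-comment line.
  if awk -v n="$name" '
      $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }' "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Example calls (commented out because they modify a system file):
# add_host_entry /etc/hosts 172.16.33.100 manager
# add_host_entry /etc/hosts 172.16.33.101 worker1
```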

Once the hosts are booted, SSH into them from your deployer machine with your private key. For example:

ssh -i /my/private/keyfile username@mynode

After that, determine whether they can access the internet. One method for doing this is by pinging a Google nameserver:

$ ping 8.8.8.8

Now, proceed with installing Launchpad and configuring an MKE, MSR, or MCR deployment. Once completed, use your deployer machine to access the MKE web UI, run kubectl (after authenticating to your cluster) and other utilities (for example, Postman, curl, and so on).

Use a VPN¶

A more secure way to manage networking is to connect your deployer machine to your VPC/subnet using a VPN, and to then modify the de_hosts security group to accept traffic on all ports from this source.

More deliberate network security¶

If you intend to deploy a cluster for longer-term evaluation, it makes sense to secure it more deliberately. In this case, a certain range of ports will need to be opened on hosts. Refer to the MKE documentation for details.

Use DNS¶

Launchpad can deploy certificate bundles obtained from a certificate provider to authenticate your cluster. These can be used in combination with DNS to allow you to reach your cluster securely on a fully-qualified domain name (FQDN). Refer to the MKE documentation for details.

Upgrade components with Launchpad¶

Launchpad allows users to upgrade their clusters with the launchpad apply reconciliation command. The tool discovers the current state of the cluster and its components, and upgrades what is needed.

Upgrade Mirantis Container Runtime¶

Change the MCR version in the launchpad.yaml file.

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
metadata:
  name: <metadata-name>
spec:
  hosts:
  - role: manager
    ssh:
      address: 10.0.0.1
  mcr:
    version: 20.10.0

Run launchpad apply. Launchpad will upgrade MCR on all hosts in the following sequence:
1. Upgrade the container runtime on each manager node one-by-one, and thus if there is more than one manager node, all other manager nodes are available during the time that the first node is being updated.
2. Once the first manager node is updated and is running again, the second is updated, and so on, until all of the manager nodes are running the new version of MCR.
3. 10% of worker nodes are updated at a time, until all of the worker nodes are running the new version of MCR.

Upgrade MKE, MSR, and MCR (separately or collectively)¶

Upgrading to newer versions of MKE, MSR, and MCR is as easy as changing the version tags in the launchpad.yaml and running the launchpad apply command.

Note

Launchpad upgrades MKE on all nodes.

Open the launchpad.yaml file.
Update the version tags to the new version of the component(s).
Save launchpad.yaml.
Run the launchpad apply command.

Launchpad connects to the nodes to get the current version of each component, after which it upgrades each node as described in Upgrade Mirantis Container Runtime. This may take several minutes.

Note

MKE and MSR upgrade paths require consecutive minor versions (for example, to upgrade from MKE 3.1.0 to MKE 3.3.0 it is necessary to upgrade from MKE 3.1.0 to MKE 3.2.0 first, and then upgrade from MKE 3.2.0 to MKE 3.3.0).
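
The consecutive-minor-version rule can be turned into a list of the releases to pass through. The sketch below is illustrative only: the function name is hypothetical, it assumes that both versions share the same major number, and it prints only major.minor values because the note does not say which patch release to use at each step.

```shell
# Print every major.minor release from the current one to the target, in order.
upgrade_path() {
  from="$1"
  to="$2"
  major="${from%%.*}"
  from_minor="$(echo "$from" | cut -d. -f2)"
  to_minor="$(echo "$to" | cut -d. -f2)"
  # Reject a different major version or a downgrade.
  if [ "${to%%.*}" != "$major" ] || [ "$to_minor" -lt "$from_minor" ]; then
    echo "unsupported path: $from -> $to" >&2
    return 1
  fi
  steps=""
  minor="$from_minor"
  while [ "$minor" -le "$to_minor" ]; do
    steps="${steps}${steps:+ }${major}.${minor}"
    minor=$((minor + 1))
  done
  echo "$steps"
}

# Example from the note: 3.1.0 to 3.3.0 passes through 3.2.
upgrade_path 3.1.0 3.3.0
```

Each value after the first corresponds to one edit of the version tag in launchpad.yaml followed by one launchpad apply run.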

Manage nodes¶

The process of adding and removing nodes differs, depending on whether the affected nodes are Manager nodes, Worker nodes, or MSR nodes.

Manager Nodes¶

Swarm manager nodes use the Raft Consensus Algorithm to manage the swarm state. As such, it is advisable to have an understanding of some general Raft concepts in order to manage a swarm.

There is no limit on the number of manager nodes that can be deployed. The decision on how many manager nodes to implement comes down to a trade-off between performance and fault-tolerance. Adding manager nodes to a swarm makes the swarm more fault-tolerant; however, additional manager nodes reduce write performance, as more nodes must acknowledge proposals to update the swarm state (which means more network round-trip traffic).
Raft requires a majority of managers, also referred to as the quorum, to agree on proposed updates to the swarm, such as node additions or removals. Membership operations are subject to the same constraints as state replication.
In addition, Manager nodes host the control plane etcd cluster, and thus making changes to the cluster requires a working etcd cluster with the majority of peers present and working.
It is highly advisable to run an odd number of peers in quorum-based systems. MKE only works when a majority can be formed, so once more than one node has been added it is not possible to (automatically) go back to having only one node.
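
The arithmetic behind the odd-number recommendation is short enough to work through. In a majority-based system with n managers, the quorum is floor(n/2) + 1, and the number of managers that can fail while a quorum remains is floor((n-1)/2). The function names in this sketch are hypothetical.

```shell
# Majority (quorum) needed among n managers: floor(n/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Managers that can fail while a quorum still exists: floor((n-1)/2).
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Print the values for one through seven managers.
for n in 1 2 3 4 5 6 7; do
  printf '%s managers: quorum %s, tolerates %s failure(s)\n' \
    "$n" "$(quorum_size "$n")" "$(fault_tolerance "$n")"
done
```

Moving from three managers to four raises the quorum from two to three without raising the number of tolerated failures, which is why an even count adds cost but no resilience.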

Add Manager Nodes¶

Adding manager nodes is as simple as adding them to the launchpad.yaml file. Re-running launchpad apply configures MKE on the new node and also makes the necessary changes in the swarm and etcd cluster.

Remove Manager Nodes¶

Remove the manager host from the launchpad.yaml file.
Enable pruning by changing the prune setting to true in spec.cluster.prune.
```
spec:
  cluster:
    prune: true
```
Run the launchpad apply command.
Remove the node in the infrastructure.

Worker Nodes¶

Add Worker Nodes¶

To add worker nodes, simply include them in the launchpad.yaml file. Re-running launchpad apply will configure everything on the new node and join it to the cluster.

Remove Worker Nodes¶

Remove the host from the launchpad.yaml file.
Enable pruning by changing the prune setting to true in spec.cluster.prune.
```
spec:
  cluster:
    prune: true
```
Run the launchpad apply command.
Remove the node in the infrastructure.

MSR Nodes¶

MSR nodes are identical to worker nodes. They participate in the MKE swarm, but should not be used as traditional worker nodes that run both MSR and cluster workloads.

Note

By default, MKE will prevent scheduling of containers on MSR nodes.

MSR forms its own cluster and quorum in addition to the swarm formed by MKE. There is no limit on the number of MSR nodes that can be configured; however, the best practice is to limit the number to five. As with manager nodes, the decision on how many nodes to implement should be made with an understanding of the trade-off between performance and fault-tolerance (adding a large number of nodes can incur severe performance penalties).

The quorum formed by MSR utilizes RethinkDB which, as with swarm, uses the Raft Consensus Algorithm.

Add MSR Nodes¶

To add MSR nodes, simply include them in the launchpad.yaml file with a host role of msr. When adding an MSR node, specify both the adminUsername and adminPassword in the spec.mke section of the launchpad.yaml file so that MSR knows which admin credentials to use.

spec:
  mke:
    adminUsername: admin
    adminPassword: passw0rd!

Next, re-run launchpad apply which will configure everything on the new node and join it into the cluster.

Remove MSR nodes¶

Remove the host from the launchpad.yaml file.
Enable pruning by changing the prune setting to true in spec.cluster.prune.
```
spec:
  cluster:
    prune: true
```
Run the launchpad apply command.
Remove the node in the infrastructure.

Launchpad CLI reference¶

Global options¶

A number of optional arguments can be used with any Launchpad command.

Option	Description
--disable-telemetry	Disable sending analytics and telemetry data
--accept-license	Accept the end user license agreement
--disable-upgrade-check	Skip check for Launchpad upgrade
--debug	Increase output verbosity
--help	Display command help

Commands¶

All Launchpad commands begin with launchpad or lp.

launchpad <command>

Command	Description
init	Initialize Launchpad. Initializes the cluster config file (usually called `launchpad.yaml`). Supported options: n/a
apply	Initialize or upgrade Launchpad. After initializing the cluster config file, applies the settings and initializes or upgrades a cluster. Supported options: --config Path to a cluster config file, including the filename (default: `launchpad.yaml`, to read from standard input use: `-`). --force Continue installation when prerequisite validation fails (default: `false`)
client-config	Download client configuration. The MKE client bundle contains a private and public key pair that authorizes Launchpad to interact with the MKE CLI. Supported options: --config Path to a cluster config file, including the filename (default: `launchpad.yaml`, to read from standard input use: `-`). Note that the configuration MUST include the MKE credentials (example follows): apiVersion: launchpad.mirantis.com/mke/v1.4 kind: mke spec: mke: adminUsername: admin adminPassword: password
reset	Reset or uninstall a cluster. Resets or uninstalls an MKE cluster. Supported options: --config Path to a cluster config file, including the filename (default: `launchpad.yaml`, to read from standard input use: `-`). --force Required when running non-interactively (default: `false`)
exec	Execute a command or run a remote terminal on a host. Use Launchpad to run commands or an interactive terminal on the hosts in the configuration. Supported options: --config Path to a cluster config file, including the filename (default: `launchpad.yaml`, to read from standard input use: `-`). --target value Target host (example: address[:port]) --interactive Run interactive (default: `false`) --first Use the first target found in configuration (default: `false`) --role value Use the first target that has this role in configuration -[command] The command to run. When blank, will run the default shell.
describe	Presents basic information that correlates to the command target. When the launchpad describe hosts command is run, the information delivered includes the IP address, the internal IP, the host name, the set role, the operating system, and the MCR version of each host. When the launchpad describe mke or launchpad describe msr command is run, the command returns the product version number for the product targeted, as well as the URL of the administration user interface. Supported options: --config Path to a cluster config file, including the filename (default: `launchpad.yaml`, to read from standard input use: `-`). -[report name] currently supported reports: `config`, `mke`, `msr`
register	Registers a user. Supported options: --name User’s name. --email User’s email address. --company Name of user’s company. --accept-license Accept the end user license agreement.
completion	Generate shell auto-completions. Completes a specified shell. Supported options: --shell Generates completions for the shell specified following the option. Installing the completion scripts: Bash: $ launchpad completion -s bash > \ /etc/bash_completion.d/launchpad $ source /etc/bash_completion.d/launchpad Zsh: $ launchpad completion -s zsh > \ /usr/local/share/zsh/site-functions/_launchpad $ source /usr/local/share/zsh/site-functions/_launchpad Fish: $ launchpad completion -s fish > \ ~/.config/fish/completions/launchpad.fish $ source ~/.config/fish/completions/launchpad.fish

Launchpad Configuration File¶

Mirantis Launchpad cluster configuration is written in YAML. The default file name is launchpad.yaml, though you can use a different name and pass it with the --config option. You can edit the file with any common text editor.

Sample Launchpad Configuration File¶

The following launchpad.yaml example illustrates most of the available configuration options.

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke+msr
metadata:
  name: mycluster
spec:
  hosts:
  - role: manager
    hooks:
      apply:
        before:
          - ls -al > test.txt
        after:
          - cat test.txt
    ssh:
      address: 10.0.0.1
      user: myuser
      port: 22
      keyPath: ~/.ssh/id_rsa
    privateInterface: eth0
    environment:
      http_proxy: http://example.com
      NO_PROXY: 10.0.0.*
    mcrConfig:
      debug: true
      log-opts:
        max-size: 10m
        max-file: "3"
  - role: worker
    winRM:
      address: 10.0.0.2
      user: myuser
      password: abcd1234
      port: 5986
      useHTTPS: true
      insecure: false
      useNTLM: false
      caCertPath: ~/.certs/cacert.pem
      certPath: ~/.certs/cert.pem
      keyPath: ~/.certs/key.pem
  - role: msr
    imageDir: ./msr-images
    ssh:
      address: 10.0.0.3
      user: myuser
      port: 22
      keyPath: ~/.ssh/id_rsa
  - role: worker
    localhost:
      enabled: true
  mke:
    version: "3.6.16"
    imageRepo: "docker.io/mirantis"
    adminUsername: admin
    adminPassword: "$MKE_ADMIN_PASSWORD"
    installFlags:
    - "--default-node-orchestrator=kubernetes"
    licenseFilePath: ./docker-enterprise.lic
    configFile: ./mke-config.toml
    configData: |-
      [scheduling_configuration]
        default_node_orchestrator = "kubernetes"
  msr:
    version: "2.9.17"
    imageRepo: "docker.io/mirantis"
    installFlags:
    - --dtr-external-url dtr.example.com
    - --ucp-insecure-tls
    replicaIDs: sequential
  mcr:
    version: "23.0.13"
    channel: stable
    repoURL: https://repos.mirantis.com
    installURLLinux: https://get.mirantis.com/
    installURLWindows: https://get.mirantis.com/install.ps1
  cluster:
    prune: true

Note

Launchpad follows Kubernetes-style versioning and grouping in its configuration.

Environment variable substitution¶

In reading the configuration file, Launchpad will replace any strings that begin with a dollar sign with values from the local host’s environment variables. For example:

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
spec:
  mke:
    installFlags:
    - --admin-password="$MKE_ADMIN_PASSWORD"

Simple bash-like expressions are supported.

Expression	Meaning
${var}	Value of var (same as $var)
${var-$DEFAULT}	If var not set, evaluate expression as $DEFAULT
${var:-$DEFAULT}	If var not set or is empty, evaluate expression as $DEFAULT
${var=$DEFAULT}	If var not set, evaluate expression as $DEFAULT
${var:=$DEFAULT}	If var not set or is empty, evaluate expression as $DEFAULT
${var+$OTHER}	If var set, evaluate expression as $OTHER, otherwise as empty string
${var:+$OTHER}	If var set, evaluate expression as $OTHER, otherwise as empty string
$$var	Escape expressions. Result will be $var.
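
The expressions in the table mirror shell parameter expansion, so their behavior can be previewed in a local shell before you rely on them in launchpad.yaml. The following is a minimal sketch written in POSIX shell itself, not Launchpad's implementation. The variable names (`UNSET_VAR`, `EMPTY_VAR`, `SET_VAR`) are hypothetical, and Launchpad's own parser may differ in edge cases.

```shell
# Illustration of the default-value expressions using the shell itself.
# All variable names are hypothetical examples.
unset UNSET_VAR
EMPTY_VAR=""
SET_VAR="from-env"
DEFAULT="fallback"
OTHER="alternate"

unset_dash="${UNSET_VAR-$DEFAULT}"         # variable not set: default is used
empty_dash="${EMPTY_VAR-$DEFAULT}"         # set but empty: the empty value is kept
empty_colon_dash="${EMPTY_VAR:-$DEFAULT}"  # set but empty: default is used
set_plus="${SET_VAR+$OTHER}"               # variable set: alternate is used
unset_plus="${UNSET_VAR+$OTHER}"           # variable not set: empty string

printf 'unset with -: [%s]\n' "$unset_dash"
printf 'empty with -: [%s]\n' "$empty_dash"
printf 'empty with :-: [%s]\n' "$empty_colon_dash"
printf 'set with +: [%s]\n' "$set_plus"
printf 'unset with +: [%s]\n' "$unset_plus"
```

The difference between the plain and the colon forms is whether an empty value counts as "not set", which matters when a CI system exports a variable with no value.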

Key detail¶

Comprehensive information follows for each of the top-level Launchpad configuration file (launchpad.yaml) keys: apiVersion, kind, metadata, and spec. The cluster key, described last, is nested under spec.

apiVersion¶

The latest API version is launchpad.mirantis.com/mke/v1.4, though earlier configuration file versions are also likely to work without changes (without any features added by more recent versions).

kind¶

mke and mke+msr are currently supported.

metadata¶

name: Name of the cluster to be created. Currently affects only Launchpad internal storage paths (for example, for client bundles and log files).

spec¶

The specification for the cluster, which comprises the hosts, mke, msr, mcr, and cluster keys.

hosts¶

The machines that clusters run on are hosts.

Option	Description
`privateInterface`	Network interface from which the private network address of the host is discovered (default: `eth0`)
`role`	Role of the machine in the cluster. Possible values are: `manager` `worker` `msr`
`environment`	Key-value pairs in YAML mapping syntax. Values are added to the host environment (optional)
`mcrConfig`	Mirantis Container Runtime configuration in YAML mapping syntax, will be converted to daemon.json (optional)
`hooks`	Hooks configuration for running commands before or after stages (optional)
`imageDir`	Path to a directory containing .tar/.tar.gz files produced by `docker save`. The images from that directory will be uploaded and `docker load` is used to load them.
`sudodocker`	Flag indicating whether Docker should be run with `sudo`. When set to `true` on Linux hosts, Docker commands will be run with `sudo`, and the user will not be added to the machine `docker` group.

Host connection options¶

Option type	Options
`ssh` (Secure Shell)	`address`: SSH connection address `user`: User to log in as (default: `root`) `port`: Host’s ssh port (default: 22) `keyPath`: A local file path to an ssh private key file (default: ~/.ssh/id_rsa)
`winRM` (Windows Remote Management)	`address`: WinRM connection address `user`: Windows account username (default: `Administrator`) `password`: User account password `port`: Host’s winRM listening port (default: 5986) `useHTTPS`: Set `true` to use HTTPS protocol. When false, plain HTTP is used. (default: `false`) `insecure`: Set to `true` to ignore SSL certificate validation errors (default: `false`) `useNTLM`: Set `true` to use NTLM (default: `false`) `caCertPath`: Path to CA Certificate file (optional) `certPath`: Path to Certificate file (optional) `keyPath`: Path to Key file (optional)
`localhost`	`enabled`: Set to `true` to enable.

Hooks configuration options¶

Option type	Options
`apply`	`before`: List of commands to run on the host before the “Preparing host” phase (optional) `after`: List of commands to run on the host before the “Disconnect” phase when the apply was successful (optional)
`reset`	`before`: List of commands to run on the host before the “Uninstall” phase (optional) `after`: List of commands to run on the host before the “Disconnect” phase when the reset was successful (optional)

mke¶

Specify options for the MKE cluster.

Options	Description
`version`	Version of MKE to install or upgrade to (default: 3.3.7)
`imageRepo`	The image repository to use for MKE installation (default: `docker.io/mirantis`)
`adminUsername`	MKE administrator username (default: `admin`)
`adminPassword`	MKE administrator password (default: auto-generate)
`installFlags`	Custom installation flags for MKE installation.
`upgradeFlags`	Optional. Custom upgrade flags for MKE upgrade. Obtain a list of supported upgrade options for a specific MKE version by running the installer container with docker run -t -i --rm mirantis/ucp:3.6.16 upgrade --help.
`licenseFilePath`	Optional. A path to the MKE license file.
`configFile`	Optional. The initial full cluster configuration file.
`configData`	Optional. The initial full cluster configuration file in embedded “heredocs” syntax. Heredocs allows you to define a multiline string while maintaining the original formatting and indenting.
`cloud`	Optional. Cloud provider configuration. `provider`: Provider name (currently AWS, Azure and OpenStack (MKE 3.3.3+) are supported) `configFile`: Path to cloud provider configuration file on local machine `configData`: Inlined cloud provider configuration
`swarmInstallFlags`	Optional. Custom flags for Swarm initialization
`swarmUpdateCommands`	Optional. Custom commands to run after the Swarm initialization
`caCertPath` `certPath` `keyPath` each followed by `<path to file>` or `caCertData` `certData` `keyData` each followed by `<PEM encoded string>`	Required components for configuring the MKE UI to use custom SSL certificates on its Ingress. You must specify all three components: the CA certificate, the SSL certificate, and the private key. Launchpad accepts either inline PEM-encoded data or a file path, depending on the provided argument. Note: If MKE already uses custom certificates, Launchpad can rotate the certificates during upgrade.

Important

Unless a password is provided, the MKE installer automatically generates an administrator password. This password will display in clear text in the output and persist in the logs. Subsequent runs will fail if this automatically generated password is not configured in the launchpad.yaml file.

msr¶

Specify options for the MSR cluster.

Options	Description
`version`	Version of MSR to install or upgrade to (default: 2.8.5)
`imageRepo`	The image repository to use for MSR installation (default: `docker.io/mirantis`)
`installFlags`	Optional. Custom installation flags for MSR installation. Obtain a list of supported installation options for a specific MSR version by running the installer container with docker run -t -i --rm mirantis/dtr:3.1.5 install --help. Note: Launchpad inherits the MKE flags that MSR needs to perform an installation, and to join or remove nodes. Thus, there is no need to include the following install flags in the installFlags section of msr: `--ucp-username` (inherited from MKE’s `--admin-username` flag or `spec.mke.adminUsername`), `--ucp-password` (inherited from MKE’s `--admin-password` flag or `spec.mke.adminPassword`), and `--ucp-url` (inherited from MKE’s `--san` flag or intelligently selected based on other configuration variables).
`upgradeFlags`	Optional. Custom upgrade flags for MSR upgrade. Obtain a list of supported upgrade options for a specific MSR version by running the installer container with docker run -t -i --rm mirantis/dtr:3.1.5 upgrade --help.
`replicaIDs`	Set to `sequential` to generate sequential replica IDs for cluster members, for example, 000000000001, 000000000002, and so on (default: random)

mcr¶

Specify options for MCR installation.

Note

Customers take a risk in opting to use and manage their own install scripts for MCR instead of the install script that Mirantis hosts at get.mirantis.com. Mirantis manages this script as necessary to support MCR installations on demand, and can change it as needed to resolve issues and to support new features. As such, customers who opt to use their own script will need to monitor the Mirantis script to ensure compatibility.

Options	Description
`version`	Version of MCR to install or upgrade to (default: 20.10.0)
`channel`	Installation channel to use. One of `test` or `prod` (optional).
`repoURL`	Repository URL to use for MCR installation. (optional)
`installURLLinux`	Location from which to download the initial installer script for Linux hosts (local paths can also be used). (default: https://get.mirantis.com/)
`installURLWindows`	Location from which to download the initial installer script for Windows hosts (local paths can be used). (default: https://get.mirantis.com/install.ps1) Note: In most scenarios, it is not necessary to specify `repoURL` and `installURLLinux`/`installURLWindows`, which usually are only used when installing from a non-standard location (for example, a disconnected datacenter).
`prune`	During uninstallation, removes certain system paths that MCR created (for example, `/var/lib/docker`).

cluster¶

Specify options that do not pertain to any of the individual components.

Options	Description
`prune`	Set to `true` to remove nodes that are known by the cluster but not listed in the `launchpad.yaml` file.

Get support¶

Mirantis Kubernetes Engine (MKE) subscriptions provide access to prioritized support for designated contacts from your company, agency, team, or organization. MKE service levels are based on your subscription level and the cloud or cluster that you designate in your technical support case. Our support offerings are described on the Enterprise-Grade Cloud Native and Kubernetes Support page. You may inquire about Mirantis support subscriptions by using the contact us form.

The CloudCare Portal is the primary way that Mirantis interacts with customers who are experiencing technical issues. Access to the CloudCare Portal requires prior authorization by your company, agency, team, or organization, and a brief email verification step. After Mirantis sets up its backend systems at the start of the support subscription, a designated administrator at your company, agency, team, or organization, can designate additional contacts. If you have not already received and verified an invitation to our CloudCare Portal, contact your local designated administrator, who can add you to the list of designated contacts. Most companies, agencies, teams, and organizations have multiple designated administrators for the CloudCare Portal, and these are often the persons most closely involved with the software. If you do not know who your local designated administrator is, or you are having problems accessing the CloudCare Portal, you may also send an email to Mirantis support at support@mirantis.com.

Once you have verified your contact details and changed your password, you and all of your colleagues will have access to all of the cases and resources purchased. Mirantis recommends that you retain your Welcome to Mirantis email, because it contains information on how to access the CloudCare Portal, guidance on submitting new cases, managing your resources, and other related issues.

We encourage all customers with technical problems to use the knowledge base, which you can access on the Knowledge tab of the CloudCare Portal. We also encourage you to review the MKE product documentation and release notes prior to filing a technical case, as the problem may already be fixed in a later release or a workaround solution provided to a problem experienced by other customers.

One of the features of the CloudCare Portal is the ability to associate cases with a specific MKE cluster. These are referred to in the Portal as “Clouds”. Mirantis pre-populates your customer account with one or more Clouds based on your subscription(s). You may also create and manage your Clouds to better match how you use your subscription.

Mirantis also recommends and encourages customers to file new cases based on a specific Cloud in your account. This is because most Clouds also have associated support entitlements, licenses, contacts, and cluster configurations. These greatly enhance the ability of Mirantis to support you in a timely manner.

You can locate the existing Clouds associated with your account by using the Clouds tab at the top of the portal home page. Navigate to the appropriate Cloud and click on the Cloud name. Once you have verified that the Cloud represents the correct MKE cluster and support entitlement, you can create a new case via the New Case button near the top of the Cloud page.

One of the key items required for technical support of most MKE cases is the support bundle. This is a compressed archive in ZIP format of configuration data and log files from the cluster. There are several ways to gather a support bundle, each described in the paragraphs below. After you obtain a support bundle, you can upload the bundle to your new technical support case by following the instructions in the Mirantis knowledge base, using the Detail view of your case.

Obtain a full-cluster support bundle using the MKE web UI¶

Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to <user name> and click Support Bundle.

It may take several minutes for the download to complete.

Note

The default name for the generated support bundle file is docker-support-<cluster-id>-YYYYmmdd-hh_mm_ss.zip. Mirantis suggests that you not alter the file name before submittal to the customer portal. However, if necessary, you can add a custom string between docker-support and <cluster-id>, as in: docker-support-MyProductionCluster-<cluster-id>-YYYYmmdd-hh_mm_ss.zip.
Submit the support bundle to Mirantis Customer Support by clicking Share support bundle on the success prompt that displays when the support bundle finishes downloading.
Fill in the Jira feedback dialog, and click Submit.
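
If you do need to add a custom string to the bundle name as described in the note, a small helper can keep the rest of the name intact. The following is a minimal sketch in POSIX shell. The function name `label_support_bundle`, the label, and the sample file name are hypothetical, and the function expects a bare file name rather than a path.

```shell
# Insert a custom label between "docker-support" and the rest of the bundle name.
# The function name, label, and sample file name are hypothetical examples.
label_support_bundle() (
  label="$1"
  file="$2"
  case "$file" in
    docker-support-*)
      # Strip the fixed prefix, then rebuild the name with the label in between.
      printf 'docker-support-%s-%s\n' "$label" "${file#docker-support-}"
      ;;
    *)
      # Leave names that do not follow the default pattern unchanged.
      printf '%s\n' "$file"
      ;;
  esac
)

label_support_bundle MyProductionCluster docker-support-abc123-20240101-12_00_00.zip
```

You could then pass the result to `mv` to rename the downloaded file before uploading it.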

Obtain a full-cluster support bundle using the MKE API¶

Create an environment variable with the user security token:

export AUTHTOKEN=$(curl -sk -d \
'{"username":"<username>","password":"<password>"}' \
https://<mke-ip>/auth/login | jq -r .auth_token)

Obtain a cluster-wide support bundle:

curl -k -X POST -H "Authorization: Bearer $AUTHTOKEN" \
-H "accept: application/zip" https://<mke-ip>/support \
-o docker-support-$(date +%Y%m%d-%H_%M_%S).zip

Add the --submit option to the support command to submit the support bundle to Mirantis Customer Support. The support bundle will be sent, along with the following information:
- Cluster ID
- MKE version
- MCR version
- OS/architecture
- Cluster size
For more information on the support command, refer to support.
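
The login step above extracts the token with `jq -r`, which prints the literal string `null` when the response has no `auth_token` field (for example, after a failed login). A short check before requesting the bundle can avoid downloading an error response in place of a ZIP file. This is a minimal sketch; the function name `token_is_usable` is hypothetical.

```shell
# Succeeds only if the token is non-empty and is not the string "null",
# which is what "jq -r" prints when the auth_token field is missing.
token_is_usable() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# AUTHTOKEN is the variable exported in the login step.
if token_is_usable "${AUTHTOKEN:-}"; then
  echo "token present, safe to request the support bundle"
else
  echo "login failed: check the user name, password, and MKE address" >&2
fi
```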

Obtain a single-node support bundle using the CLI¶

Use SSH to log into a node and run:

MKE_VERSION=$( (docker container inspect ucp-proxy \
--format '{{index .Config.Labels "com.docker.ucp.version"}}' \
2>/dev/null || echo -n 3.6.16) | tr -d '[[:space:]]')

docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --log-driver none \
  mirantis/ucp:${MKE_VERSION} \
  support > \
  docker-support-${HOSTNAME}-$(date +%Y%m%d-%H_%M_%S).tgz

Important

If SELinux is enabled, include the following additional flag: --security-opt label=disable.

Note

The CLI-derived support bundle only contains logs for the node on which you are running the command. If your MKE cluster is highly available, collect support bundles from all manager nodes.

Add the --submit option to the support command to submit the support bundle to Mirantis Customer Support. The support bundle will be sent, along with the following information:
- Cluster ID
- MKE version
- MCR version
- OS/architecture
- Cluster size
For more information on the support command, refer to support.

Use PowerShell to obtain a support bundle¶

Run the following commands on Windows worker nodes to collect the support information and automatically place it in a zip file:

$MKE_SUPPORT_DIR = Join-Path -Path (Get-Location) -ChildPath 'dsinfo'
$MKE_SUPPORT_ARCHIVE = Join-Path -Path (Get-Location) -ChildPath $('docker-support-' + (hostname) + '-' + (Get-Date -UFormat "%Y%m%d-%H_%M_%S") + '.zip')
$MKE_PROXY_CONTAINER = & docker container ls --filter "name=ucp-proxy" --format "{{.Image}}"
$MKE_REPO = if ($MKE_PROXY_CONTAINER) { ($MKE_PROXY_CONTAINER -split '/')[0] } else { 'mirantis' }
$MKE_VERSION = if ($MKE_PROXY_CONTAINER) { ($MKE_PROXY_CONTAINER -split ':')[1] } else { '3.6.0' }
docker container run --name windowssupport `
-e UTILITY_CONTAINER="$MKE_REPO/ucp-containerd-shim-process-win:$MKE_VERSION" `
-v \\.\pipe\docker_engine:\\.\pipe\docker_engine `
-v \\.\pipe\containerd-containerd:\\.\pipe\containerd-containerd `
-v 'C:\Windows\system32\winevt\logs:C:\eventlogs:ro' `
-v 'C:\Windows\Temp:C:\wintemp:ro' $MKE_REPO/ucp-dsinfo-win:$MKE_VERSION
docker cp windowssupport:'C:\dsinfo' .
docker rm -f windowssupport
Compress-Archive -Path $MKE_SUPPORT_DIR -DestinationPath $MKE_SUPPORT_ARCHIVE

API Reference¶

The Mirantis Kubernetes Engine (MKE) API is a REST API, available using HTTPS, that enables programmatic access to Swarm and Kubernetes resources managed by MKE. MKE exposes the full Mirantis Container Runtime API, so you can extend your existing code with MKE features. The API is secured with role-based access control (RBAC), and thus only authorized users can make changes and deploy applications to your cluster.

The MKE API is accessible through the same IP addresses and domain names that you use to access the MKE web UI. And as the API is the same one used by the MKE web UI, you can use it to programmatically do everything you can do from the MKE web UI.

The system manages Swarm resources through collections and Kubernetes resources through namespaces. For detailed information on these resource sets, refer to the RBAC core elements table in the Role-based access control documentation.

Endpoint	Description
`/roles`	Allows you to enumerate and create custom permissions for accessing collections.
`/accounts`	Enables the management of users, teams, and organizations.
`/configs`	Provides access to the swarm configuration.

CLI Reference¶

The mirantis/ucp:3.x.y image includes commands that install and manage MKE on a Mirantis Container Runtime.

You can configure the commands using either flags or environment variables.

Environment variables can use either of the following types of syntax:

Pass the value from your shell using the docker container run -e VARIABLE_NAME syntax.
Specify the value explicitly from the command line using the docker container run -e VARIABLE_NAME=value syntax.

To use the MKE CLI:

MKE CLI use requires that you name the mirantis/ucp:3.x.y image ucp and bind-mount the Docker daemon socket:

docker container run -it --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  <command> <command-options>

Additional information is available for each command by using the --help flag.
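
To illustrate the two environment variable styles, the following sketch assembles an install command and prints it without running it. `UCP_ADMIN_USER` is passed through from the calling shell, while `UCP_ADMIN_PASSWORD` is given an explicit value. The function name `print_mke_install_cmd` and the placeholder password are hypothetical; replace `3.x.y` with your MKE version.

```shell
# Print an example command instead of running it, so it can be reviewed first.
# The function name and the password value are hypothetical placeholders.
print_mke_install_cmd() (
  image="$1"      # for example: mirantis/ucp:3.x.y
  password="$2"   # placeholder; avoid hard-coding real credentials
  printf '%s\n' "docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
-e UCP_ADMIN_USER \
-e UCP_ADMIN_PASSWORD=$password \
$image install"
)

# "-e UCP_ADMIN_USER" takes its value from the shell that runs docker.
export UCP_ADMIN_USER=admin
print_mke_install_cmd mirantis/ucp:3.x.y example-password
```

Passing a variable by name keeps the value out of the command line itself, which can be preferable for secrets.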

Note

To obtain the appropriate image, it may be necessary to use docker/ucp:3.x.y rather than mirantis/ucp:3.x.y, as older versions are associated with the docker organization. Review the images in the mirantis and docker organizations on Docker Hub to determine the correct organization.

backup¶

The backup command creates a backup of an MKE manager node. Specifically, the command creates a TAR file with the contents of the volumes used by the given MKE manager node and then writes it to standard output. You can then use the restore command to restore the data from an existing backup.

To create backups of a multi-node cluster, you only need to back up a single manager node. The restore operation will reconstitute a new MKE installation from the backup of any previous manager node.

Note

The backup contains private keys and other sensitive information. Use the --passphrase flag to encrypt the backup with PGP-compatible encryption or --no-passphrase to opt out of encrypting the backup. Mirantis does not recommend the latter option.

To use the backup command:

docker container run \
  --rm \
  --interactive \
  --name ucp \
  --log-driver none \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  backup <command-options> > backup.tar
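
Because the backup is written to standard output and redirected to a file, a failed run can leave an empty or partial backup.tar behind. The following sketch wraps any backup command, keeps the file only if the command succeeds and the file is not empty, and removes it otherwise. The function name `save_backup` and the file naming are hypothetical.

```shell
# Run a command that writes a backup to standard output, save it to a file,
# and fail if the command fails or the file ends up empty.
# The function name and its arguments are hypothetical.
save_backup() (
  out="$1"
  shift
  if "$@" > "$out" && [ -s "$out" ]; then
    printf 'backup written to %s\n' "$out"
  else
    rm -f "$out"
    printf 'backup failed; no file kept\n' >&2
    exit 1
  fi
)

# Usage (requires Docker on an MKE manager node):
# save_backup "mke-backup-$(date +%Y%m%d-%H_%M_%S).tar" \
#   docker container run --rm --interactive --name ucp --log-driver none \
#   --volume /var/run/docker.sock:/var/run/docker.sock \
#   mirantis/ucp:3.x.y backup <command-options>
```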

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--file <filename>`	Specifies the name of the file wherein the backup contents are written. This option requires that you bind-mount the file path to the container that is performing the backup. The file path must be relative to the container file tree. For example: `docker run <other options> --mount type=bind,src=/home/user/backup,dst=/backup mirantis/ucp backup --file /backup/backup.tar`. This option is ignored in interactive mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--include-logs`	Stores an encrypted `backup.log` file in the mounted directory. Must be issued at the same time as the `--file` option. The default value is `true`.
`--interactive, -i`	Runs in interactive mode and prompts for configuration values.
`--no-passphrase`	Bypasses the option to encrypt the TAR file with a passphrase. Mirantis does not recommend this option.
`--passphrase <value>`	Encrypts the TAR file with a passphrase.

SELinux¶

Backing up MKE on a manager node with SELinux enabled at the daemon and the operating system levels requires that you include --security-opt label=disable with your backup command. This flag disables SELinux policies on the MKE container. Because the MKE container mounts and uses the Docker socket, the MKE backup process fails with the following error if you neglect to include this flag:

FATA[0000] unable to get valid Docker client: unable to ping Docker
daemon: Got permission denied while trying to connect to the Docker
daemon socket at unix:///var/run/docker.sock:
Get http://%2Fvar%2Frun%2Fdocker.sock/_ping:
dial unix /var/run/docker.sock: connect: permission denied -
If SELinux is enabled on the Docker daemon, make sure you run
MKE with "docker run --security-opt label=disable -v /var/run/docker.sock:/var/run/docker.sock ..."

To back up MKE with SELinux enabled at the daemon level:

docker container run \
  --rm \
  --interactive \
  --name ucp \
  --security-opt label=disable \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  backup <command-options> > backup.tar
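
In scripts that run on both SELinux and non-SELinux hosts, the extra flag can be added conditionally. The following sketch decides based on the security options that the Docker daemon reports. It assumes that a daemon with SELinux enabled includes `name=selinux` in the output of `docker info --format '{{.SecurityOptions}}'`; verify this on your hosts before relying on it. The function name `selinux_run_opts` is hypothetical.

```shell
# Print the extra docker flags needed when the daemon reports SELinux support.
# Input: the text printed by: docker info --format '{{.SecurityOptions}}'
# Assumption: a daemon with SELinux enabled lists "name=selinux" in that text.
selinux_run_opts() {
  case "$1" in
    *name=selinux*) printf '%s\n' '--security-opt label=disable' ;;
    *) : ;;  # no extra flags needed
  esac
}

# Usage (requires Docker):
# opts="$(selinux_run_opts "$(docker info --format '{{.SecurityOptions}}')")"
# docker container run --rm --interactive --name ucp $opts \
#   --volume /var/run/docker.sock:/var/run/docker.sock \
#   mirantis/ucp:3.x.y backup <command-options> > backup.tar
```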

dump-certs¶

The dump-certs command prints the public certificates used by the MKE web server. Specifically, the command produces public certificates for the MKE web server running on the specified node. By default, it prints the contents of the ca.pem and cert.pem files.

Integrating MKE and MSR requires that you use this command with the --cluster --ca flags to configure MSR.

To use the dump-certs command:

docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  dump-certs <command-options>

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--ca`	Prints only the contents of the `ca.pem` file.
`--cluster`	Prints the internal MKE swarm root CA and certificate instead of the public server certificate.
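
When you save the dump-certs output to a file, a quick count of the PEM blocks can confirm that the file holds what you expect before you hand it to MSR or another client. This is a minimal sketch; the function name `count_pem_certs` and the file name in the usage comment are hypothetical.

```shell
# Count the PEM certificates in a file, for example output saved from dump-certs.
count_pem_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Usage (requires Docker on an MKE node):
# docker container run --rm --name ucp \
#   -v /var/run/docker.sock:/var/run/docker.sock \
#   mirantis/ucp:3.x.y dump-certs --cluster --ca > mke-cluster-ca.pem
# count_pem_certs mke-cluster-ca.pem
```

With the default options, which print both ca.pem and cert.pem, the count should be at least two.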

example-config¶

The example-config command displays an example configuration file for MKE.

To use the example-config command:

docker container run --rm -i \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  example-config

id¶

The id command prints the ID of the MKE components that run on your MKE cluster. This ID matches the ID in the output of the docker info command, when issued while using a client bundle.

To use the id command:

docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  id

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.

images¶

The images command reviews the MKE images that are available on the specified node and pulls the ones that are missing.

To use the images command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  images <command-options>

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--list`	Lists all the images used by MKE but does not pull them.
`--pull <value>`	Pulls the MKE images. Valid values: `always`, `missing`, and `never`.
`--registry-password <value>`	Specifies the password to use when pulling images.
`--registry-username <value>`	Specifies the user name to use when pulling images.
`--swarm-only`	Returns only the images used in Swarm-only mode.

install¶

The install command installs MKE on the specified node. Specifically, the command initializes a new swarm, promotes the specified node into a manager node, and installs MKE.

The following customizations are possible when installing MKE:

Customize the MKE web server certificates:
1. Create a volume named ucp-controller-server-certs.
2. Copy the ca.pem, cert.pem, and key.pem files to the root directory of the volume.
3. Run the install command with the --external-server-cert flag.
Customize the license used by MKE using one of the following options:
- Bind mount the file at /config/docker_subscription.lic in the container. For example:
```
-v /path/to/my/config/docker_subscription.lic:/config/docker_subscription.lic
```
- Specify the --license "$(cat license.lic)" option.

If you plan to join more nodes to the swarm, open the following ports in your firewall:

443 or the value of --controller-port
2376 or the value of --swarm-port
2377 or the Swarm gRPC port
6443 or the value of --kube-apiserver-port
179, 10250, 12376, 12379, 12380, 12381, 12382, 12383, 12384, 12385, 12386, 12387, 12388, 12390
4789 (UDP) and 7946 (TCP/UDP) for overlay networking

For more information, refer to Open ports to incoming traffic.
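
As one way to act on the port list, the following sketch prints the firewalld commands for the default ports without executing them. It assumes that the host uses firewalld, that the default port values are unchanged, and that ports listed without an explicit protocol are TCP. Adjust the lists if you set `--controller-port`, `--swarm-port`, or `--kube-apiserver-port` to other values.

```shell
# Print firewalld commands for the default MKE ports listed above.
# Assumptions: firewalld is in use, default port values are unchanged,
# and ports without an explicit protocol in the list are TCP.
tcp_ports="443 2376 2377 6443 179 10250 12376 12379 12380 12381 12382 12383 12384 12385 12386 12387 12388 12390 7946"
udp_ports="4789 7946"

print_firewall_cmds() {
  for p in $tcp_ports; do
    printf 'firewall-cmd --permanent --add-port=%s/tcp\n' "$p"
  done
  for p in $udp_ports; do
    printf 'firewall-cmd --permanent --add-port=%s/udp\n' "$p"
  done
  # Permanent rules take effect only after a reload.
  printf 'firewall-cmd --reload\n'
}

print_firewall_cmds
```

Review the printed commands, then run them as root on each node that must accept traffic from the other cluster nodes.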

Note

If you are installing MKE on a public cloud platform, refer to the cloud-specific MKE installation documentation.

To use the install command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  install <command-options>

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--interactive, -i`	Runs in interactive mode, prompting for configuration values.
`--admin-password <value>`	Sets the MKE administrator password, `$UCP_ADMIN_PASSWORD`.
`--admin-username <value>`	Sets the MKE administrator user name, `$UCP_ADMIN_USER`.
`--azure-ip-count <value>`	Configures the number of IP addresses to be provisioned for each Azure Virtual Machine. Default: `128`.
`--binpack`	Sets the Docker Swarm scheduler to binpack mode, for backward compatibility.
`--cloud-provider <value>`	Sets the cluster cloud provider. Valid values: `aws`, `azure`, `gce`.
`--cni-installer-url <value>`	Sets a URL that points to a Kubernetes YAML file that is used as an installer for the cluster CNI plugin. If specified, the default CNI plugin is not installed. If the URL uses the `HTTPS` scheme, no certificate verification is performed.
`--controller-port <value>`	Sets the port for the web UI and the API. Default: `443`.
`--data-path-addr <value>`	Sets the address or interface to use for data path traffic, `$UCP_DATA_PATH_ADDR`. Format: IP address or network interface name.
`--disable-tracking`	Disables anonymous tracking and analytics.
`--disable-usage`	Disables anonymous usage reporting.
`--dns-opt <value>`	Sets the DNS options for the MKE containers, `$DNS_OPT`.
`--dns-search <value>`	Sets custom DNS search domains for the MKE containers, `$DNS_SEARCH`.
`--dns <value>`	Sets custom DNS servers for the MKE containers, `$DNS`.
`--enable-profiling`	Enables performance profiling.
`--existing-config`	Uses the latest existing MKE configuration during the installation. The installation will fail if a configuration is not found.
`--external-server-cert`	Customizes the certificates used by the MKE web server.
`--external-service-lb <value>`	Sets the IP address of the load balancer where you can expect to reach published services.
`--force-insecure-tcp`	Forces the installation to continue despite unauthenticated Mirantis Container Runtime ports.
`--force-minimums`	Forces the installation to occur even if the system does not meet the minimum requirements.
`--host-address <value>`	Sets the network address to advertise to other nodes, `$UCP_HOST_ADDRESS`. Format: IP address or network interface name.
`--iscsiadm-path <value>`	Sets the path to the host `iscsiadm` binary. This option is applicable only when `--storage-iscsi` is specified.
`--kube-apiserver-port <value>`	Sets the port for the Kubernetes API server. Default: `6443`.
`--kv-snapshot-count <value>`	Sets the number of changes between key-value store snapshots, `$KV_SNAPSHOT_COUNT`. Default: `20000`.
`--kv-timeout <value>`	Sets the timeout in milliseconds for the key-value store, `$KV_TIMEOUT`. Default: `5000`.
`--license <value>`	Adds a license, `$UCP_LICENSE`. Format: `"$(cat license.lic)"`
`--nodeport-range <value>`	Sets the allowed port range for Kubernetes services of NodePort type. Default: `32768-35535`.
`--pod-cidr <value>`	Sets the Kubernetes cluster IP pool from which Pod IP addresses are allocated. Default: `192.168.0.0/16`.
`--preserve-certs`	Prevents the generation of new certificates if certificates already exist.
`--pull <value>`	Pulls MKE images. Valid values: `always`, `missing`, and `never`. Default: `missing`.
`--random`	Sets the Docker Swarm scheduler to random mode, for backward compatibility.
`--registry-password <value>`	Sets the password to use when pulling images, `$REGISTRY_PASSWORD`.
`--registry-username <value>`	Sets the user name to use when pulling images, `$REGISTRY_USERNAME`.
`--san <value>`	Adds subject alternative names to certificates, `$UCP_HOSTNAMES`. For example: `--san www2.acme.com`
`--service-cluster-ip-range <value>`	Sets the Kubernetes cluster IP Range for services. Default: `10.96.0.0/16`.
`--skip-cloud-provider-check`	Disables checks which rely on detecting which cloud provider, if any, the cluster is currently running on.
`--storage-expt-enabled`	Enables experimental features in Kubernetes storage.
`--storage-iscsi`	Enables iSCSI-based PersistentVolumes in Kubernetes.
`--swarm-experimental`	Enables Docker Swarm experimental features, for backward compatibility.
`--swarm-grpc-port <value>`	Sets the port for communication between nodes. Default: `2377`.
`--swarm-port <value>`	Sets the port for the Docker Swarm manager, for backward compatibility. Default: `2376`.
`--unlock-key <value>`	Sets the unlock key for this swarm-mode cluster, if one exists, `$UNLOCK_KEY`.
`--unmanaged-cni`	Indicates that Calico is the CNI provider, managed by MKE. Calico is the default CNI provider.
`--kubelet-data-root`	Configures the kubelet data root directory on Linux when performing new MKE installations.
`--containerd-root`	Configures the containerd root directory on Linux when performing new MKE installations. Any non-root directory containerd customizations must be made along with the root directory customizations prior to installation and with the `--containerd-root` flag omitted.
`--ingress-controller`	Configures the `HTTP` ingress controller for the management of traffic that originates outside the cluster.
`--calico-ebpf-enabled`	Sets whether Calico eBPF mode is enabled. When specifying `--calico-ebpf-enabled`, do not use `--kube-default-drop-masq-bits` or `--kube-proxy-mode`.
`--kube-default-drop-masq-bits`	Sets whether MKE uses Kubernetes default values for iptables drop and masquerade bits.
`--kube-proxy-mode`	Sets the operational mode for kube-proxy. Valid values: `iptables`, `ipvs`, `disabled`. Default: `iptables`.
`--kube-protect-kernel-defaults`	Protects kernel parameters from being overridden by kubelet. Default: `false`. Important: When enabled, kubelet can fail to start if the following kernel parameters are not properly set on the nodes before you install MKE or before you add a new node to an existing cluster: `vm.panic_on_oom=0`, `vm.overcommit_memory=1`, `kernel.panic=10`, `kernel.panic_on_oops=1`, `kernel.keys.root_maxkeys=1000000`, `kernel.keys.root_maxbytes=25000000`. For more information, refer to Configure kernel parameters.
`--swarm-only`	Configures MKE in Swarm-only mode, which supports only Docker Swarm orchestration.
`--windows-containerd-root <value>`	Sets the root directory for containerd on Windows.
`--secure-overlay`	Enables IPSec network encryption using `SecureOverlay` in Kubernetes.
`--calico-ip-auto-method <value>`	Allows the user to set the method for autodetecting the IPv4 address for the host. When specified, IP autodetection method is set for `calico-node`.
`--calico-vxlan`	Sets the Calico CNI dataplane to VXLAN. Default: VXLAN.
`--vxlan-vni <value>`	Sets the `vxlan-vni` ID. Note that the dataplane must be set to VXLAN. Valid values: `10000` - `20000`. Default: `10000`.
`--cni-mtu <value>`	Sets the MTU for CNI interfaces. Calculate MTU size based on which overlay is in use. For user-specific configuration, subtract 20 bytes for IPIP or 50 bytes for VXLAN. Default: `1480` for IPIP, `1450` for VXLAN.
`--windows-kubelet-data-root <value>`	Sets the data root directory for kubelet on Windows.
`--default-node-orchestrator <value>`	Sets the default node orchestrator for the cluster. Valid values: `swarm`, `kubernetes`. Default: `swarm`.
`--iscsidb-path <value>`	Sets the absolute path to the host iSCSI DB. This option requires that `--storage-iscsi` be specified. Symlinks are not allowed.
`--kube-proxy-disabled`	Disables `kube-proxy`. This option is activated by `--calico-ebpf-enabled`, and it cannot be used in combination with `--kube-proxy-mode`.
`--cluster-label <value>`	Sets the cluster label that is employed for usage reporting.
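The MTU guidance given for `--cni-mtu` reduces to a subtraction: take the MTU of the underlying network and subtract the overlay overhead (20 bytes for IPIP, 50 bytes for VXLAN). The following is a minimal sketch of that arithmetic. The `cni_mtu` helper name is hypothetical and is not part of MKE; the underlay MTU values are illustrative.

```shell
# cni_mtu <underlay-mtu> <overlay>
# Prints the CNI MTU after subtracting the overlay overhead.
# Hypothetical helper for illustration; not an MKE command.
cni_mtu() {
  underlay_mtu="$1"
  overlay="$2"
  case "$overlay" in
    ipip)  overhead=20 ;;   # IPIP encapsulation overhead in bytes
    vxlan) overhead=50 ;;   # VXLAN encapsulation overhead in bytes
    *)
      echo "unknown overlay: $overlay (expected ipip or vxlan)" >&2
      return 1
      ;;
  esac
  echo $((underlay_mtu - overhead))
}

cni_mtu 1500 ipip    # prints 1480, the documented IPIP default
cni_mtu 1500 vxlan   # prints 1450, the documented VXLAN default
cni_mtu 9001 vxlan   # prints 8951, for a jumbo-frame underlay
```

The result can then be passed to the install command as the `--cni-mtu` value.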

SELinux¶

Installing MKE on a manager node with SELinux enabled at both the daemon and the operating system levels requires that you include --security-opt label=disable in your install command. This flag disables SELinux policies on the installation container, which mounts and configures the Docker socket as part of the installation. Omitting the flag causes the MKE installation to fail with the following error:

FATA[0000] unable to get valid Docker client: unable to ping Docker
daemon: Got permission denied while trying to connect to the Docker
daemon socket at unix:///var/run/docker.sock:
Get http://%2Fvar%2Frun%2Fdocker.sock/_ping:
dial unix /var/run/docker.sock: connect: permission denied -
If SELinux is enabled on the Docker daemon, make sure you run
MKE with "docker run --security-opt label=disable -v /var/run/docker.sock:/var/run/docker.sock ..."

To install MKE with SELinux enabled at the daemon level:

docker container run --rm -it \
  --name ucp \
  --security-opt label=disable \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  install <command-options>
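
Before you run the install command, it can help to confirm whether the daemon actually has SELinux enabled. The following sketch assumes that `docker info` reports SELinux among its security options (for example, as `name=selinux`); verify the exact output on your hosts. The `selinux_enabled_in` helper name is hypothetical.

```shell
# selinux_enabled_in <security-options>
# Succeeds when the given daemon security options mention SELinux.
# Hypothetical helper for illustration; not an MKE command.
selinux_enabled_in() {
  case "$1" in
    *selinux*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Sample values, shaped like the list that docker info is assumed to report.
selinux_enabled_in '["name=seccomp,profile=default","name=selinux"]' \
  && echo "add --security-opt label=disable"
selinux_enabled_in '["name=seccomp,profile=default"]' \
  || echo "flag likely not required"

# On a real host (not run here):
#   selinux_enabled_in "$(docker info --format '{{json .SecurityOptions}}')"
```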

port-check-server¶

The port-check-server command verifies whether the specified port is available for use.

To use the port-check-server command:

docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  port-check-server <command-options>

Options¶

Option	Description
`--listen-address, -l <value>`	Sets the port on which to verify connectivity. Default: `:2376`.

restore¶

The restore command restores an MKE cluster from a backup. Specifically, the command installs a new MKE cluster that is populated with the state of a previous MKE manager node using a TAR file originally generated using the backup command. All of the MKE settings, users, teams, and permissions are restored from the backup file.

The restore operation does not alter or recover the following cluster resources:

Containers
Networks
Volumes
Services

You can use the restore command on any manager node in an existing cluster. If the current node does not belong to a cluster, one is initialized using the value of the --host-address flag. When restoring on an existing Swarm-mode cluster, no previous MKE components can be running on any node of the cluster. Use the uninstall-ucp command to perform this cleanup.

If the restoration is performed on a different cluster than the one from which the backup file was created, the cluster root CA of the old MKE installation is not restored. This restoration invalidates any previously issued admin client bundles and, thus, all administrators are required to download new client bundles after the operation is complete. Any existing non-admin user client bundles remain fully operational.

By default, the backup TAR file is read from stdin. You can also bind-mount the backup file under /config/backup.tar and run the restore command with the --interactive flag.

Note

You must run uninstall-ucp before attempting the restore operation on an existing MKE cluster.
If your Swarm-mode cluster has lost quorum and the original set of managers are not recoverable, you can attempt to recover a single-manager cluster using the docker swarm init --force-new-cluster command.
You can restore MKE from a backup that was taken on a different manager node or a different cluster altogether.

To use the restore command:

docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --name ucp \
  mirantis/ucp:3.x.y \
  restore <command-options>
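
The two ways of supplying the backup file described above (stdin and a bind mount under /config/backup.tar) can be sketched as follows. The function names and the `DOCKER` and `MKE_IMAGE` variables are conventions of this sketch, not of MKE. Setting `DOCKER=echo` previews each command without running it.

```shell
# DOCKER can be overridden (for example, DOCKER=echo) to preview the command.
DOCKER="${DOCKER:-docker}"
# Placeholder tag from this guide; substitute your MKE version.
MKE_IMAGE="${MKE_IMAGE:-mirantis/ucp:3.x.y}"

# Variant 1: stream the backup TAR file to the restore command on stdin.
# -i keeps stdin open; -t is omitted because stdin is a file, not a terminal.
restore_from_stdin() {
  backup_file="$1"
  $DOCKER container run --rm -i \
    --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    "$MKE_IMAGE" \
    restore < "$backup_file"
}

# Variant 2: bind-mount the backup file and run the restore interactively.
# The host path must be absolute for the bind mount to work.
restore_from_mount() {
  backup_file="$1"
  $DOCKER container run --rm -it \
    --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v "$backup_file":/config/backup.tar \
    "$MKE_IMAGE" \
    restore --interactive
}

# Example calls (not run here; the path is illustrative):
#   restore_from_stdin /tmp/backup.tar
#   restore_from_mount /tmp/backup.tar
```

Append any further options, such as `--passphrase <value>` for an encrypted backup, after `restore`.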

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--interactive, -i`	Runs in interactive mode and prompts for configuration values.
`--data-path-addr <value>`	Sets the address or interface to use for data path traffic.
`--force-minimums`	Forces the install or upgrade, which will go through even if the system does not meet the minimum requirements.
`--host-address <value>`	Sets the network address to advertise to other nodes. Format: IP address or network interface name
`--passphrase <value>`	Decrypts the backup TAR file with the provided passphrase.
`--san <value>`	Adds subject alternative names to certificates, for example, `--san www1.acme.com`
`--swarm-grpc-port <value>`	Sets the port for communication between nodes. Default: `2377`.
`--unlock-key <value>`	Sets the unlock key for a Swarm-mode cluster.
`--swarm-only`	Indicates that the backup cluster is configured in Swarm-only mode.
`--timeout <value>`	Sets the timeout duration. Valid time units: `ns`, `us`, `ms`, `s`, `m`, and `h`. Default: `"30m"`.

support¶

The support command creates a support bundle file for the specified MKE nodes, including the MKE cluster ID, and prints it to stdout.

To use the support command:

docker container run --rm \
  --name mke \
  --log-driver none \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  support <command-options> > docker-support.tgz

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--submit`	Submits the support bundle to Mirantis Customer Support along with the following information: cluster ID, MKE version, MCR version, OS/architecture, and cluster size.
`--loglines`	Limits the size of log files to the specified number of lines. Default: `10000`.
`--until`	Retrieves logs until the specified date and time. Format: `YYYY-MM-DD HH:MM:SS`.
`--since`	Retrieves logs since the specified date and time. Format: `YYYY-MM-DD HH:MM:SS`.
`--goroutine`	Retrieves goroutine stack traces.
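
The `--since` and `--until` options expect timestamps in the YYYY-MM-DD HH:MM:SS form. The following sketch produces such timestamps from relative expressions. It assumes GNU `date` (for the `-d` option), and the `mke_log_timestamp` helper name is hypothetical. The time zone that MKE applies to these values is not stated here, so confirm it before relying on exact boundaries.

```shell
# mke_log_timestamp <time-expression>
# Prints a timestamp in the "YYYY-MM-DD HH:MM:SS" form.
# Requires GNU date; hypothetical helper, not an MKE command.
mke_log_timestamp() {
  date -d "$1" '+%Y-%m-%d %H:%M:%S'
}

since="$(mke_log_timestamp '24 hours ago')"
until_ts="$(mke_log_timestamp 'now')"

# Preview the options to append to the support command shown above.
echo "support --since \"$since\" --until \"$until_ts\""
```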

uninstall-ucp¶

The uninstall-ucp command uninstalls MKE from the specified swarm, preserving the swarm so that your applications can continue running.

After MKE is uninstalled, you can use the docker swarm leave and docker node rm commands to remove nodes from the swarm. You cannot join nodes to the swarm until MKE is installed again.

To use the uninstall-ucp command:

docker container run --rm -it \
       --name ucp \
       -v /var/run/docker.sock:/var/run/docker.sock \
       -v /var/log:/var/log \
       mirantis/ucp:3.x.y \
       uninstall-ucp <command-options>

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--interactive, -i`	Runs in interactive mode and prompts for configuration values.
`--id <value>`	Sets the ID of the MKE instance to uninstall.
`--no-purge-secret`	Configures the command to leave the MKE-related Swarm secrets in place.
`--pull <value>`	Pulls MKE images. Valid values: `always`, `missing`, and `never`.
`--purge-config`	Removes the MKE configuration file when uninstalling MKE.
`--registry-password <value>`	Sets the password to use when pulling images.
`--registry-username <value>`	Sets the user name to use when pulling images.
`--unmanaged-cni`	Specifies that MKE was installed in unmanaged CNI mode. When this parameter is supplied to the uninstaller, no attempt is made to clean up `/etc/cni`, thus causing any user-supplied CNI configuration files to persist in their original state.

upgrade¶

The upgrade command upgrades an MKE cluster.

Prior to performing an upgrade, Mirantis recommends that you perform a backup of your MKE cluster using the backup command.

After upgrading MKE, log in to the MKE web UI and confirm that each node is healthy and that all nodes have been upgraded successfully.

To use the upgrade command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  upgrade <command-options>

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--interactive, -i`	Runs in interactive mode and prompts for configuration values.
`--admin-password <value>`	Sets the MKE administrator password.
`--admin-username <value>`	Sets the MKE administrator user name.
`--force-minimums`	Forces the install or upgrade to occur even if the system does not meet the minimum requirements.
`--host-address <value>`	Overrides the previously configured host address with the specified IP address or network interface.
`--id <value>`	Sets the ID of the MKE instance to upgrade.
`--manual-worker-upgrade`	Sets whether to manually upgrade worker nodes. Default: `false`.
`--pull <value>`	Pulls MKE images. Valid values: `always`, `missing`, and `never`.
`--registry-password <value>`	Sets the password to use when pulling images.
`--registry-username <value>`	Sets the user name to use when pulling images.
`--force-port-check`	Forces the upgrade to continue even in the event of a port check failure. Default: `false`.
`--force-recent-backup`	Forces the upgrade to occur even if the system does not have a recent backup. Default: `false`.

checks (subcommand)¶

The checks subcommand runs the pre-upgrade review on your cluster.

To use the checks subcommand:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  upgrade checks <command-options>

Options¶

Option	Description
`--debug, -D`	Enables debug mode.
`--jsonlog`	Produces JSON-formatted output for easier parsing.
`--interactive, -i`	Runs in interactive mode and prompts for configuration values.
`--admin-password <value>`	Sets the MKE administrator password.
`--admin-username <value>`	Sets the MKE administrator user name.
`--id <value>`	Sets the ID of the MKE instance to upgrade.
`--pull <value>`	Pulls MKE images. Valid values: `always`, `missing`, and `never`.
`--registry-password <value>`	Sets the password to use when pulling images.
`--registry-username <value>`	Sets the user name to use when pulling images.

CIS Benchmarks¶

The Center for Internet Security (CIS) provides the CIS Kubernetes Benchmarks for each Kubernetes release. These benchmarks comprise a comprehensive set of recommendations for hardening Kubernetes security configuration. Designed to align with industry regulations, CIS Benchmarks set standards that meet diverse compliance requirements, and because they apply across Kubernetes distributions, they help to fortify such environments while fostering a robust security posture.

Note

The CIS Benchmark results detailed herein are verified against MKE 3.6.8.
Mirantis has based its handling of Kubernetes benchmarks on CIS Kubernetes Benchmark v1.7.0.

1 Control Plane Components¶

Section 1 comprises security recommendations for the direct configuration of Kubernetes control plane processes. It is broken out into four subsections:

1.1 Control Plane Node Configuration Files

Recommendation designation	Recommendation	Level	Result
1.1.1	Ensure that the API server pod specification file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.2	Ensure that the API server pod specification file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.3	Ensure that the controller manager pod specification file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.4	Ensure that the controller manager pod specification file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.5	Ensure that the scheduler pod specification file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.6	Ensure that the scheduler pod specification file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.7	Ensure that the etcd pod specification file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.8	Ensure that the etcd pod specification file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.9	Ensure that the Container Network Interface file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.10	Ensure that the Container Network Interface file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.11	Ensure that the etcd data directory permissions are set to `700` or more restrictive.	Level 1 - Master Node	Pass
1.1.12	Ensure that the etcd data directory ownership is set to `etcd:etcd`.	Level 1 - Master Node	Fail. MKE runs etcd in a container, and thus it does not create an etcd user on the host. Access to the etcd data directory is instead controlled through a Docker volume.
1.1.13	Ensure that the `admin.conf` file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.14	Ensure that the `admin.conf` file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.15	Ensure that the `scheduler.conf` file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.16	Ensure that the `scheduler.conf` file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.17	Ensure that the `controller-manager.conf` file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.18	Ensure that the `controller-manager.conf` file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.19	Ensure that the Kubernetes PKI directory and file ownership is set to `root:root`.	Level 1 - Master Node	Pass
1.1.20	Ensure that the Kubernetes PKI certificate file permissions are set to `600` or more restrictive.	Level 1 - Master Node	Pass
1.1.21	Ensure that the Kubernetes PKI key file permissions are set to `600`.	Level 1 - Master Node	Pass

1.2 API Server

Recommendation designation	Recommendation	Level	Result
1.2.1	Ensure that the `--anonymous-auth` argument is set to `false`.	Level 1 - Master Node	Pass
1.2.2	Ensure that the `--token-auth-file` parameter is not set.	Level 1 - Master Node	Pass
1.2.3	Ensure that the `DenyServiceExternalIPs` argument is set.	Level 1 - Master Node	Pass
1.2.4	Ensure that the `--kubelet-client-certificate` and `--kubelet-client-key` arguments are set as appropriate.	Level 1 - Master Node	Pass
1.2.5	Ensure that the `--kubelet-certificate-authority` argument is set as appropriate.	Level 1 - Master Node	Pass
1.2.6	Ensure that the `--authorization-mode` argument is not set to `AlwaysAllow`.	Level 1 - Master Node	Pass
1.2.7	Ensure that the `--authorization-mode` argument includes `Node`.	Level 1 - Master Node	Pass
1.2.8	Ensure that the `--authorization-mode` argument includes `RBAC`.	Level 1 - Master Node	Pass
1.2.9	Ensure that the admission control plugin `EventRateLimit` is set.	Level 1 - Master Node	Fail. Optionally, MKE can configure the `EventRateLimit` admission controller plugin.
1.2.10	Ensure that the admission control plugin `AlwaysAdmit` is not set.	Level 1 - Master Node	Pass
1.2.11	Ensure that the admission control plugin `AlwaysPullImages` is set.	Level 1 - Master Node	Fail. Optionally, MKE can configure the `AlwaysPullImages` admission controller plugin.
1.2.12	Ensure that the admission control plugin `SecurityContextDeny` is set if `PodSecurityPolicy` is not used.	Level 1 - Master Node	Pass
1.2.13	Ensure that the admission control plugin `ServiceAccount` is set.	Level 1 - Master Node	Pass
1.2.14	Ensure that the admission control plugin `NamespaceLifecycle` is set.	Level 1 - Master Node	Pass
1.2.15	Ensure that the admission control plugin `NodeRestriction` is set.	Level 1 - Master Node	Pass
1.2.16	Ensure that the `--secure-port` option is not set to `0`. Note: This recommendation is obsolete and will be deleted per the consensus process.	Level 1 - Master Node	Pass
1.2.17	Ensure that the `--profiling` option is set to `false`.	Level 1 - Master Node	Pass
1.2.18	Ensure that the `--audit-log-path` option is set.	Level 1 - Master Node	Pass
1.2.19	Ensure that the `--audit-log-maxage` argument is set to `30` or as appropriate.	Level 1 - Master Node	Pass
1.2.20	Ensure that the `--audit-log-maxbackup` argument is set to `10` or as appropriate.	Level 1 - Master Node	Pass
1.2.21	Ensure that the `--audit-log-maxsize` argument is set to `100` or as appropriate.	Level 1 - Master Node	Pass
1.2.22	Ensure that the `--request-timeout` argument is set as appropriate.	Level 1 - Master Node	Fail. Optionally, MKE can configure the Kubernetes API server `--request-timeout` argument value.
1.2.23	Ensure that the `--service-account-lookup` argument is set to `true`.	Level 1 - Master Node	Pass
1.2.24	Ensure that the `--service-account-key-file` argument is set as appropriate.	Level 1 - Master Node	Pass
1.2.25	Ensure that the `--etcd-certfile` and `--etcd-keyfile` arguments are set as appropriate.	Level 1 - Master Node	Pass
1.2.26	Ensure that the `--tls-cert-file` and `--tls-private-key-file` arguments are set as appropriate.	Level 1 - Master Node	Pass
1.2.27	Ensure that the `--client-ca-file` argument is set as appropriate.	Level 1 - Master Node	Pass
1.2.28	Ensure that the `--etcd-cafile` argument is set as appropriate.	Level 1 - Master Node	Pass
1.2.29	Ensure that the `--encryption-provider-config` argument is set as appropriate.	Level 1 - Master Node	Pass
1.2.30	Ensure that encryption providers are appropriately configured.	Level 1 - Master Node	Pass
1.2.31	Ensure that the API Server only makes use of Strong Cryptographic Ciphers.	Level 1 - Master Node	Fail. Optionally, MKE can be configured to support a list of compliant TLS ciphers.

1.3 Controller Manager

Recommendation designation	Recommendation	Level	Result
1.3.1	Ensure that the `--terminated-pod-gc-threshold` argument is set as appropriate.	Level 1 - Master Node	Fail. Optionally, MKE can be configured to use a compliant `terminated-pod-gc-threshold` value.
1.3.2	Ensure that the `--profiling` argument is set to `false`.	Level 1 - Master Node	Pass
1.3.3	Ensure that the `--use-service-account-credentials` argument is set to `true`.	Level 1 - Master Node	Pass
1.3.4	Ensure that the `--service-account-private-key-file` argument is set as appropriate.	Level 1 - Master Node	Pass
1.3.5	Ensure that the `--root-ca-file` argument is set as appropriate.	Level 1 - Master Node	Pass
1.3.6	Ensure that the `RotateKubeletServerCertificate` argument is set to `true`.	Level 1 - Master Node	Pass
1.3.7	Ensure that the `--bind-address` argument is set to `127.0.0.1`.	Level 1 - Master Node	Pass

1.4 Scheduler

Recommendation designation	Recommendation	Level	Result
1.4.1	Ensure that the `--profiling` argument is set to `false`.	Level 1 - Master Node	Pass
1.4.2	Ensure that the `--bind-address` argument is set to `127.0.0.1`.	Level 1 - Master Node	Pass

2 etcd¶

Section 2 details recommendations for etcd configuration, under the assumption that you are running etcd in a Kubernetes Pod.

2 etcd

Recommendation designation	Recommendation	Level	Result
2.1	Ensure that the `--cert-file` and `--key-file` arguments are set as appropriate.	Level 1 - Master Node	Pass
2.2	Ensure that the `--client-cert-auth` argument is set to `true`.	Level 1 - Master Node	Pass
2.3	Ensure that the `--auto-tls` argument is not set to `true`.	Level 1 - Master Node	Pass
2.4	Ensure that the `--peer-cert-file` and `--peer-key-file` arguments are set as appropriate.	Level 1 - Master Node	Pass
2.5	Ensure that the `--peer-client-cert-auth` argument is set to `true`.	Level 1 - Master Node	Pass
2.6	Ensure that the `--peer-auto-tls` argument is not set to `true`.	Level 1 - Master Node	Pass
2.7	Ensure that a unique Certificate Authority is used for etcd.	Level 2 - Master Node	Pass

3 Control Plane Configuration¶

Section 3 details recommendations for cluster-wide areas, such as authentication and logging. It is broken out into two subsections:

3.1 Authentication and Authorization

Recommendation designation	Recommendation	Level	Result
3.1.1	Client certificate authentication should not be used for users.	Level 1 - Master Node	Pass
3.1.2	Service account token authentication should not be used for users.	Level 1 - Master Node	Pass
3.1.3	Bootstrap token authentication should not be used for users.	Level 1 - Master Node	Pass

3.2 Logging

Recommendation designation	Recommendation	Level	Result
3.2.1	Ensure that a minimal audit policy is created.	Level 1 - Master Node	Pass
3.2.2	Ensure that the audit policy covers key security concerns.	Level 2 - Master Node	Pass

4 Worker Nodes¶

Section 4 details security recommendations for the components that run on Kubernetes worker nodes.

Note

The components for Kubernetes worker nodes may also run on Kubernetes master nodes. Thus, apply the recommendations in Section 4 to master nodes as well as worker nodes wherever the master nodes make use of these components.

Section 4 is broken out into two subsections:

4.1 Worker Node Configuration Files

Recommendation designation	Recommendation	Level	Result
4.1.1	Ensure that the kubelet service file permissions are set to `600` or more restrictive.	Level 1 - Worker Node	Pass
4.1.2	Ensure that the kubelet service file ownership is set to `root:root`.	Level 1 - Worker Node	Pass
4.1.3	If proxy `kubeconfig` file exists, ensure permissions are set to `600` or more restrictive.	Level 1 - Worker Node	Pass
4.1.4	If proxy `kubeconfig` file exists, ensure ownership is set to `root:root`.	Level 1 - Worker Node	Pass
4.1.5	Ensure that the `--kubeconfig kubelet.conf` file permissions are set to `600` or more restrictive.	Level 1 - Worker Node	Pass
4.1.6	Ensure that the `--kubeconfig kubelet.conf` file ownership is set to `root:root`.	Level 1 - Worker Node	Pass
4.1.7	Ensure that the certificate authorities file permissions are set to `600` or more restrictive.	Level 1 - Worker Node	Fail. MKE sets the CA cert file permission to `644`. This fulfills the control requirement of restricting write access to administrators, thus preventing non-root containers from modifying the file. Further restriction to `600` is unnecessary and can potentially complicate the configuration.
4.1.8	Ensure that the client certificate authorities file ownership is set to `root:root`.	Level 1 - Worker Node	Pass
4.1.9	If the kubelet `config.yaml` configuration file is being used, validate that permissions are set to `600` or more restrictive.	Level 1 - Worker Node	Pass
4.1.10	If the kubelet `config.yaml` configuration file is being used, validate that file ownership is set to `root:root`.	Level 1 - Worker Node	Pass
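
Many of the recommendations in this table hinge on a file mode being `600` or more restrictive. The following sketch shows one way to express that test: a mode qualifies when it grants nothing beyond owner read and write. The `is_600_or_stricter` helper name is hypothetical and is not part of MKE or of any CIS tooling.

```shell
# is_600_or_stricter <octal-mode>
# Succeeds when the mode grants nothing beyond owner read/write (octal 600).
# Hypothetical helper for illustration.
is_600_or_stricter() {
  mode="$1"
  # Mask 07177 covers owner-execute, all group and other bits, and the
  # setuid, setgid, and sticky bits. Any of these makes the mode looser.
  [ $(( 0$mode & 07177 )) -eq 0 ]
}

is_600_or_stricter 600 && echo "600: compliant"
is_600_or_stricter 400 && echo "400: compliant"
is_600_or_stricter 644 || echo "644: not compliant"

# On a real host with GNU stat (not run here; the path is a placeholder):
#   is_600_or_stricter "$(stat -c %a /path/to/file)"
```

Under this test, mode `644` fails because the group and other read bits are set, which matches the result reported for recommendation 4.1.7.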

4.2 Kubelet

Recommendation designation	Recommendation	Level	Result
4.2.1	Ensure that the `--anonymous-auth` argument is set to `false`.	Level 1 - Worker Node	Pass
4.2.2	Ensure that the `--authorization-mode` argument is not set to `AlwaysAllow`.	Level 1 - Worker Node	Pass
4.2.3	Ensure that the `--client-ca-file` argument is set as appropriate.	Level 1 - Worker Node	Pass
4.2.4	Verify that the `--read-only-port` argument is set to `0`.	Level 1 - Worker Node	Pass
4.2.5	Ensure that the `--streaming-connection-idle-timeout` argument is not set to `0`.	Level 1 - Worker Node	Pass
4.2.6	Ensure that the `--make-iptables-util-chains` argument is set to `true`.	Level 1 - Worker Node	Pass
4.2.7	Ensure that the `--hostname-override` argument is not set.	Level 1 - Worker Node	Pass
4.2.8	Ensure that the `eventRecordQPS` argument is set to a level which ensures appropriate event capture.	Level 2 - Worker Node	Pass
4.2.9	Ensure that the `--tls-cert-file` and `--tls-private-key-file` arguments are set as appropriate.	Level 1 - Worker Node	Pass
4.2.10	Ensure that the `--rotate-certificates` argument is not set to `false`.	Level 1 - Worker Node	Fail. Not applicable, as MKE has a certificate authority that issues TLS certificates for kubelet.
4.2.11	Verify that the `RotateKubeletServerCertificate` argument is set to `true`.	Level 1 - Worker Node	Fail. Not applicable, as MKE has a certificate authority that issues TLS certificates for kubelet.
4.2.12	Ensure that the Kubelet only makes use of Strong Cryptographic Ciphers.	Level 1 - Worker Node	Fail. Optionally, MKE can be configured to support a list of compliant TLS ciphers.
4.2.13	Ensure that a limit is set on pod PIDs.	Level 1 - Worker Node	Pass

5 Policies¶

Section 5 details recommendations for various Kubernetes policies which are important to the security of the environment. Section 5 is broken out into six subsections, with 5.6 not in use:

5.1 RBAC and Service Accounts

Recommendation designation	Recommendation	Level	Result
5.1.1	Ensure that the `cluster-admin` role is only used where required.	Level 1 - Master Node	Pass
5.1.2	Minimize access to secrets.	Level 1 - Master Node	Pass
5.1.3	Minimize wildcard use in Roles and ClusterRoles.	Level 1 - Worker Node	Pass
5.1.4	Minimize access to create Pods.	Level 1 - Master Node	Pass
5.1.5	Ensure that default service accounts are not actively used.	Level 1 - Master Node	Pass. MKE installations are compliant starting with MKE 3.6.7. For customers upgrading from previous MKE versions, Mirantis offers a script that can be used to determine which service accounts are in violation and that offers an option for patching such accounts.
5.1.6	Ensure that Service Account Tokens are only mounted where necessary.	Level 1 - Master Node	Fail. MKE system service accounts set `automount` to `false` at the service account level and override the `automount` flag on the system Pods that require it. To provide core MKE functionality, the following Pods must mount their respective service account tokens: `calico-kube-controllers`, `calico-node`, `coredns`, `ucp-metrics`, and `ucp-node-feature-discovery`.
5.1.7	Avoid use of `system:masters` group.	Level 1 - Master Node	Pass
5.1.8	Limit use of the Bind, Impersonate and Escalate permissions in the Kubernetes cluster.	Level 1 - Master Node	Pass
5.1.9	Minimize access to create persistent volumes.	Level 1 - Master Node	Pass
5.1.10	Minimize access to the `proxy` sub-resource of nodes.	Level 1 - Master Node	Pass
5.1.11	Minimize access to the `approval` sub-resource of `certificatesigningrequests` objects.	Level 1 - Master Node	Pass
5.1.12	Minimize access to webhook configuration objects.	Level 1 - Master Node	Pass
5.1.13	Minimize access to the service account token creation.	Level 1 - Master Node	Pass

5.2 Pod Security Standards

Recommendation designation	Recommendation	Level	Result
5.2.1	Ensure that the cluster has at least one active policy control mechanism in place.	Level 1 - Master Node	Pass
5.2.2	Minimize the admission of privileged containers.	Level 1 - Master Node	Pass
5.2.3	Minimize the admission of containers wishing to share the host process ID namespace.	Level 1 - Master Node	Pass
5.2.4	Minimize the admission of containers wishing to share the host IPC namespace.	Level 1 - Master Node	Pass
5.2.5	Minimize the admission of containers wishing to share the host network namespace.	Level 1 - Master Node	Pass
5.2.6	Minimize the admission of containers with `allowPrivilegeEscalation`.	Level 1 - Master Node	Pass
5.2.7	Minimize the admission of root containers.	Level 2 - Master Node	Pass
5.2.8	Minimize the admission of containers with the NET_RAW capability.	Level 1 - Master Node	Pass. MKE control plane containers no longer use NET_RAW; however, policies must be added to restrict the NET_RAW capability for user workloads.
5.2.9	Minimize the admission of containers with added capabilities.	Level 1 - Master Node	Pass
5.2.10	Minimize the admission of containers with capabilities assigned.	Level 2 - Master Node	Pass
5.2.11	Minimize the admission of Windows HostProcess Containers.	Level 1 - Master Node	Pass
5.2.12	Minimize the admission of HostPath volumes.	Level 1 - Master Node	Pass
5.2.13	Minimize the admission of containers which use HostPorts.	Level 1 - Master Node	Pass
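The note on recommendation 5.2.8 leaves restricting NET_RAW for user workloads to the operator. As a rough aid for auditing manifests before writing such policies, the following Python sketch flags containers that do not drop the capability. It assumes the standard Kubernetes `securityContext.capabilities` fields and treats a container with no explicit `drop` entry as retaining NET_RAW, which is an assumption about the runtime default. The function name and the sample Pod are hypothetical, and the sketch is not an admission policy.

```python
def containers_allowing_net_raw(pod: dict) -> list:
    """List names of containers in a Pod manifest that may hold NET_RAW.

    A container is flagged when it neither drops NET_RAW nor drops ALL,
    or when it adds NET_RAW (or ALL) back explicitly.
    """
    spec = pod.get("spec", {})
    offenders = []
    for key in ("initContainers", "containers"):
        for container in spec.get(key) or []:
            caps = (container.get("securityContext") or {}).get("capabilities") or {}
            dropped = {c.upper() for c in caps.get("drop") or []}
            added = {c.upper() for c in caps.get("add") or []}
            drops_it = "NET_RAW" in dropped or "ALL" in dropped
            adds_it = "NET_RAW" in added or "ALL" in added
            if adds_it or not drops_it:
                offenders.append(container.get("name", "<unnamed>"))
    return offenders


# Hypothetical user workload, for illustration only.
example_pod = {
    "kind": "Pod",
    "spec": {
        "containers": [
            {"name": "web", "securityContext": {"capabilities": {"drop": ["ALL"]}}},
            {"name": "debug"},  # no securityContext: assumed to keep NET_RAW
            {
                "name": "ping",
                "securityContext": {"capabilities": {"drop": ["ALL"], "add": ["NET_RAW"]}},
            },
        ]
    },
}

print(containers_allowing_net_raw(example_pod))
```

A check of this kind only reports on manifests. Enforcing the restriction at admission time still requires a policy mechanism in the cluster.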

5.3 Pod Network Policies and CNI

Recommendation designation	Recommendation	Level	Result
5.3.1	Ensure that the CNI in use supports Network Policies.	Level 1 - Master Node	Pass
5.3.2	Ensure that all Namespaces have Network Policies defined.	Level 2 - Master Node	Pass

5.4 Secrets Management

Recommendation designation	Recommendation	Level	Result
5.4.1	Prefer using secrets as files over secrets as environment variables.	Level 2 - Master Node	Pass
5.4.2	Consider external secret storage.	Level 2 - Master Node	Pass

5.5 Extensible Admission Control

Recommendation designation	Recommendation	Level	Result
5.5.1	Configure Image Provenance using `ImagePolicyWebhook` admission controller.	Level 2 - Master Node	Pass

5.7 General Policies

Recommendation designation	Recommendation	Level	Result
5.7.1	Create administrative boundaries between resources using namespaces.	Level 1 - Master Node	Pass
5.7.2	Ensure that the `seccomp` profile is set to `docker/default` in your Pod definitions.	Level 2 - Master Node	Pass
5.7.3	Apply Security Context to Your Pods and Containers.	Level 2 - Master Node	Pass
5.7.4	The default namespace should not be used.	Level 2 - Master Node	Pass

Release Notes¶

Considerations

Upgrading MKE 3.6.0 - 3.6.4 to a later MKE version can result in ucp-pause containers not carrying forward to the later version.
A limitation in MKE 3.6.2 and MKE 3.6.3 can cause issues in clusters that deploy more than 120 nodes.

If you plan to run a cluster with more than 120 nodes, Mirantis strongly recommends that you upgrade to MKE 3.6.4. If, however, it is imperative that you run one of the affected MKE versions with 121+ nodes, contact Mirantis support to secure a workaround.
Because MKE 3.6.0 runs etcd 3.4.16, upgrading to it from MKE 3.5.6 or later (which runs etcd 3.5.5) will fail. Upgrade instead to MKE 3.6.1 or later.

The etcd component, by design, will not accept a downgrade of itself.
MKE 3.6.0 requires MCR 20.10.13 or later, which you must install or upgrade to prior to installing or upgrading to MKE 3.6.0.
Upgrading from one MKE minor version to another minor version can result in the downgrading of MKE middleware components. For more information, refer to the middleware versioning tables in the release notes of both the source and target MKE versions.
CentOS 8 entered EOL status on 31 December 2021. For this reason, Mirantis no longer supports CentOS 8 for any version of MKE. We encourage customers who are using CentOS 8 to migrate to one of the supported operating systems, as no further bug fixes will be forthcoming.
Beginning with MKE 3.6.0, custom log drivers are no longer supported, due to the transition from dockershim to cri-dockerd.
In MKE 3.6.1 - 3.6.7, performance issues may occur with both cri-dockerd and dockerd due to the manner in which cri-dockerd handles container and ImageFSInfo statistics.
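Several of the considerations above are version constraints that can be checked mechanically before an upgrade. The following Python sketch encodes three of them (the etcd downgrade on the path to 3.6.0, the MCR minimum for 3.6.0, and the 120-node limitation in 3.6.2 and 3.6.3). The function names and message wording are hypothetical, the sketch is not part of the MKE pre-upgrade checks, and it assumes plain dotted numeric version strings.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '3.6.4' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def upgrade_warnings(source_mke: str, target_mke: str, mcr: str, node_count: int) -> list:
    """Return warnings derived from the considerations listed above."""
    src = parse_version(source_mke)
    dst = parse_version(target_mke)
    mcr_version = parse_version(mcr)
    warnings = []

    # MKE 3.6.0 runs etcd 3.4.16; MKE 3.5.6 and later 3.5.x run etcd 3.5.5,
    # and etcd does not accept a downgrade of itself.
    if dst == (3, 6, 0) and (3, 5, 6) <= src < (3, 6, 0):
        warnings.append("etcd downgrade: upgrade to MKE 3.6.1 or later instead")

    # MKE 3.6.0 requires MCR 20.10.13 or later.
    if dst == (3, 6, 0) and mcr_version < (20, 10, 13):
        warnings.append("MCR too old: install MCR 20.10.13 or later first")

    # MKE 3.6.2 and 3.6.3 have a limitation with more than 120 nodes.
    if dst in {(3, 6, 2), (3, 6, 3)} and node_count > 120:
        warnings.append("more than 120 nodes: upgrade to MKE 3.6.4 instead")

    return warnings


for message in upgrade_warnings("3.5.7", "3.6.0", "20.10.12", node_count=50):
    print(message)
```

An empty list means only that none of these three conditions applies. The remaining considerations, such as the ucp-pause and cri-dockerd items, still need to be reviewed manually.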

MKE 3.6.16 (current)

Patch release for MKE 3.6 that focuses exclusively on bug resolution.

MKE 3.6.15

Patch release for MKE 3.6 that focuses exclusively on CVE mitigation.

MKE 3.6.14

Patch release for MKE 3.6 introducing the following key features:

Addition of Kubernetes log retention configuration parameters
Customizability of audit log policies
Inclusion of Docker events in MKE support bundle

MKE 3.6.13

Patch release for MKE 3.6 that focuses exclusively on CVE mitigation.

MKE 3.6.12

Patch release for MKE 3.6 introducing the following key features:

Kubernetes for GMSA now supported
Addition of ucp-cadvisor container level metrics component

MKE 3.6.11

Patch release for MKE 3.6 introducing the following key features:

Augmented validation for etcd storage quota
Improved handling of larger etcd instances
All errors now returned from pre-upgrade checks
Minimum Docker storage requirement now part of pre-upgrade checks

MKE 3.6.10 (discontinued)

MKE 3.6.10 was discontinued shortly after release due to issues encountered when upgrading to it from previous versions of the product.

MKE 3.6.9

Patch release for MKE 3.6 that focuses exclusively on CVE resolution.

MKE 3.6.8

Patch release for MKE 3.6 introducing the following key features:

Performance improvement to MKE image tagging API

MKE 3.6.7

Patch release for MKE 3.6 introducing the following key features:

Added ability to filter organizations by name in MKE web UI
Improved Kubernetes role creation error handling in MKE web UI
Increased SAML proxy feedback detail
Upgrade verifies that cluster nodes have minimum required MCR
kube-proxy now binds only to localhost
Enablement of read-only rootfs for specific containers
Added MKE web UI capability to add OS constraints to swarm services
Added ability to set support bundle collection windows
Added ability to set line limit of log files in support bundles
Addition of search function to Grants > Swarm in MKE web UI

MKE 3.6.6

Patch release for MKE 3.6 that focuses exclusively on the resolution of security vulnerabilities.

MKE 3.6.5

Patch release for MKE 3.6 introducing the following key features:

Enablement of read-only root filesystem for select MKE containers
Support bundles with custom options now carry custom preface
Enablement of stack traces collection with support bundles
Improved support for custom containerd root
Enablement of node type selection with support bundles
Type designation added to the MKE web UI swarm grants table
Addition of referral chasing LDAP parameter

MKE 3.6.4

Patch release for MKE 3.6 introducing the following key features:

Enablement of read-only root filesystem for select MKE containers
Addition of option to limit kernel capabilities in Interlock 3.3.10
Calico components metrics collection
Addition of SAML proxy configuration to auth settings in MKE web UI
Addition of option to disable LDAP referral URL chasing

MKE 3.6.3

Patch release for MKE 3.6 introducing the following key features:

Enablement of read-only root filesystem for select MKE containers
Health checks added to ucp-sf-notifier container
The ucp-kube-ingress-controller container now runs as non-root
The ucp-sf-notifier container now runs as non-root
The ucp-hardware-info container now runs as non-root
Kubernetes components now run as non-root
Delivery of container disk usage metric

MKE 3.6.2

Patch release for MKE 3.6 introducing the following key features:

Interlock update to 3.3.8
`--kube-protect-kernel-defaults` install option
`kube_api_server_auditing` configuration option
Configuration options for disabling profiling
`support` CLI command options for node support dumps
Configuration options for system hardening
MKE web UI Banner design update
etcd storage quota UI notification
Self ports no longer checked during upgrade (Linux only)

MKE 3.6.1

Patch release for MKE 3.6 introducing the following key features:

NVIDIA settings disablement
Support bundle API endpoint
Account Privileges page in MKE web UI
Improved Image Pruning section in the MKE web UI

MKE 3.6.0

Initial MKE 3.6.0 release introducing the following key features and enhancements:

GCP support
OPA Gatekeeper
Windows Server 2022 support
no-new-privileges
cri-dockerd-mke
Kubernetes 1.24.5
Calico 3.24.1
Interlock 3.3.7

Deprecation notes

A list of features deprecated in MKE 3.6.x.