In correlation with the end of life (EOL) date for MKE 3.3.x, Mirantis stopped maintaining this
documentation version as of 2022-05-27. The latest MKE product documentation
is available here.
This documentation provides information on how to deploy and operate
Mirantis Kubernetes Engine (MKE). It is intended to help operators
understand the core concepts of the product, and it provides sufficient
information to deploy and operate the solution.
The information in this documentation set is continually improved and
amended based on feedback and requests from MKE users.
Mirantis Kubernetes Engine (MKE, formerly Universal Control Plane or UCP)
is the industry-leading container orchestration platform for developing
and running modern applications at scale, on private clouds, public clouds,
and on bare metal.
MKE delivers immediate value to your business by allowing you to adopt modern
application development and delivery models that are cloud-first and
cloud-ready. With MKE you get a centralized place with a graphical UI to manage
and monitor your Kubernetes and/or Swarm cluster instance.
Your business benefits from using MKE as a container orchestration platform,
especially in the following use cases:
More than one container orchestrator
Whether your application requirements are complex and require medium
to large clusters or simple ones that can be deployed quickly on development
environments, MKE gives you a container orchestration choice.
Deploy Kubernetes, Swarm, or both types of clusters and manage them
on a single MKE instance or centrally manage your instance using
Mirantis Container Cloud.
Robust and scalable applications deployment
Monolithic applications are the older approach, whereas microservices are the
modern way to deploy an application at scale. Delivering applications through
an automated CI/CD pipeline can dramatically improve time-to-market
and service agility. Adopting microservices becomes a lot easier
when using Kubernetes and/or Swarm clusters to deploy and test
microservice-based applications.
Multi-tenant software offerings
Containerizing existing monolithic SaaS applications enables quicker
development cycles and automated continuous integration and deployment.
These applications, however, must allow multiple users to share a single
instance of a software application. MKE can operate multi-tenant
environments, isolate teams and organizations, separate cluster
resources, and so on.
The MKE Reference Architecture provides a technical overview of Mirantis
Kubernetes Engine (MKE). It is your source for the product hardware and
software specifications, standards, component information, and configuration
detail.
Mirantis Kubernetes Engine (MKE) allows you to adopt modern application
development and delivery models that are cloud-first and cloud-ready. With MKE
you get a centralized place with a graphical UI to manage and monitor your
Kubernetes and/or Swarm cluster instance.
The core MKE components are:
ucp-cluster-agent
Reconciles the cluster-wide state, which includes maintaining Kubernetes
add-ons such as Kubecompose and KubeDNS, managing replication configurations
for the etcd and RethinkDB clusters, and syncing the node inventories of
SwarmKit and Swarm Classic. This component is a single-replica service that
runs on any manager node in the cluster.
ucp-manager-agent
Reconciles the node-local state on manager nodes,
including the configuration of the local Docker daemon, local data volumes,
certificates, and local container components. Each manager node in the
cluster runs a task from this service.
ucp-worker-agent
Performs the same reconciliation operations as
ucp-manager-agent but on worker nodes. This component runs a task on
each worker node.
The following MKE component names differ based on the node’s operating
system:
Take careful note of the minimum and recommended hardware requirements for MKE
manager and worker nodes prior to deployment.
Note
High availability (HA) installations require transferring files between
hosts.
On manager nodes, MKE only supports the workloads it requires to run.
Windows container images are typically larger than Linux
container images. As such, provision more local storage for Windows
nodes and for any MSR repositories that store Windows container
images.
Manager nodes manage a swarm and persist the swarm state. Using several
containers per node, the ucp-manager-agent automatically deploys all
MKE components on manager nodes, including the MKE web UI and the data stores
that MKE uses.
The following table details the MKE services that run on manager
nodes:
A cluster-scoped Kubernetes controller used to coordinate Calico
networking. Runs on one manager node only.
k8s_calico-node
The Calico node agent, which coordinates networking fabric according
to the cluster-wide Calico configuration. Part of the calico-node
DaemonSet. Runs on all nodes. Configure the container network interface
(CNI) plugin using the --cni-installer-url flag. If this flag is not
set, MKE uses Calico as the default CNI plugin.
k8s_install-cni_calico-node
A container in which the Calico CNI plugin binaries are installed and
configured on each host. Part of the calico-node DaemonSet. Runs on
all nodes.
A dnsmasq instance used in the Kubernetes DNS Service. Part of the
kube-dns deployment. Runs on one manager node only.
k8s_ucp-kube-compose
A custom Kubernetes resource component that translates Compose files
into Kubernetes constructs. Part of the compose deployment. Runs on
one manager node only.
k8s_ucp-kube-dns
The main Kubernetes DNS Service, used by pods to resolve service names.
Part of the kube-dns deployment, a set of three containers deployed
through Kubernetes as a single pod. Provides service discovery for
Kubernetes services and pods. Runs on one manager node only.
k8s_ucp-kubedns-sidecar
A daemon of the Kubernetes DNS Service responsible for health checking
and metrics. Part of the kube-dns deployment. Runs on one manager
node only.
ucp-auth-api
The centralized service for identity and authentication used by MKE and
MSR.
ucp-auth-store
A container that stores authentication configurations and data for
users, organizations, and teams.
ucp-auth-worker
A container that performs scheduled LDAP synchronizations and cleans
authentication and authorization data.
ucp-client-root-ca
A certificate authority to sign client bundles.
ucp-cluster-root-ca
A certificate authority used for TLS communication between MKE
components.
ucp-controller
The MKE web server.
ucp-dsinfo
A Docker system script for collecting troubleshooting information.
Named ucp-dsinfo-win on Windows nodes.
ucp-hardware-info
A container for collecting disk/hardware information about the host.
ucp-interlock
A container that monitors Swarm workloads configured to use layer 7
routing. Only runs when you enable layer 7 routing.
ucp-interlock-proxy
A service that provides load balancing and proxying for Swarm workloads.
Only runs when you enable layer 7 routing.
ucp-kube-apiserver
A master component that serves the Kubernetes API. It persists its state
in etcd directly, and all other components communicate directly with
the API server. The Kubernetes API server is configured to encrypt
Secrets using AES-CBC with a 256-bit key. The encryption key is never
rotated, and the encryption key is stored on manager nodes, in a file
on disk.
ucp-kube-controller-manager
A master component that manages the desired state of controllers and
other Kubernetes objects. It monitors the API server and performs
background tasks when needed.
ucp-kubelet
The Kubernetes node agent running on every node, which is responsible
for running Kubernetes pods, reporting the health of the node, and
monitoring resource usage.
ucp-kube-proxy
The networking proxy running on every node, which enables pods to
contact Kubernetes services and other pods by way of cluster IP
addresses.
ucp-kube-scheduler
A master component that handles pod scheduling. It communicates with
the API server only to obtain workloads that need to be scheduled.
ucp-kv
A container used to store the MKE configurations. Do not use it in your
applications, as it is for internal use only. Also used by Kubernetes
components.
ucp-manager-agent
The agent that monitors the manager node and ensures that the right MKE
services are running.
ucp-metrics
A container used to collect and process metrics for a node, such as the
disk space available.
ucp-proxy
A TLS proxy that allows secure access from the local Mirantis Container
Runtime to MKE components.
ucp-reconcile
A container that converges the node to its desired state whenever the
ucp-manager-agent service detects that the node is not running the
correct MKE components. This container should remain in an exited state
when the node is healthy.
ucp-swarm-manager
A container used to provide backwards compatibility with Docker Swarm.
Worker nodes are instances of MCR that participate in a swarm for the purpose
of executing containers. Such nodes receive and execute tasks dispatched from
manager nodes. A swarm must have at least one manager node, because worker
nodes do not participate in the Raft distributed state, perform scheduling, or
serve the swarm mode HTTP API.
The following table details the MKE services that run on worker nodes.
A cluster-scoped Kubernetes controller used to coordinate Calico
networking. Runs on all nodes.
k8s_install-cni_calico-node
A container that installs the Calico CNI plugin
binaries and configuration on each host. Part of the calico-node
DaemonSet. Runs on all nodes.
k8s_POD_calico-node
The pause container for the calico-node pod. By
default, this container is hidden, but you can see it by running the
following command:
docker ps -a
ucp-interlock-extension
A helper service that reconfigures the ucp-interlock-proxy service,
based on the Swarm workloads that are running.
ucp-interlock-proxy
A service that provides load balancing and proxying for swarm
workloads. Only runs when you enable layer 7 routing.
ucp-dsinfo
A Docker system script for collecting information that assists with
troubleshooting. On Windows nodes the component name is
ucp-dsinfo-win.
ucp-hardware-info
A container for collecting disk and hardware information about the host.
ucp-kubelet
The Kubernetes node agent running on every node, which is responsible
for running Kubernetes pods, reporting the health of the node, and
monitoring resource usage.
ucp-kube-proxy
The networking proxy running on every node, which enables pods to
contact Kubernetes services and other pods through cluster IP
addresses.
ucp-reconcile
A container that converges the node to its desired state whenever the
ucp-worker-agent service detects that the node is not running the
correct MKE components. This container should remain in an exited
state when the node is healthy.
ucp-proxy
A TLS proxy that allows secure access from the local Mirantis Container
Runtime to MKE components.
ucp-worker-agent
A service that monitors the worker node and ensures that the correct MKE
services are running. The ucp-worker-agent service ensures that only
authorized users and other MKE services can run Docker commands on the
node. The ucp-worker-agent deploys a set of containers onto worker
nodes, which is a subset of the containers that ucp-manager-agent
deploys onto manager nodes. This component is named
ucp-worker-agent-win on Windows nodes.
Admission controllers are plugins that govern and enforce
cluster usage. There are two types of admission controllers:
default and custom. The tables below list the available admission controllers.
For more information, see
Kubernetes documentation: Using Admission Controllers.
Note
You cannot enable or disable custom admission controllers.
DefaultStorageClass
Adds a default storage class to PersistentVolumeClaim objects that
do not request a specific storage class.
DefaultTolerationSeconds
Sets the default forgiveness toleration for pods that do not already have a
toleration for the node.kubernetes.io/not-ready:NoExecute or
node.kubernetes.io/unreachable:NoExecute taints. The toleration periods come
from the default-not-ready-toleration-seconds and
default-unreachable-toleration-seconds Kubernetes API server input
parameters. The default value for both input parameters is five minutes.
LimitRanger
Ensures that incoming requests do not violate the constraints in a
namespace LimitRange object.
MutatingAdmissionWebhook
Calls any mutating webhooks that match the request.
NamespaceLifecycle
Ensures that users cannot create new objects in namespaces undergoing
termination and that MKE rejects requests in nonexistent namespaces.
It also prevents users from deleting the reserved default,
kube-system, and kube-public namespaces.
NodeRestriction
Limits the Node and Pod objects that a kubelet can modify.
PersistentVolumeLabel (deprecated)
Attaches region or zone labels automatically to PersistentVolumes as
defined by the cloud provider.
PodNodeSelector
Limits which node selectors can be used within a namespace by reading a
namespace annotation and a global configuration.
PodSecurityPolicy
Determines whether a new or modified pod should be admitted based on the
requested security context and the available Pod Security Policies.
ResourceQuota
Observes incoming requests and ensures they do not violate any of the
constraints in a namespace ResourceQuota object.
ServiceAccount
Implements automation for ServiceAccount resources.
ValidatingAdmissionWebhook
Calls any validating webhooks that match the request.
Annotates Docker Compose-on-Kubernetes Stack resources with
the identity of the user performing the request so that the Docker
Compose-on-Kubernetes resource controller can manage Stacks
with correct user authorization.
Detects the deleted ServiceAccount resources to correctly remove
them from the scheduling authorization back end of an MKE node.
Simplifies creation of the RoleBindings and
ClusterRoleBindings resources by automatically converting
user, organization, and team Subject names into their
corresponding unique identifiers.
Prevents users from deleting the built-in cluster-admin
ClusterRole or ClusterRoleBinding resources.
Prevents under-privileged users from creating or updating
PersistentVolume resources with host paths.
Works in conjunction with the built-in PodSecurityPolicy
admission controller to prevent under-privileged users from
creating Pods with privileged options. To grant non-administrators
and non-cluster-admins access to privileged attributes, refer to
Use admission controllers for access in the MKE Operations Guide.
CheckImageSigning
Enforces the MKE Docker Content Trust policy, which, if enabled, requires
that all pods use container images that have been digitally signed by
trusted and authorized users who are members of one or more teams in
MKE.
UCPNodeSelector
Adds a com.docker.ucp.orchestrator.kubernetes:* toleration to pods
in the kube-system namespace and removes the
com.docker.ucp.orchestrator.kubernetes tolerations from pods in
other namespaces. This ensures that user workloads do not run on
swarm-only nodes, which MKE taints with
com.docker.ucp.orchestrator.kubernetes:NoExecute. It also adds a
node affinity to prevent pods from running on manager nodes depending
on MKE settings.
Every Kubernetes Pod includes an empty pause container, which bootstraps the
Pod to establish all of the cgroups, reservations, and namespaces before its
individual containers are created. The pause container image is always present,
so the pod resource allocation happens instantaneously as containers are
created.
To display pause containers:
When using the client bundle, pause containers are hidden by default.
To display pause containers when using the client bundle:
docker ps -a | grep -I pause
To display pause containers when not using the client bundle:
Certificate and keys for the authentication and authorization
service.
ucp-auth-store-certs
Certificate and keys for the authentication and authorization
store.
ucp-auth-store-data
Data of the authentication and authorization store, replicated
across managers.
ucp-auth-worker-certs
Certificate and keys for authentication worker.
ucp-auth-worker-data
Data of the authentication worker.
ucp-client-root-ca
Root key material for the MKE root CA that issues client
certificates.
ucp-cluster-root-ca
Root key material for the MKE root CA that issues certificates
for swarm members.
ucp-controller-client-certs
Certificate and keys that the MKE web server uses to communicate
with other MKE components.
ucp-controller-server-certs
Certificate and keys for the MKE web server running in the node.
ucp-kv
MKE configuration data, replicated across managers.
ucp-kv-certs
Certificates and keys for the key-value store.
ucp-metrics-data
Monitoring data that MKE gathers.
ucp-node-certs
Certificate and keys for node communication.
ucp-backup
Backup artifacts that are created while processing a backup. The
artifacts persist on the volume for the duration of the backup and are
cleaned up when the backup completes, though the volume itself remains.
mke-containers
Symlinks to MKE component log files, created by ucp-agent.
You can customize the volume driver for the volumes by creating
the volumes prior to installing MKE. During installation, MKE determines
which volumes do not yet exist on the node and creates those volumes using the
default volume driver.
By default, MKE stores the data for these volumes at
/var/lib/docker/volumes/<volume-name>/_data.
You can interact with MKE either through the web UI or the CLI.
With the MKE web UI you can manage your swarm, grant and revoke user
permissions, and deploy, configure, manage, and monitor your applications.
In addition, MKE exposes the standard Docker API, so you can continue using
such existing tools as the Docker CLI client. As MKE secures your
cluster with RBAC, you must configure your Docker CLI client and
other client tools to authenticate your requests using client
certificates that you can download from your MKE profile page.
The MKE Installation Guide provides everything you need to install
and configure Mirantis Kubernetes Engine (MKE). The guide offers
detailed information, procedures, and examples that are specifically
designed to help DevOps engineers and administrators install and
configure the MKE container orchestration platform.
Before installing MKE, plan a single host name strategy to use consistently
throughout the cluster, keeping in mind that MKE and MCR both use host names.
There are two general strategies for creating host names: short host names and
fully qualified domain names (FQDN). Consider the following examples:
MCR uses three separate IP ranges for the docker0, docker_gwbridge, and
ucp-bridge interfaces. By default, MCR assigns the first available subnet
in default-address-pools (172.17.0.0/16) to docker0, the second
(172.18.0.0/16) to docker_gwbridge, and the third (172.19.0.0/16)
to ucp-bridge.
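The assignment order described above can be modeled with a short sketch. It assumes a simplified allocator that splits each pool into subnets of the configured size and hands them out in order. The function names are illustrative and are not part of MCR, and only the first four default pools are listed.

```python
import ipaddress
from itertools import islice

# First four of the default pools listed for default-address-pools.
DEFAULT_POOLS = [
    {"base": "172.17.0.0/16", "size": 16},
    {"base": "172.18.0.0/16", "size": 16},
    {"base": "172.19.0.0/16", "size": 16},
    {"base": "172.20.0.0/16", "size": 16},
]


def candidate_subnets(pools):
    """Yield subnets in pool order, splitting each base into blocks of `size`."""
    for pool in pools:
        base = ipaddress.ip_network(pool["base"])
        yield from base.subnets(new_prefix=pool["size"])


def first_bridge_subnets(pools):
    """Pair the three bridges with the first three candidate subnets.

    This is a simplified model of the allocation order, not MCR code.
    """
    names = ["docker0", "docker_gwbridge", "ucp-bridge"]
    return dict(zip(names, islice(candidate_subnets(pools), 3)))


for name, subnet in first_bridge_subnets(DEFAULT_POOLS).items():
    print(name, subnet)
```

With the default pools, the sketch maps docker0, docker_gwbridge, and ucp-bridge to 172.17.0.0/16, 172.18.0.0/16, and 172.19.0.0/16 respectively.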
Note
The ucp-bridge bridge network specifically supports MKE component
containers.
You can reassign the docker0, docker_gwbridge, and ucp-bridge
subnets in default-address-pools. To do so, replace the relevant values in
default-address-pools in the /etc/docker/daemon.json file, making sure
that the setting includes at least three IP pools. Be aware that you must
restart the docker.service to activate your daemon.json file edits.
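Before restarting the service, it can help to sanity-check the edited file. The following is a minimal sketch, assuming the file content is available as a string. The function name and the specific checks are illustrative and are not an MCR tool.

```python
import ipaddress
import json


def check_address_pools(daemon_json_text):
    """Return a list of problems found in the default-address-pools setting."""
    try:
        config = json.loads(daemon_json_text)
    except json.JSONDecodeError as err:
        return [f"daemon.json is not valid JSON: {err.msg}"]

    pools = config.get("default-address-pools", [])
    problems = []
    # The setting must include at least three IP pools.
    if len(pools) < 3:
        problems.append(f"expected at least 3 pools, found {len(pools)}")
    for pool in pools:
        base = ipaddress.ip_network(pool["base"])
        # A subnet cannot be larger than the base range it is carved from.
        if pool["size"] < base.prefixlen:
            problems.append(f"size {pool['size']} is larger than base {base}")
    return problems


sample = json.dumps({"default-address-pools": [
    {"base": "172.17.0.0/16", "size": 16},
    {"base": "172.18.0.0/16", "size": 16},
    {"base": "172.19.0.0/16", "size": 16},
]})
print(check_address_pools(sample))
```

An empty list means that none of the checked problems were found. It does not guarantee that MCR accepts the file.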
By default, default-address-pools contains the following values:
The list of CIDR ranges used to allocate subnets for local bridge
networks.
base
The CIDR range allocated for bridge networks in each IP address pool.
size
The CIDR netmask that determines the subnet size to allocate from the
base pool. If the size matches the netmask of the base,
then the pool contains one subnet. For example,
{"base":"172.17.0.0/16","size":16} creates the subnet:
172.17.0.0/16 (172.17.0.1 - 172.17.255.255).
For example, {"base":"192.168.0.0/16","size":20} allocates
/20 subnets from 192.168.0.0/16, including the following subnets for
bridge networks:
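The subnets in this example can be enumerated with the Python standard library. This is a minimal sketch for illustration only, and the helper name is hypothetical.

```python
import ipaddress


def bridge_subnets(base, size):
    """Split the base CIDR range into subnets with the given prefix length."""
    return list(ipaddress.ip_network(base).subnets(new_prefix=size))


subnets = bridge_subnets("192.168.0.0/16", 20)
print(len(subnets), "subnets")
for subnet in subnets[:3]:
    # Print each subnet with its first and last address.
    print(subnet, f"({subnet[1]} - {subnet[-1]})")
```

A /16 base divided into /20 blocks yields 16 subnets, the first of which is 192.168.0.0/20.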
MCR creates and configures the host system with the docker0 virtual network
interface, an ethernet bridge through which all traffic between MCR
and the container moves. MCR uses docker0 to handle all container
routing. You can specify an alternative network interface when you start the
container.
MCR allocates IP addresses from the docker0 configurable IP range to the
containers that connect to docker0. The default IP range, or subnet, for
docker0 is 172.17.0.0/16.
You can change the docker0 subnet in /etc/docker/daemon.json using the
settings in the following table. Be aware that you must restart the
docker.service to activate your daemon.json file edits.
Parameter
Description
default-address-pools
Modify the first pool in default-address-pools.
Caution
By default, MCR assigns the second pool to docker_gwbridge. If you
modify the first pool such that the size does not match the base
netmask, it can affect docker_gwbridge.
{
  "default-address-pools": [
    {"base":"172.17.0.0/16","size":16},  <-- Modify this value
    {"base":"172.18.0.0/16","size":16},
    {"base":"172.19.0.0/16","size":16},
    {"base":"172.20.0.0/16","size":16},
    {"base":"172.21.0.0/16","size":16},
    {"base":"172.22.0.0/16","size":16},
    {"base":"172.23.0.0/16","size":16},
    {"base":"172.24.0.0/16","size":16},
    {"base":"172.25.0.0/16","size":16},
    {"base":"172.26.0.0/16","size":16},
    {"base":"172.27.0.0/16","size":16},
    {"base":"172.28.0.0/16","size":16},
    {"base":"172.29.0.0/16","size":16},
    {"base":"172.30.0.0/16","size":16},
    {"base":"192.168.0.0/16","size":20}
  ]
}
fixed-cidr
Configures a CIDR range.
Customize the subnet for docker0 using standard CIDR notation.
The default subnet is 172.17.0.0/16, the network gateway is
172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254
for your containers.
{"fixed-cidr":"172.17.0.0/16"}
bip
Configures a gateway IP address and CIDR netmask of the docker0
network.
Customize the subnet for docker0 using the
<gatewayIP>/<CIDRnetmask> notation.
The default subnet is 172.17.0.0/16, the network gateway is
172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254
for your containers.
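The relationship between a bip value, the gateway, and the container address range can be worked out as follows. This is a sketch for illustration. The value 172.17.0.1/16 is assembled from the defaults stated above, and the helper name is hypothetical.

```python
import ipaddress


def describe_bip(bip):
    """Break a <gatewayIP>/<CIDRnetmask> value into its parts."""
    iface = ipaddress.ip_interface(bip)
    network = iface.network
    return {
        "gateway": str(iface.ip),
        "network": str(network),
        # First and last usable host addresses in the subnet. Containers
        # receive the usable addresses other than the gateway.
        "first_host": str(network[1]),
        "last_host": str(network[-2]),
    }


print(describe_bip("172.17.0.1/16"))
```

For 172.17.0.1/16 the gateway occupies the first usable address, which leaves 172.17.0.2 through 172.17.255.254 for containers.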
The docker_gwbridge is a virtual network interface that connects
overlay networks (including ingress) to individual MCR container networks.
Initializing a Docker swarm or joining a Docker host to a swarm automatically
creates docker_gwbridge in the kernel of the Docker host. The default
docker_gwbridge subnet (172.18.0.0/16) is the second available subnet
in default-address-pools.
To change the docker_gwbridge subnet, open daemon.json and modify the
second pool in default-address-pools:
{
  "default-address-pools": [
    {"base":"172.17.0.0/16","size":16},
    {"base":"172.18.0.0/16","size":16},  <-- Modify this value
    {"base":"172.19.0.0/16","size":16},
    {"base":"172.20.0.0/16","size":16},
    {"base":"172.21.0.0/16","size":16},
    {"base":"172.22.0.0/16","size":16},
    {"base":"172.23.0.0/16","size":16},
    {"base":"172.24.0.0/16","size":16},
    {"base":"172.25.0.0/16","size":16},
    {"base":"172.26.0.0/16","size":16},
    {"base":"172.27.0.0/16","size":16},
    {"base":"172.28.0.0/16","size":16},
    {"base":"172.29.0.0/16","size":16},
    {"base":"172.30.0.0/16","size":16},
    {"base":"192.168.0.0/16","size":20}
  ]
}
Caution
Modifying the first pool to customize the docker0 subnet can affect
the default docker_gwbridge subnet. Refer to
docker0 for more information.
You can only customize the docker_gwbridge settings before you join
the host to the swarm or after temporarily removing it.
The default address pool that Docker Swarm uses for its overlay network is
10.0.0.0/8. If this pool conflicts with your current network
implementation, you must use a custom IP address pool. Prior to installing MKE,
specify your custom address pool using the --default-addr-pool
option when initializing swarm.
Note
The Swarm default-addr-pool and MCR default-address-pools settings
define two separate IP address ranges used for different purposes.
Kubernetes uses two internal IP ranges, either of
which can overlap and conflict with the underlying infrastructure, thus
requiring custom IP ranges.
The pod network
Either the Calico or the Azure IPAM service gives each Kubernetes pod
an IP address in the default 192.168.0.0/16 range. To customize this
range, during MKE installation, use the --pod-cidr flag with the
ucp install command.
The services network
You can access Kubernetes services with a VIP in the default 10.96.0.0/16
Cluster IP range. To customize this range, during MKE installation, use
the --service-cluster-ip-range flag with the ucp install
command.
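One way to decide whether custom ranges are needed is to compare the default ranges named above with the networks already in use. The sketch below assumes you supply your infrastructure CIDRs yourself. The function name and the sample networks are hypothetical.

```python
import ipaddress

# Default ranges stated in this section, keyed by the option that customizes them.
CLUSTER_RANGES = {
    "swarm overlay (--default-addr-pool)": "10.0.0.0/8",
    "Kubernetes pod network (--pod-cidr)": "192.168.0.0/16",
    "Kubernetes services (--service-cluster-ip-range)": "10.96.0.0/16",
}


def find_conflicts(infrastructure_cidrs, cluster_ranges=CLUSTER_RANGES):
    """Return (infrastructure network, cluster range name) pairs that overlap."""
    conflicts = []
    for infra in map(ipaddress.ip_network, infrastructure_cidrs):
        for name, cidr in cluster_ranges.items():
            if infra.overlaps(ipaddress.ip_network(cidr)):
                conflicts.append((str(infra), name))
    return conflicts


# Hypothetical infrastructure networks, for illustration only.
print(find_conflicts(["10.1.0.0/16", "172.31.0.0/16"]))
```

Each reported pair names the option to use when choosing a replacement range.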
The storage path for persisted data such as images, volumes, and cluster state
is the Docker data root (data-root in /etc/docker/daemon.json).
MKE clusters require that all nodes have the same Docker data-root for the
Kubernetes network to function correctly. In addition, if the data-root is
changed on all nodes, you must recreate the Kubernetes network configuration in
MKE by running the following commands:
MKE currently does not support no-new-privileges:true in the
/etc/docker/daemon.json file, as this causes several MKE components to
enter a failed state.
A well-configured network is essential for the proper functioning of your MKE
deployment. Pay particular attention to such key factors as IP address
provisioning, port management, and traffic enablement.
When installing MKE on a host, you need to open specific ports to
incoming traffic. Each port listens for incoming traffic from a particular set
of hosts, known as the port scope.
MKE uses the following scopes:
Scope
Description
External
Traffic arrives from outside the cluster through end-user interaction.
Internal
Traffic arrives from other hosts in the same cluster.
Self
Traffic arrives to that port only from processes on the same host.
Open the following ports for incoming traffic on each host type:
Hosts
Port
Scope
Purpose
Managers, workers
TCP 179
Internal
BGP peers, used for Kubernetes networking
Managers
TCP 443 (configurable)
External, internal
MKE web UI and API
Managers
TCP 2376 (configurable)
Internal
Docker swarm manager, used for backwards compatibility
Managers
TCP 2377 (configurable)
Internal
Control communication between swarm nodes
Managers, workers
UDP 4789
Internal
Overlay networking
Managers
TCP 6443 (configurable)
External, internal
Kubernetes API server endpoint
Managers, workers
TCP 6444
Self
Kubernetes API reverse proxy
Managers, workers
TCP, UDP 7946
Internal
Gossip-based clustering
Managers, workers
TCP 9091
Self
Felix Prometheus calico-node metrics
Managers
TCP 9094
Self
Felix Prometheus kube-controller metrics
Managers, workers
TCP 9099
Self
Calico health check
Managers, workers
TCP 10250
Internal
Kubelet
Managers, workers
TCP 12376
Internal
TLS authentication proxy that provides access to MCR
Managers, workers
TCP 12378
Self
etcd reverse proxy
Managers
TCP 12379
Internal
etcd Control API
Managers
TCP 12380
Internal
etcd Peer API
Managers
TCP 12381
Internal
MKE cluster certificate authority
Managers
TCP 12382
Internal
MKE client certificate authority
Managers
TCP 12383
Internal
Authentication storage back end
Managers
TCP 12384
Internal
Authentication storage back end for replication across
managers
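The internal-scope TCP entries in the table can be spot-checked from another host in the cluster. The following is a minimal sketch using only the Python standard library. It covers a handful of the ports that run on both managers and workers, it does not test UDP ports, and self-scope ports can only be checked from the node itself. The function names are illustrative.

```python
import socket

# Internal-scope TCP ports from the table that managers and workers both expose.
ALL_NODE_TCP_PORTS = {
    179: "BGP peers",
    7946: "Gossip-based clustering",
    10250: "Kubelet",
    12376: "TLS authentication proxy",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_ports(host, ports=ALL_NODE_TCP_PORTS):
    """Return the subset of ports that did not accept a connection."""
    return {port: purpose for port, purpose in ports.items()
            if not tcp_port_open(host, port)}
```

Calling closed_ports with the address of a node returns the entries that could not be reached, which points to the firewall rules to review.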
Calico is the default networking plugin for MKE. The default Calico
encapsulation setting for MKE is VXLAN; however, the plugin also supports
IP-in-IP encapsulation. Refer to the Calico documentation on
Overlay networking
for more information.
Important
NetworkManager can impair the Calico agent routing function. To resolve
this issue, you must create a file called
/etc/NetworkManager/conf.d/calico.conf with the following content:
Avoid firewall conflicts in the following Linux distributions:
Linux distribution
Procedure
SLES 12 SP2
Installations have the FW_LO_NOTRACK flag turned on by default in
the openSUSE firewall. It speeds up packet processing on the
loopback interface but breaks certain firewall setups that redirect
outgoing packets via custom rules on the local machine.
To turn off the FW_LO_NOTRACK option:
In /etc/sysconfig/SuSEfirewall2, set FW_LO_NOTRACK="no".
Either restart the firewall or reboot the system.
SLES 12 SP3
No change is required, as installations have the FW_LO_NOTRACK flag
turned off by default.
SLES 15 SP3 or RHEL 8, when running MCR 19.03.x
Configure the FirewallBackend option:
Verify that firewalld is running.
In /etc/firewalld/firewalld.conf, change
FirewallBackend=nftables to FirewallBackend=iptables.
Before performing SUSE Linux Enterprise Server (SLES) installations, consider
the following prerequisite steps:
For SLES 15 installations, disable CLOUD_NETCONFIG_MANAGE prior to
installing MKE:
Set CLOUD_NETCONFIG_MANAGE="no" in the
/etc/sysconfig/network/ifcfg-eth0 network interface configuration
file.
Run the service network restart command.
By default, SLES disables connection tracking. To allow
Kubernetes controllers in Calico to reach the Kubernetes API server, enable
connection tracking on the loopback interface for SLES by running the
following commands for each node in the cluster:
Configure all containers in an MKE cluster to regularly synchronize with a
Network Time Protocol (NTP) server. This ensures consistency between every
container in the cluster and avoids unexpected behavior that can lead to poor
performance.
Though MKE does not include a load balancer, you can configure your own to
balance user requests across all manager nodes. Before that, decide whether you
will add nodes to the load balancer using their IP address or their fully
qualified domain name (FQDN), and then use that strategy consistently
throughout the cluster. Take note of all IP addresses or FQDNs before you start
the installation.
If you plan to deploy both MKE and MSR, your load balancer must be able to
differentiate between the two, either by IP address or by port number. Because
both MKE and MSR use port 443 by default, your options are as follows:
Configure your load balancer to expose either MKE or MSR on a port other than
443.
Configure your load balancer to listen on port 443 with separate virtual IP
addresses for MKE and MSR.
Configure separate load balancers for MKE and MSR, both listening on port
443.
If you want to install MKE in a high-availability configuration with a load
balancer in front of your MKE controllers, include the
appropriate IP address and FQDN for the load balancer VIP. To do so, use one
or more --san flags either with the ucp install command or in
interactive mode when MKE requests additional SANs.
MKE supports the setting of values for all IPVS related parameters that are
exposed by kube-proxy.
The kube-proxy component runs on each cluster node, where it load-balances
traffic destined for services (through cluster IPs and node ports) to the correct
backend pods. Of the modes in which kube-proxy can run, IPVS (IP Virtual
Server) offers the widest choice of load balancing algorithms and superior
scalability.
You can only enable IPVS for MKE at installation, and it persists throughout
the life of the cluster. Thus, you cannot switch to iptables at a
later stage or switch over existing MKE clusters to use IPVS proxier.
Use the --existing-config parameter when installing MKE. You can also
change these values post-install using the MKE ucp/config-toml
endpoint.
Caution
If you are using MKE 3.3.x with IPVS proxier and plan to upgrade to MKE
3.4.x, you must upgrade to MKE 3.4.3 or later as earlier versions of MKE
3.4.x do not support IPVS proxier.
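The version rule in the caution above can be expressed as a small check. This is an illustrative sketch only. The function names are hypothetical, and the check assumes plain numeric versions such as 3.4.3.

```python
MIN_IPVS_UPGRADE_TARGET = (3, 4, 3)


def parse_version(text):
    """Turn a dotted version string such as '3.4.10' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def ipvs_upgrade_target_ok(target):
    """Return True if the target version is 3.4.3 or later."""
    return parse_version(target) >= MIN_IPVS_UPGRADE_TARGET


for candidate in ("3.4.2", "3.4.3", "3.4.10"):
    print(candidate, ipvs_upgrade_target_ok(candidate))
```

Comparing integer tuples rather than strings keeps 3.4.10 ordered after 3.4.3.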
You can customize MKE to use certificates signed by an External
Certificate Authority (ECA). When using your own certificates,
include a certificate bundle with the following:
ca.pem file with the root CA public certificate.
cert.pem file with the server certificate and any intermediate CA
public certificates. This certificate should also have Subject Alternative
Names (SANs) for all addresses used to reach the MKE manager.
key.pem file with a server private key.
You can either use separate certificates for every manager node or one
certificate for all managers. If you use separate certificates, you must use a
common SAN throughout. For example, MKE permits the following on a three-node
cluster:
node1.company.example.org with the SAN mke.company.org
node2.company.example.org with the SAN mke.company.org
node3.company.example.org with the SAN mke.company.org
If you use a single certificate for all manager nodes, MKE automatically copies
the certificate files both to new manager nodes and to those promoted to a
manager role.
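The common SAN requirement for separate certificates can be checked with a short sketch such as the following. It assumes that you have already extracted the SAN list of each manager certificate into a Python dictionary. The function name and data layout are hypothetical and are not part of MKE.

```python
from typing import Dict, Set


def common_sans(cert_sans: Dict[str, Set[str]]) -> Set[str]:
    """Return the SANs that appear on every manager certificate."""
    san_sets = list(cert_sans.values())
    if not san_sets:
        return set()
    return set.intersection(*san_sets)


if __name__ == "__main__":
    # The three-node example from the text above.
    managers = {
        "node1.company.example.org": {"mke.company.org"},
        "node2.company.example.org": {"mke.company.org"},
        "node3.company.example.org": {"mke.company.org"},
    }
    shared = common_sans(managers)
    print("common SANs:", sorted(shared) if shared else "none")
```

An empty result indicates that the certificates do not share a SAN and therefore do not meet the requirement.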
Skip this step if you want to use the default named volumes.
MKE uses named volumes to persist data. If you want to customize
the drivers that manage such volumes, create the volumes
before installing MKE. During the installation process, the installer
will automatically detect the existing volumes and start using them.
Otherwise, MKE will create the default named volumes.
MKE uses the kernel parameters detailed here. The information is presented in
tables that are organized by parameter prefix, offering both the default
parameter values and the values as they are set following MKE installation.
Note
The MKE parameter values are not set by MKE, but by either MCR or
an upstream component.
Sets whether arptables rules apply to bridged network traffic.
If the bridge module is not loaded, and thus no bridges are present,
this key is not present.
call-ip6tables
Default: No default
MKE:1
Sets whether ip6tables rules apply to bridged network traffic.
If the bridge module is not loaded, and thus no bridges are present,
this key is not present.
call-iptables
Default: No default
MKE:1
Sets whether iptables rules apply to bridged network traffic.
If the bridge module is not loaded, and thus no bridges are present,
this key is not present.
filter-pppoe-tagged
Default: No default
MKE:0
Sets whether netfilter rules apply to bridged PPPOE network
traffic. If the bridge module is not loaded, and thus no bridges are
present, this key is not present.
filter-vlan-tagged
Default: No default
MKE:0
Sets whether netfilter rules apply to bridged VLAN network traffic. If
the bridge module is not loaded, and thus no bridges are present, this
key is not present.
pass-vlan-input-dev
Default: No default
MKE:0
Sets whether netfilter strips the incoming VLAN interface name from
bridged traffic. If the bridge module is not loaded, and thus no bridges
are present, this key is not present.
The *.vs.* default values persist, changing only when the ipvs
kernel module has not been previously loaded. For more information, refer
to the Linux kernel documentation.
Parameter
Values
Description
conf.all.accept_redirects
Default:1
MKE:0
Sets whether ICMP redirects are permitted. This key affects all
interfaces.
conf.all.forwarding
Default:0
MKE:1
Sets whether network traffic is forwarded. This key affects all
interfaces.
conf.all.route_localnet
Default:0
MKE:1
Sets whether 127/8 addresses can be used for local routing. This key
affects all interfaces.
conf.default.forwarding
Default:0
MKE:1
Sets whether network traffic is forwarded. This key
affects new interfaces.
conf.lo.forwarding
Default:0
MKE:1
Sets forwarding for localhost traffic.
ip_forward
Default:0
MKE:1
Sets whether traffic forwards between interfaces. For Kubernetes to run,
this parameter must be set to 1.
vs.am_droprate
Default:10
MKE:10
Sets the always mode drop rate used in mode 3 of the drop_rate
defense.
vs.amemthresh
Default:1024
MKE:1024
Sets the available memory threshold in pages, which is used in the
automatic modes of defense. When there is not enough available memory,
the respective defense strategy is enabled and its variable is set to 2.
Otherwise, the strategy is disabled and its variable is set to 1.
vs.backup_only
Default:0
MKE:0
Sets whether the director function is disabled while the server is in
back-up mode, to avoid packet loops for DR/TUN methods.
vs.cache_bypass
Default:0
MKE:0
Sets whether packets forward directly to the original destination when
no cache server is available and the destination address is not local
(iph->daddr is RTN_UNICAST). This mostly applies to transparent web
cache clusters.
vs.conn_reuse_mode
Default:1
MKE:1
Sets how IPVS handles connections detected on port reuse. It is a
bitmap with the following values:
0 disables any special handling on port reuse. The new
connection is delivered to the same real server that was servicing the
previous connection, effectively disabling expire_nodest_conn.
bit 1 enables rescheduling of new connections when it is safe.
That is, whenever expire_nodest_conn is set and, for TCP sockets, when
the connection is in TIME_WAIT state (which is only possible if
you use NAT mode).
bit 2 is bit 1 plus, for TCP connections, when connections
are in FIN_WAIT state, as this is the last state seen by the load
balancer in Direct Routing mode. This bit helps when adding new
real servers to a very busy cluster.
vs.conntrack
Default:0
MKE:0
Sets whether connection-tracking entries are maintained for connections
handled by IPVS. Enable if connections handled by IPVS
are to be subject to stateful firewall rules. That is, iptables
rules that make use of connection tracking. Otherwise, disable this
setting to optimize performance. Connections handled by
the IPVS FTP application module have connection tracking entries
regardless of this setting, which is only available when IPVS is
compiled with CONFIG_IP_VS_NFCT enabled.
vs.drop_entry
Default:0
MKE:0
Sets whether entries are randomly dropped in the connection hash table,
to collect memory back for new connections. In the current
code, the drop_entry procedure can be activated every second, then
it randomly scans 1/32 of the whole table and drops entries that are in the
SYN-RECV/SYNACK state, which should be effective against SYN-flooding
attacks.
The valid values of drop_entry are 0 to 3, where 0 indicates
that the strategy is always disabled, 1 and 2 indicate automatic
modes (when there is not enough available memory, the strategy
is enabled and the variable is automatically set to 2,
otherwise the strategy is disabled and the variable is set to
1), and 3 indicates that the strategy is always enabled.
vs.drop_packet
Default:0
MKE:0
Sets whether 1/rate of the packets are dropped prior to being forwarded to
real servers. A rate of 1 drops all incoming packets.
The value definition is the same as that for drop_entry. In
automatic mode, the following formula determines the rate:
rate = amemthresh / (amemthresh - available_memory) when available
memory is less than the available memory threshold. When mode 3 is
set, the always mode drop rate is controlled by
/proc/sys/net/ipv4/vs/am_droprate.
vs.expire_nodest_conn
Default:0
MKE:0
Sets how the load balancer handles packets whose destination server is not
available. When disabled (0), the load balancer silently drops such packets.
This can be useful when the user-space monitoring program deletes the
destination server (due to server overload or wrong detection) and later adds
the server back, so that the connections to the server can continue.
If this feature is enabled, the load balancer terminates the connection
immediately whenever a packet arrives and its destination server is not
available, after which the client program will be notified that the
connection is closed. This is equivalent to the feature that is
sometimes required to flush connections when the destination is not
available.
vs.ignore_tunneled
Default:0
MKE:0
Sets whether IPVS configures the ipvs_property on all packets of
unrecognized protocols. This prevents IPVS from routing tunneled
protocols such as IPIP, which is useful in preventing the rescheduling of
packets that have been tunneled to the IPVS host (that is, to prevent
IPVS routing loops when IPVS is also acting as a real server).
vs.nat_icmp_send
Default:0
MKE:0
Sets whether ICMP error messages (ICMP_DEST_UNREACH) are sent for
VS/NAT when the load balancer receives packets from real servers but the
connection entries do not exist.
vs.pmtu_disc
Default:0
MKE:0
Sets whether all DF packets that exceed the PMTU are rejected with
FRAG_NEEDED, irrespective of the forwarding method. For the TUN
method, the flag can be disabled to fragment such packets.
vs.schedule_icmp
Default:0
MKE:0
Sets whether scheduling ICMP packets in IPVS is enabled.
vs.secure_tcp
Default:0
MKE:0
Sets the use of a more complicated TCP state transition table.
For VS/NAT, the secure_tcp defense delays entering the
TCP ESTABLISHED state until the three-way handshake completes. The
value definition is the same as that of drop_entry and
drop_packet.
vs.sloppy_sctp
Default:0
MKE:0
Sets whether IPVS is permitted to create a connection state on any
packet, rather than an SCTP INIT only.
vs.sloppy_tcp
Default:0
MKE:0
Sets whether IPVS is permitted to create a connection state on any
packet, rather than a TCP SYN only.
vs.snat_reroute
Default:0
MKE:1
Sets whether the route of SNATed packets from real servers is recalculated
so that they are routed as if they originate from the director. If disabled,
SNATed packets are routed as if they have been forwarded by the director.
If policy routing is in effect, a packet originating from the director may
be routed differently from a packet being forwarded by the director.
If policy routing is not in effect, then the recalculated route will
always be the same as the original route. In that case, it is an optimization
to disable snat_reroute and avoid the recalculation.
vs.sync_persist_mode
Default:0
MKE:0
Sets the synchronization of connections when using persistence. The
possible values are defined as follows:
0 means all types of connections are synchronized.
1 attempts to reduce the synchronization traffic depending on
the connection type. For persistent services, normal connections are
not synchronized; only persistence templates are.
In this case, TCP and SCTP may require enabling the sloppy_tcp and
sloppy_sctp flags on back-up servers. For non-persistent services,
this optimization is not applied and mode 0 is assumed.
vs.sync_ports
Default:1
MKE:1
Sets the number of threads that the master and back-up servers can use
for sync traffic. Every thread uses a single UDP port, thread 0 uses the
default port 8848, and the last thread uses port
8848+sync_ports-1.
vs.sync_qlen_max
Default: Calculated
MKE: Calculated
Sets a hard limit for queued sync messages that are not yet sent. It
defaults to 1/32 of the memory pages but actually represents a
number of messages. This protects against allocating large
parts of memory when the sending rate is lower than the queuing
rate.
vs.sync_refresh_period
Default:0
MKE:0
Sets (in seconds) the difference in the reported connection timer that
triggers new sync messages. It can be used to avoid sync messages for
the specified period (or half of the connection timeout if it is lower)
if the connection state has not changed since last sync.
This is useful for normal connections with high traffic, to reduce
sync rate. Additionally, retry sync_retries times with period of
sync_refresh_period/8.
vs.sync_retries
Default:0
MKE:0
Sets sync retries with period of sync_refresh_period/8. Useful
to protect against loss of sync messages. The range of the
sync_retries is 0 to 3.
vs.sync_sock_size
Default:0
MKE:0
Sets the configuration of SNDBUF (master) or RCVBUF (slave) socket
limit. Default value is 0 (preserve system defaults).
vs.sync_threshold
Default: 3 50
MKE: 3 50
Sets the synchronization threshold, which is the minimum number of
incoming packets that a connection must receive before the
connection is synchronized. The value holds two integers, sync_threshold
followed by sync_period. A connection is synchronized every time
the number of its incoming packets modulo sync_period equals the
threshold. The range of the threshold is 0 to sync_period.
When sync_period and sync_refresh_period are 0, sync messages are sent only
for state changes or only once when the packet count matches sync_threshold.
vs.sync_version
Default:1
MKE:1
Sets the version of the synchronization protocol to use when sending
synchronization messages. The possible values are:
0 selects the original synchronization protocol (version 0). This
should be used when sending synchronization messages to a legacy
system that only understands the original synchronization protocol.
1 selects the current synchronization protocol (version 1). This
should be used whenever possible.
Kernels with this sync_version entry are able to receive messages
of both version 1 and version 2 of the synchronization protocol.
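To make two of the IPVS descriptions above concrete, the following sketch evaluates the automatic-mode drop_packet formula and lists the UDP ports implied by sync_ports. It is an illustration only. The function names are hypothetical, and the use of integer division is an assumption rather than a statement of the kernel implementation.

```python
def drop_rate(amemthresh: int, available_memory: int) -> int:
    """rate = amemthresh / (amemthresh - available_memory), in pages."""
    if available_memory >= amemthresh:
        # The formula applies only when memory is below the threshold.
        raise ValueError("available memory is not below amemthresh")
    # Assumption: integer arithmetic, so the result is a whole number.
    return amemthresh // (amemthresh - available_memory)


def sync_udp_ports(sync_ports: int, base_port: int = 8848) -> range:
    """Ports used by sync threads 0 .. sync_ports-1."""
    return range(base_port, base_port + sync_ports)


if __name__ == "__main__":
    # With the default threshold of 1024 pages and 512 pages available,
    # the rate is 2, so 1/2 of the packets are dropped.
    print(drop_rate(1024, 512))
    # With sync_ports=4, the last thread uses port 8848+4-1.
    print(list(sync_udp_ports(4)))
```

Note that the rate approaches 1, meaning that all incoming packets are dropped, as available memory approaches zero.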
The net.netfilter.nf_conntrack_<subtree> default values persist,
changing only when the nf_conntrack kernel module has not been
previously loaded. For more information, refer to the
Linux kernel documentation.
Parameter
Values
Description
acct
Default:0
MKE:0
Sets whether connection-tracking flow accounting is enabled. Adds 64-bit
byte and packet counter per flow.
buckets
Default: Calculated
MKE: Calculated
Sets the size of the hash table. If not specified during module loading,
the default size is calculated by dividing total memory by 16384 to
determine the number of buckets. The hash table will never have fewer
than 1024 and never more than 262144 buckets. This sysctl is only
writeable in the initial net namespace.
checksum
Default:0
MKE:0
Sets whether the checksum of incoming packets is verified. Packets with
bad checksums are in an invalid state. If this is enabled, such packets
are not considered for connection tracking.
dccp_loose
Default:0
MKE:1
Sets whether picking up already established connections for Datagram
Congestion Control Protocol (DCCP) is permitted.
dccp_timeout_closereq
Default: Distribution dependent
MKE:64
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_closing
Default: Distribution dependent
MKE:64
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_open
Default: Distribution dependent
MKE:43200
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_partopen
Default: Distribution dependent
MKE:480
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_request
Default: Distribution dependent
MKE:240
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_respond
Default: Distribution dependent
MKE:480
The parameter description is not yet available in the Linux kernel
documentation.
dccp_timeout_timewait
Default: Distribution dependent
MKE:240
The parameter description is not yet available in the Linux kernel
documentation.
events
Default:0
MKE:1
Sets whether the connection tracking code provides userspace with
connection-tracking events through ctnetlink.
expect_max
Default: Calculated
MKE:1024
Sets the maximum size of the expectation table. The default value is
nf_conntrack_buckets / 256. The minimum is 1.
frag6_high_thresh
Default: Calculated
MKE:4194304
Sets the maximum memory used to reassemble IPv6 fragments. When
nf_conntrack_frag6_high_thresh bytes of memory is allocated for this
purpose, the fragment handler tosses packets until
nf_conntrack_frag6_low_thresh is reached. The size of this parameter
is calculated based on system memory.
frag6_low_thresh
Default: Calculated
MKE:3145728
See nf_conntrack_frag6_high_thresh. The size of this parameter is
calculated based on system memory.
frag6_timeout
Default:60
MKE:60
Sets the time to keep an IPv6 fragment in memory.
generic_timeout
Default:600
MKE:600
Sets the default for a generic timeout. This refers to layer 4 unknown
and unsupported protocols.
gre_timeout
Default:30
MKE:30
Sets the GRE timeout from the conntrack table.
gre_timeout_stream
Default:180
MKE:180
Sets the GRE timeout for streamed connections. This extended timeout
is used when a GRE stream is detected.
helper
Default:0
MKE:0
Sets whether the automatic conntrack helper assignment is enabled.
If disabled, you must set up iptables rules to assign
helpers to connections. See the CT target description in the
iptables-extensions(8) man page for more information.
icmp_timeout
Default:30
MKE:30
Sets the default for ICMP timeout.
icmpv6_timeout
Default:30
MKE:30
Sets the default for ICMPv6 timeout.
log_invalid
Default:0
MKE:0
Sets whether invalid packets of a type specified by value are logged.
max
Default: Calculated
MKE:131072
Sets the maximum number of allowed connection tracking entries. This
value is set to nf_conntrack_buckets by default.
Connection-tracking entries are added to the table twice, once for the
original direction and once for the reply direction (that is, with
the reversed address). Thus, with default settings a maxed-out
table will have an average hash chain length of 2, not 1.
sctp_timeout_closed
Default: Distribution dependent
MKE:10
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_cookie_echoed
Default: Distribution dependent
MKE:3
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_cookie_wait
Default: Distribution dependent
MKE:3
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_established
Default: Distribution dependent
MKE:432000
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_heartbeat_acked
Default: Distribution dependent
MKE:210
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_heartbeat_sent
Default: Distribution dependent
MKE:30
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_shutdown_ack_sent
Default: Distribution dependent
MKE:3
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_shutdown_recd
Default: Distribution dependent
MKE:0
The parameter description is not yet available in the Linux kernel
documentation.
sctp_timeout_shutdown_sent
Default: Distribution dependent
MKE:0
The parameter description is not yet available in the Linux kernel
documentation.
tcp_be_liberal
Default:0
MKE:0
Sets whether only out of window RST segments are marked as INVALID.
tcp_loose
Default:0
MKE:1
Sets whether already established connections are picked up.
tcp_max_retrans
Default:3
MKE:3
Sets the maximum number of packets that can be retransmitted without
receiving an acceptable ACK from the destination. If this number
is reached, a shorter timer is started.
tcp_timeout_close
Default: Distribution dependent
MKE:10
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_close_wait
Default: Distribution dependent
MKE:3600
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_fin_wait
Default: Distribution dependent
MKE:120
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_last_ack
Default: Distribution dependent
MKE:30
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_max_retrans
Default: Distribution dependent
MKE:300
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_syn_recv
Default: Distribution dependent
MKE:60
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_syn_sent
Default: Distribution dependent
MKE:120
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_time_wait
Default: Distribution dependent
MKE:120
The parameter description is not yet available in the Linux kernel
documentation.
tcp_timeout_unacknowledged
Default: Distribution dependent
MKE:30
The parameter description is not yet available in the Linux kernel
documentation.
timestamp
Default:0
MKE:0
Sets whether connection-tracking flow timestamping is enabled.
udp_timeout
Default:30
MKE:30
Sets the UDP timeout.
udp_timeout_stream
Default:120
MKE:120
Sets the extended timeout that is used whenever a UDP stream is
detected.
The net.nf_conntrack_<subtree> default values persist, changing only
when the nf_conntrack kernel module has not been previously loaded.
For more information, refer to the Linux kernel documentation.
Parameter
Values
Description
max
Default: Calculated
MKE:131072
Sets the maximum number of connections to track. The size of this
parameter is calculated based on system memory.
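To compare a node against the values in these tables, you can read the parameters from /proc/sys. The following is a minimal sketch, assuming a Linux host. The helper names and the choice of parameters to check are hypothetical, and the expected values are copied from the MKE column of the tables above.

```python
from pathlib import Path
from typing import Dict

# A small, illustrative subset of the MKE values listed in the tables.
EXPECTED = {
    "net.ipv4.ip_forward": "1",
    "net.ipv4.conf.all.forwarding": "1",
    "net.nf_conntrack_max": "131072",
}


def sysctl_path(key: str, root: str = "/proc/sys") -> Path:
    """Map a dotted parameter name to its file under /proc/sys."""
    return Path(root, *key.split("."))


def find_mismatches(expected: Dict[str, str], root: str = "/proc/sys") -> Dict[str, str]:
    """Return {parameter: actual value} for every value that differs."""
    mismatches = {}
    for key, want in expected.items():
        try:
            got = sysctl_path(key, root).read_text().strip()
        except OSError:
            # The key is absent, for example when a module is not loaded.
            got = "<missing>"
        if got != want:
            mismatches[key] = got
    return mismatches


if __name__ == "__main__":
    for key, got in find_mismatches(EXPECTED).items():
        print(f"{key}: expected {EXPECTED[key]}, found {got}")
```

A missing key is reported rather than treated as an error because, as noted above, some keys are only present once the corresponding kernel module is loaded.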
The ucp install command runs in interactive mode,
prompting you for the necessary configuration values. For more information
about the ucp install command, including how to install MKE on a
system with SELinux enabled, refer to the MKE Operations Guide:
mirantis/ucp install.
Note
MKE installs Project Calico for Kubernetes container-to-container
communication. However, you may install an alternative CNI plugin, such as
Weave or Flannel. For more information, refer to the MKE Operations
Guide: Installing an unmanaged CNI plugin.
After you Install the MKE image, proceed with
downloading your MKE license as described below. This section also contains
steps to apply your new license using the MKE web UI.
Warning
Users are not authorized to run MKE on production workloads without a valid
license. Refer to Mirantis Agreements and Terms for more information.
To download your MKE license:
Open an email from Mirantis Support with the subject
Welcome to Mirantis’ CloudCare Portal and follow the instructions
for logging in.
If you did not receive the CloudCare Portal email, it is likely that
you have not yet been added as a Designated Contact. To remedy this,
contact your Designated Administrator.
In the top navigation bar, click Environments.
Click the Cloud Name associated with the license you want to
download.
Scroll down to License Information and click the
License File URL. A new tab opens in your browser.
Click View file to download your license file.
To update your license settings in the MKE web UI:
Log in to your MKE instance using an administrator account.
In the left navigation, click Settings.
On the General tab, click Apply new license. A file
browser dialog displays.
Navigate to where you saved the license key (.lic) file, select it,
and click Open. MKE automatically updates with the new settings.
Note
Though MKE is generally a subscription-only service, Mirantis offers a free
trial license by request. Use our contact form to request a free trial license.
This section describes how to customize your MKE installation on AWS. It is for
those deploying Kubernetes workloads while
leveraging the AWS Kubernetes cloud provider, which provides dynamic
volume and load balancer provisioning.
Note
You may skip this topic if you plan to install MKE on AWS with no
customizations or if you will only deploy Docker Swarm workloads. Refer to
Install the MKE image for the appropriate installation instruction.
Complete the following prerequisites prior to installing MKE on AWS.
Log in to the AWS Management Console.
Assign your instance a host name using the
ip-<privateip>.<region>.compute.internal template. For example,
ip-172-31-15-241.us-east-2.compute.internal.
Tag your instance, VPC, and subnets by specifying
kubernetes.io/cluster/<unique-cluster-id> in the Key field
and <cluster-type> in the Value field.
Possible <cluster-type> values are as follows:
owned, if the cluster owns and manages the resources that it creates
shared, if the cluster shares its resources between multiple clusters
For example, Key: kubernetes.io/cluster/1729543642a6 and
Value: owned.
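The host name template and the tag format described above can be sketched as follows. The function names are hypothetical, and the dictionary layout with Key and Value entries simply mirrors the field names used in the text. The sketch does not call any AWS API.

```python
import ipaddress
from typing import Dict


def aws_hostname(private_ip: str, region: str) -> str:
    """Build ip-<privateip>.<region>.compute.internal from a private IPv4."""
    ip = ipaddress.IPv4Address(private_ip)  # raises ValueError if malformed
    return f"ip-{str(ip).replace('.', '-')}.{region}.compute.internal"


def cluster_tag(cluster_id: str, cluster_type: str) -> Dict[str, str]:
    """Build the Key/Value pair for the instance, VPC, and subnets."""
    if cluster_type not in ("owned", "shared"):
        raise ValueError("cluster type must be 'owned' or 'shared'")
    return {"Key": f"kubernetes.io/cluster/{cluster_id}", "Value": cluster_type}


if __name__ == "__main__":
    print(aws_hostname("172.31.15.241", "us-east-2"))
    print(cluster_tag("1729543642a6", "owned"))
```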
To enable introspection and resource provisioning, specify an instance
profile with appropriate policies for manager nodes. The following is
an example of a very permissive instance profile:
To enable access to dynamically provisioned resources, specify an instance
profile with appropriate policies for worker nodes. The following is an
example of a very permissive instance profile:
After you perform the steps described in Prerequisites, run the
following command to install MKE on a master node. Substitute <ucp-ip> with
the private IP address of the master node.
Warning
If your cluster includes Kubernetes Windows worker nodes, you must omit the
--cloud-provider aws flag from the following command, as its inclusion
causes the Kubernetes Windows worker nodes never to enter a healthy state.
Mirantis Kubernetes Engine (MKE) closely integrates with Microsoft
Azure for its Kubernetes Networking and Persistent Storage feature set.
MKE deploys the Calico CNI provider. In Azure, the Calico CNI leverages
the Azure networking infrastructure for data path networking and the
Azure IPAM for IP address management.
To avoid significant issues during the installation process, you must meet the
following infrastructure prerequisites to successfully deploy MKE on Azure.
Deploy all MKE nodes (managers and workers) into the
same Azure resource group. You can deploy the Azure networking components
(virtual network, subnets, security groups) in a second Azure resource
group.
Size the Azure virtual network and subnet appropriately for
your environment, because addresses from this pool will be consumed by
Kubernetes Pods.
Attach all MKE worker and manager nodes to the same Azure subnet.
Set internal IP addresses for all nodes to Static rather than
the Dynamic default.
Match the Azure virtual machine object name to the Azure
virtual machine computer name and to the node operating system hostname,
which is the FQDN of the host (including domain names). All characters in the
names must be in lowercase.
Ensure the presence of an Azure Service Principal with Contributor
access to the Azure resource group hosting the MKE nodes. Kubernetes uses
this Service Principal to communicate with the Azure API. The Service
Principal ID and Secret Key are MKE prerequisites.
If you are using a separate resource group for the networking components,
the same Service Principal must have Network Contributor access to this
resource group.
Ensure that there is an open NSG between all IPs on the Azure subnet that is
passed into MKE during installation. Kubernetes Pods integrate into the
underlying Azure networking stack, from an IPAM and routing perspective, with
the Azure CNI IPAM module. As such, Azure network security groups (NSG) impact
pod-to-pod communication. End users may expose containerized services on a
range of underlying ports, which would otherwise require manually opening an
NSG port every time a new containerized service deploys on the platform. This
affects only workloads that deploy on the Kubernetes orchestrator.
To limit exposure, restrict the use of the Azure subnet to container host
VMs and Kubernetes Pods. Additionally, you can leverage Kubernetes Network
Policies to provide micro segmentation for containerized applications and
services.
The MKE installation requires the following information:
subscriptionId
Azure Subscription ID in which to deploy the MKE objects
tenantId
Azure Active Directory Tenant ID in which to deploy the MKE objects
MKE configures the Azure IPAM module for Kubernetes so that it can allocate IP
addresses for Kubernetes Pods. Per Azure IPAM module requirements, the
configuration of each Azure VM that is part of the Kubernetes cluster must
include a pool of IP addresses.
You can use automatic or manual IP provisioning for the Kubernetes cluster on
Azure.
Automatic provisioning
Allows for IP pool configuration and maintenance for standalone Azure
virtual machines (VMs). This service runs within the calico-node
daemonset and provisions 128 IP addresses for each node by default.
Note
If you are using a VXLAN data plane, MKE automatically uses Calico IPAM.
It is not necessary to do anything specific for Azure IPAM.
New MKE installations use Calico VXLAN as the default data plane (the
MKE configuration calico_vxlan is set to true). MKE does not use
Calico VXLAN if the MKE version is lower than 3.3.0 or if you upgrade
MKE from lower than 3.3.0 to 3.3.0 or higher.
Manual provisioning
You can manually provision additional IP addresses for each Azure VM
through the Azure Portal, the Azure CLI (az network nic
ip-config create), or an ARM template.
For MKE to integrate with Microsoft Azure, the azure.json configuration
file must be identical across all manager and worker nodes in your cluster. For
Linux nodes, place the file in /etc/kubernetes on each host. For Windows
nodes, place the file in C:\k on each host. Because root owns the
configuration file, set its permissions to 0644 to ensure that the
container user has read access.
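The placement and permission requirements above can be sketched in Python as follows. This is a hedged illustration for Linux nodes only. The helper name is hypothetical, and the dictionary holds placeholder values for just the two fields named in this section (subscriptionId and tenantId), so it is not a complete azure.json.

```python
import json
import os
from pathlib import Path


def write_azure_json(config: dict, directory: str = "/etc/kubernetes") -> Path:
    """Write azure.json into the given directory with 0644 permissions."""
    path = Path(directory) / "azure.json"
    # Sorted keys keep the file byte-identical across nodes for equal input.
    path.write_text(json.dumps(config, indent=2, sort_keys=True) + "\n")
    os.chmod(path, 0o644)  # owner read/write, read-only for everyone else
    return path


if __name__ == "__main__":
    # Placeholder values; writing to /etc/kubernetes requires root.
    placeholder = {"subscriptionId": "<subscription-id>", "tenantId": "<tenant-id>"}
    print(write_azure_json(placeholder, directory="."))
```

Generating the file once and copying it to every node is a simple way to keep it identical across the cluster.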
The following is an example template for azure.json.
To avoid significant issues during the installation process, follow
these guidelines to either use the appropriate size network in Azure or
take the necessary actions to fit within the subnet.
Configure the subnet and the virtual network associated with the primary
interface of the Azure VMs with an adequate address prefix/range. The number of
required IP addresses depends on the workload and the number of nodes in the
cluster.
For example, for a cluster of 256 nodes, make sure that the address space
of the subnet and the virtual network can allocate at least 128 * 256
IP addresses, in order to run a maximum of 128 pods concurrently on a
node. This is in addition to initial IP allocations to VM
network interface cards (NICs) during Azure resource creation.
Accounting for the allocation of IP addresses to NICs that occur during VM
bring-up, set the address space of the subnet and virtual network to
10.0.0.0/16. This ensures that the network can dynamically allocate
at least 32768 addresses, plus a buffer for initial allocations for
primary IP addresses.
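The sizing arithmetic above can be reproduced with the Python standard library. The following is a rough sketch. The function names are hypothetical, and the check counts only pod addresses plus one primary address per node, ignoring any addresses that the platform itself reserves in a subnet.

```python
import ipaddress


def required_pod_ips(nodes: int, ips_per_node: int = 128) -> int:
    """Pod IP addresses needed when each node runs up to ips_per_node pods."""
    return nodes * ips_per_node


def subnet_fits(cidr: str, nodes: int, ips_per_node: int = 128) -> bool:
    """Check that the subnet holds the pod IPs plus one primary IP per node."""
    network = ipaddress.ip_network(cidr)
    needed = required_pod_ips(nodes, ips_per_node) + nodes
    return network.num_addresses >= needed


if __name__ == "__main__":
    print(required_pod_ips(256))            # 128 * 256 = 32768
    print(subnet_fits("10.0.0.0/16", 256))  # True: a /16 holds 65536 addresses
    print(subnet_fits("10.0.0.0/17", 256))  # False: no room for primary IPs
```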
Note
The Azure IPAM module queries the metadata of an Azure VM to obtain a list
of the IP addresses that are assigned to the VM NICs. The IPAM module
allocates these IP addresses to Kubernetes pods. You configure the IP
addresses as ipConfigurations in the NICs associated with a VM or
scale set member, so that Azure IPAM can provide the addresses to Kubernetes
on request.
Manually provision IP address pools as part of an Azure VM scale set
Configure IP Pools for each member of the VM scale set during
provisioning by associating multiple ipConfigurations with the scale
set’s networkInterfaceConfigurations.
The following example networkProfile configuration for an ARM template
configures pools of 32 IP addresses for each VM in the VM scale set.
During an MKE installation, you can alter the number of Azure IP
addresses that MKE automatically provisions for pods.
By default, MKE provisions 128 addresses, from the same Azure subnet as the
hosts, for each VM in the cluster. If, however, you have manually attached
additional IP addresses to the VMs (by way of an ARM template, the Azure CLI,
or the Azure Portal) or you are deploying into a small Azure subnet (smaller
than /16), you can use the --azure-ip-count flag at install time.
Note
Do not set the --azure-ip-count variable to a value of less than 6 if
you have not manually provisioned additional IP addresses for each VM. The
MKE installation needs at least 6 IP addresses to allocate to the core MKE
components that run as Kubernetes pods (in addition to the VM’s private IP
address).
Below are several example scenarios that require the defining of the
--azure-ip-count variable.
Scenario 1: Manually provisioned addresses
If you have manually provisioned additional IP addresses for each VM and want
to prevent MKE from dynamically provisioning more IP addresses, you must
pass --azure-ip-count 0 into the MKE installation command.
Scenario 2: Reducing the number of provisioned addresses
Pass --azure-ip-count <custom_value> into the MKE installation command
to reduce the number of IP addresses dynamically allocated from 128 to a
custom value due to:
Primary use of the Swarm Orchestrator
Deployment of MKE on a small Azure subnet (for example, /24)
Plans to run a small number of Kubernetes pods on each node
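To see why a small subnet calls for a lower value, the following sketch estimates how many VMs fit in a subnet for a given IP count. It assumes that each VM consumes its own private address plus the configured number of pod addresses, and it accounts for the five addresses that Azure reserves in every subnet. The function name is hypothetical and the result is only an estimate.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves 5 addresses in every subnet


def max_vms(subnet_cidr: str, azure_ip_count: int) -> int:
    """Estimate how many VMs fit in a subnet.

    Assumption: each VM uses one private address plus azure_ip_count
    addresses set aside for pods.
    """
    total = ipaddress.ip_network(subnet_cidr).num_addresses
    usable = total - AZURE_RESERVED_PER_SUBNET
    return usable // (1 + azure_ip_count)


# With the default of 128 addresses per VM, a /24 subnet holds a single VM.
print(max_vms("172.0.1.0/24", 128))
# Reducing the count to 30 leaves room for several VMs in the same subnet.
print(max_vms("172.0.1.0/24", 30))
```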
To adjust this value post-installation, refer to the instructions on how to
download the MKE configuration file, change the value, and update
the configuration via the API.
Note
If you reduce the value post-installation, existing VMs will not
reconcile and you will need to manually edit the IP count in Azure.
Run the following command to install MKE on a manager node.
The --pod-cidr option maps to the IP address range that you configured
for the Azure subnet.
Note
The pod-cidr range must match the Azure virtual network’s subnet
attached to the hosts. For example, if the Azure virtual network had the
range 172.0.0.0/16 with VMs provisioned on an Azure subnet of
172.0.1.0/24, then the Pod CIDR should also be 172.0.1.0/24.
This requirement applies only when MKE does not use the VXLAN data plane.
If MKE uses the VXLAN data plane, the pod-cidr range must be
different than the node IP subnet.
The --host-address option maps to the private IP address of the master node.
The --azure-ip-count option adjusts the number of IP addresses
provisioned to each VM.
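The pod-cidr requirement described in the note above can be checked before installation. The following is a minimal sketch with a hypothetical function name. It reads "different than the node IP subnet" as "does not overlap the node subnet", which is a stricter interpretation and an assumption.

```python
import ipaddress


def pod_cidr_is_valid(pod_cidr: str, node_subnet: str, vxlan: bool) -> bool:
    """Check a --pod-cidr value against the Azure subnet attached to the hosts.

    Without VXLAN the two ranges must be identical. With VXLAN the pod range
    must not overlap the node subnet (assumed interpretation of "different").
    """
    pod = ipaddress.ip_network(pod_cidr)
    nodes = ipaddress.ip_network(node_subnet)
    if vxlan:
        return not pod.overlaps(nodes)
    return pod == nodes
```

Using the ranges from the note, pod_cidr_is_valid("172.0.1.0/24", "172.0.1.0/24", vxlan=False) is true, whereas passing the whole virtual network range 172.0.0.0/16 as the pod CIDR is not.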
You can create your own Azure custom roles for use with MKE. You can assign
these roles to users, groups, and service principals at management group (in
preview only), subscription, and resource group scopes.
Deploy an MKE cluster into a single resource group¶
A resource group
is a container that holds resources for an Azure solution. These resources are
the virtual machines (VMs), networks, and storage accounts that are associated
with the swarm.
To create a custom all-in-one role with permissions to deploy an MKE cluster
into a single resource group:
Create the role permissions JSON file.
For example:
{"Name":"Docker Platform All-in-One","IsCustom":true,"Description":"Can install and manage Docker platform.","Actions":["Microsoft.Authorization/*/read","Microsoft.Authorization/roleAssignments/write","Microsoft.Compute/availabilitySets/read","Microsoft.Compute/availabilitySets/write","Microsoft.Compute/disks/read","Microsoft.Compute/disks/write","Microsoft.Compute/virtualMachines/extensions/read","Microsoft.Compute/virtualMachines/extensions/write","Microsoft.Compute/virtualMachines/read","Microsoft.Compute/virtualMachines/write","Microsoft.Network/loadBalancers/read","Microsoft.Network/loadBalancers/write","Microsoft.Network/loadBalancers/backendAddressPools/join/action","Microsoft.Network/networkInterfaces/read","Microsoft.Network/networkInterfaces/write","Microsoft.Network/networkInterfaces/join/action","Microsoft.Network/networkSecurityGroups/read","Microsoft.Network/networkSecurityGroups/write","Microsoft.Network/networkSecurityGroups/join/action","Microsoft.Network/networkSecurityGroups/securityRules/read","Microsoft.Network/networkSecurityGroups/securityRules/write","Microsoft.Network/publicIPAddresses/read","Microsoft.Network/publicIPAddresses/write","Microsoft.Network/publicIPAddresses/join/action","Microsoft.Network/virtualNetworks/read","Microsoft.Network/virtualNetworks/write","Microsoft.Network/virtualNetworks/subnets/read","Microsoft.Network/virtualNetworks/subnets/write","Microsoft.Network/virtualNetworks/subnets/join/action","Microsoft.Resources/subscriptions/resourcegroups/read","Microsoft.Resources/subscriptions/resourcegroups/write","Microsoft.Security/advancedThreatProtectionSettings/read","Microsoft.Security/advancedThreatProtectionSettings/write","Microsoft.Storage/*/read","Microsoft.Storage/storageAccounts/listKeys/action","Microsoft.Storage/storageAccounts/write"],"NotActions":[],"AssignableScopes":["/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"]}
Compute resources act as servers for running containers.
To create a custom role to deploy MKE compute resources only:
Create the role permissions JSON file.
For example:
{"Name":"Docker Platform","IsCustom":true,"Description":"Can install and run Docker platform.","Actions":["Microsoft.Authorization/*/read","Microsoft.Authorization/roleAssignments/write","Microsoft.Compute/availabilitySets/read","Microsoft.Compute/availabilitySets/write","Microsoft.Compute/disks/read","Microsoft.Compute/disks/write","Microsoft.Compute/virtualMachines/extensions/read","Microsoft.Compute/virtualMachines/extensions/write","Microsoft.Compute/virtualMachines/read","Microsoft.Compute/virtualMachines/write","Microsoft.Network/loadBalancers/read","Microsoft.Network/loadBalancers/write","Microsoft.Network/networkInterfaces/read","Microsoft.Network/networkInterfaces/write","Microsoft.Network/networkInterfaces/join/action","Microsoft.Network/publicIPAddresses/read","Microsoft.Network/virtualNetworks/read","Microsoft.Network/virtualNetworks/subnets/read","Microsoft.Network/virtualNetworks/subnets/join/action","Microsoft.Resources/subscriptions/resourcegroups/read","Microsoft.Resources/subscriptions/resourcegroups/write","Microsoft.Security/advancedThreatProtectionSettings/read","Microsoft.Security/advancedThreatProtectionSettings/write","Microsoft.Storage/storageAccounts/read","Microsoft.Storage/storageAccounts/listKeys/action","Microsoft.Storage/storageAccounts/write"],"NotActions":[],"AssignableScopes":["/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"]}
To install MKE on an offline host, you must first use a separate computer with
an Internet connection to download a single package with all the images and
then copy that package to the host where you will install MKE. Once the package
is on the host and loaded, you can install MKE offline as described in
Install the MKE image.
Note
During the offline installation, both manager and worker nodes must be
offline.
To install MKE offline:
Download the required MKE package:
Note
MKE 3.3.10 is discontinued and thus not available for download.
Caution
Users running kernel version 4.15 or earlier may encounter an issue with
MKE 3.3.9 wherein support dumps fail and nodes disconnect. Mirantis
strongly recommends that these users either upgrade to kernel version
4.16 (or later) or upgrade to MKE 3.3.11.
MKE is designed to scale as your applications grow in size and usage. You can
add and remove nodes from the cluster to make it scale to your needs.
You can also uninstall MKE from your cluster. In this case, the MKE services
are stopped and removed, but your Mirantis Container Runtimes will continue
running in swarm mode. Your applications will continue running normally.
If you wish to remove a single node from the MKE cluster, you should
instead remove that node from the cluster.
After you uninstall MKE from the cluster, you’ll no longer be able to
enforce role-based access control (RBAC) to the cluster, or have a
centralized way to monitor and manage the cluster. After uninstalling
MKE from the cluster, you will no longer be able to join new nodes using
docker swarm join, unless you reinstall MKE.
To uninstall MKE, log in to a manager node using ssh, and run the
following command:
This runs the uninstall command in interactive mode, so that you are
prompted for any necessary configuration values.
If the uninstall-ucp command fails, you can run the following
commands to manually uninstall MKE:
# Run the following command on one manager node to remove remaining MKE services
docker service rm $(docker service ls -f name=ucp- -q)
# Run the following command on each manager node to remove remaining MKE containers
docker container rm -f $(docker container ps -a -f name=ucp- -f name=k8s_ -q)
# Run the following command on each manager node to remove remaining MKE volumes
docker volume rm $(docker volume ls -f name=ucp -q)
The MKE configuration is kept in case you want to reinstall MKE with the
same configuration. If you want to also delete the configuration, run
the uninstall command with the --purge-config option.
Refer to the CLI Reference documentation to see the available options.
Once the uninstall command finishes, MKE is completely removed from all
the nodes in the cluster. You don’t need to run the command again from
other nodes.
After uninstalling MKE, the nodes in your cluster will still be in swarm
mode, but you can’t join new nodes until you reinstall MKE, because
swarm mode relies on MKE to provide the CA certificates that allow nodes
in the cluster to identify one another. Also, since swarm mode is no
longer controlling its certificates, if the certificates expire
after you uninstall MKE, the nodes in the swarm won’t be able to
communicate at all. To fix this, either reinstall MKE before the
certificates expire or disable swarm mode by running
docker swarm leave --force on every node.
When you install MKE, the Calico network plugin changes the host’s IP
tables. When you uninstall MKE, the IP tables aren’t reverted to their
previous state. After you uninstall MKE, restart the node to restore its
IP tables.
The MKE Operations Guide provides the comprehensive information
you need to run the MKE container orchestration platform. The guide is
intended for anyone who needs to effectively develop and securely administer
applications at scale, on private clouds, public clouds, and on bare metal.
You can access an MKE cluster in a variety of ways including through the MKE
web UI, Docker CLI, and kubectl (the Kubernetes CLI). To use the
Docker CLI and kubectl with MKE, first download a client certificate
bundle. This topic describes the MKE web UI, how to download and configure the
client bundle, and how to configure kubectl with MKE.
MKE allows you to control your cluster visually using the web UI. Role-based
access control (RBAC) gives administrators and non-administrators access to
the following web UI features:
Administrators:
Manage cluster configurations.
View and edit all cluster images, networks, volumes, and containers.
Manage the permissions of users, teams, and organizations.
Grant node-specific task scheduling permissions to users.
Non-administrators:
View and edit all cluster images, networks, volumes, and containers.
Requires an administrator to grant access.
To access the MKE web UI:
Open a browser and navigate to https://<ip-address> (substituting
<ip-address> with the IP address of the machine that ran
docker run).
The expected Docker CLI server version starts with ucp/, and the
expected kubectl context name starts with ucp_.
Optional. Change your context directly using the client certificate bundle
.zip files. In the directory where you downloaded the user bundle, add
the new context:
If you use the client certificate bundle with buildkit, make
sure that builds are not accidentally scheduled on manager nodes. For more
information, refer to Manage services node deployment.
MKE installations include Kubernetes. Users can
deploy, manage, and monitor Kubernetes using either the MKE web UI or kubectl.
To install and use kubectl:
Identify which version of Kubernetes you are running by using the MKE web
UI, the MKE API version endpoint, or the Docker CLI
docker version command with the client bundle.
Caution
Kubernetes requires that kubectl and Kubernetes be within one
minor version of each other.
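The version skew rule in the caution above can be checked with a few lines of code. This is a minimal sketch with hypothetical function names. It assumes version strings of the form v<major>.<minor>.<patch>, optionally followed by a vendor suffix.

```python
def _major_minor(version: str) -> tuple:
    """Extract (major, minor) from a string such as 'v1.17.13-mirantis-1'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def kubectl_skew_ok(kubectl_version: str, server_version: str) -> bool:
    """Return True if kubectl and Kubernetes are within one minor version."""
    c_major, c_minor = _major_minor(kubectl_version)
    s_major, s_minor = _major_minor(server_version)
    return c_major == s_major and abs(c_minor - s_minor) <= 1
```

For example, kubectl v1.18.0 is acceptable against a v1.17.13 server, while kubectl v1.20.1 is not.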
Helm recommends that you specify a Role and RoleBinding to limit
the scope of Tiller to a particular namespace. Refer to the
official Helm documentation
for more information.
With MKE, you can add labels to your nodes. Labels are metadata
that describe the node, such as:
node role (development, QA, production)
node region (US, EU, APAC)
disk type (HDD, SSD)
Once you apply a label to a node, you can specify constraints when deploying a
service to ensure that the service only runs on nodes that meet particular
criteria.
Hint
Use resource sets (MKE collections or Kubernetes namespaces) to organize access to your cluster, rather than creating labels
for authorization and permissions to resources.
The following example procedure applies the ssd label to a node.
Log in to the MKE web UI with administrator credentials.
Click Shared Resources in the navigation menu to expand the
selections.
Click Nodes. The details pane will display the full list of
nodes.
Click the node on the list that you want to attach labels to. The details
pane will transition, presenting the Overview information
for the selected node.
Click the settings icon in the upper-right corner to open the
Edit Node page.
Navigate to the Labels section and click Add
Label.
Add a label, entering disk into the Key field and ssd
into the Value field.
Click Save to dismiss the Edit Node page and return
to the node Overview.
The following example procedure deploys a service with a constraint that
ensures that the service only runs on nodes with SSD storage
(node.labels.disk==ssd).
Log in to the MKE web UI with administrator credentials.
Click Shared Resources in the navigation menu to expand the
selections.
Click Stacks. The details pane will display the full list of
stacks.
Click the Create Stack button to open the Create
Application page.
Under 1. Configure Application, enter “wordpress” into the
Name field.
Under ORCHESTRATOR NODE, select Swarm Services.
Under 2. Add Application File, paste the following stack file in
the docker-compose.yml editor:
Click Done once the stack deployment completes to
return to the stacks list which now features your newly created stack.
In the navigation menu, click Nodes. The details pane will
display the full list of nodes.
Click the node with the disk label.
In the details pane, click the Inspect Resource drop-down menu
and select Containers.
Dismiss the filter and navigate to the Nodes page.
Click any node that does not have the disk label.
In the details pane, click the Inspect Resource drop-down menu
and select Containers. Dismiss the filter since there are no
WordPress containers scheduled on the node.
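The outcome of the procedure above follows from how a placement constraint is evaluated against node labels. The following is a simplified model for illustration only. It is not the Swarm scheduler, the function name is hypothetical, and it handles only node.labels.* constraints with the == and != operators.

```python
def node_matches(constraint: str, node_labels: dict) -> bool:
    """Evaluate one 'node.labels.<key>==<value>' or '!=' constraint."""
    for op in ("==", "!="):
        if op in constraint:
            left, right = (part.strip() for part in constraint.split(op, 1))
            break
    else:
        raise ValueError(f"unsupported constraint: {constraint!r}")

    prefix = "node.labels."
    if not left.startswith(prefix):
        raise ValueError("only node.labels.* constraints are modelled here")

    actual = node_labels.get(left[len(prefix):])
    return actual == right if op == "==" else actual != right


# A node labelled disk=ssd is eligible; a node without the label is not.
print(node_matches("node.labels.disk==ssd", {"disk": "ssd"}))
print(node_matches("node.labels.disk==ssd", {}))
```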
Add or remove a service constraint using the MKE web UI¶
You can declare the deployment constraints in your docker-compose.yml
file or when you create a stack. Also, you can apply constraints when
you create a service.
To add or remove a service constraint:
Verify whether a service has deployment constraints:
Navigate to the Services page and select that service.
In the details pane, click Constraints to list the constraint
labels.
Edit the constraints on the service:
Click Configure and select Details to open the
Update Service page.
A SAN (Subject Alternative Name) is a structured means for associating various
values (such as domain names, IP addresses, email addresses, URIs, and so on)
with a security certificate.
MKE always runs with HTTPS enabled. As such, whenever you connect to MKE, you
must ensure that the MKE certificates recognize the host name in use. For
example, if MKE is behind a load balancer that forwards traffic to your MKE
instance, your requests will not be for the MKE host name or IP address but for
the host name of the load balancer. Thus, MKE will reject the requests, unless
you include the address of the load balancer as a SAN in the MKE certificates.
Note
To use your own TLS certificates, confirm first that these certificates
have the correct SAN values.
To use the self-signed certificate that MKE offers out-of-the-box, you can
use the --san argument to set up the SANs during MKE
deployment.
To add new SANs using the MKE web UI:
Log in to the MKE web UI using administrator credentials.
Navigate to the Nodes page.
Click on a manager node to display the details pane for that node.
Click Configure and select Details.
In the SANs section, click Add SAN and enter one or
more SANs for the cluster.
Click Save.
Repeat for every existing manager node in the cluster.
Note
Thereafter, the SANs are automatically applied to any new manager nodes
that join the cluster.
To add new SANs using the MKE CLI:
Get the current set of SANs for the given manager node:
docker node inspect --format '{{ index .Spec.Labels "com.docker.ucp.SANs" }}' <node-id>
Example of system response:
default-cs,127.0.0.1,172.17.0.1
Append the desired SAN to the list (for example,
default-cs,127.0.0.1,172.17.0.1,example.com) and run:
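Because the SAN list is stored in the com.docker.ucp.SANs node label, as the inspect command above shows, one way to apply the new list is the standard docker node update --label-add option. The following sketch builds that command line. The helper name is hypothetical, and the use of this exact command for this step is an assumption.

```python
def san_update_command(current_sans: str, new_san: str, node_id: str) -> list:
    """Build a 'docker node update' argument list that appends one SAN.

    current_sans is the comma-separated value returned by docker node inspect.
    The SAN is only appended if it is not already present.
    """
    sans = [san for san in current_sans.split(",") if san]
    if new_san not in sans:
        sans.append(new_san)
    label = "com.docker.ucp.SANs=" + ",".join(sans)
    return ["docker", "node", "update", "--label-add", label, node_id]


# "<node-id>" is a placeholder for the ID of the manager node.
cmd = san_update_command("default-cs,127.0.0.1,172.17.0.1", "example.com", "<node-id>")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run on a host where the Docker CLI is configured for the cluster.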
Prometheus is an open-source systems monitoring and alerting toolkit to which
you can configure MKE as a target.
Prometheus runs as a Kubernetes deployment that, by default, is a DaemonSet
that runs on every manager node. A key benefit of this is that you can set the
DaemonSet to not schedule on any nodes, which effectively disables Prometheus
if you do not use the MKE web interface.
Along with events and logs, metrics are data sources that provide a view into
your cluster, presenting numerical data values that have a time-series
component. There are several sources from which you can derive metrics, each
providing different meanings for a business and its applications.
As the metrics data is stored locally on disk for each Prometheus server, it
does not replicate on new managers or if you schedule Prometheus to run
on a new node. The metrics are kept no longer than 24 hours.
MKE provides a base set of metrics that gets you into production without having
to rely on external or third-party tools. Mirantis strongly encourages, though,
the use of additional monitoring to provide more comprehensive visibility into
your specific MKE environment.
High-level aggregate metrics that typically combine technical,
financial, and organizational data to create IT infrastructure
information for business leaders. Examples of business metrics
include:
Company or division-level application downtime
Aggregation resource utilization
Application resource demand growth
Application
Metrics on APM tools domains (such as AppDynamics and
DynaTrace) that supply information on the state or performance of the
application itself.
Service state
Container platform
Host infrastructure
Service
Metrics on the state of services that are running on the container
platform. Such metrics have very low cardinality, meaning the
values are typically from a small fixed set of possibilities (commonly
binary).
Application health
Convergence of Kubernetes deployments and Swarm services
Cluster load by number of services or containers or pods
Note
Web UI disk usage (including free space) reflects only the MKE
managed portion of the file system: /var/lib/docker. To monitor
the total space available on each filesystem of an MKE worker or
manager, deploy a third-party monitoring solution to oversee
the operating system.
The container health, according to its healthcheck.
The 0 value indicates that the container is not reporting as
healthy, which is likely because it either does not have a healthcheck
defined or because healthcheck results have not yet been returned
Indicates whether the container is healthy, according to
its healthcheck.
The 0 value indicates that the container is not reporting as
healthy, which is likely because it either does not have a healthcheck
defined or because healthcheck results have not yet been returned
Total disk space on the Docker root directory on this node in bytes.
Note that the ucp_engine_disk_free_bytes metric is not available
for Windows nodes
MKE deploys Prometheus by default on the manager nodes to provide a built-in
metrics back end. For cluster sizes over 100 nodes, or if you need to scrape
metrics from Prometheus instances, Mirantis recommends that you deploy
Prometheus on dedicated worker nodes in the cluster.
To deploy Prometheus on worker nodes:
Source an admin bundle.
Verify that ucp-metrics pods are running on all managers:
If you use SELinux, label your ucp-node-certs
directories properly on the worker nodes before you move the
ucp-metrics workload to them. To run ucp-metrics on a worker
node, update the ucp-node-certs label by running:
Patch the ucp-metrics DaemonSet’s nodeSelector with the same key and
value in use for the node label. This example shows the key
ucp-metrics and the value "".
Create a Prometheus deployment and ClusterIP service using YAML.
On AWS with the Kubernetes cloud provider configured:
Replace ClusterIP with LoadBalancer in the service YAML.
Access the service through the load balancer.
If you run Prometheus external to MKE, change the domain for the
inventory container in the Prometheus deployment from
ucp-controller.kube-system.svc.cluster.local to an external domain, to
access MKE from the Prometheus node.
Forward port 9090 on the local host to the ClusterIP. The tunnel
you create does not need to be kept alive as its only purpose is to expose
the Prometheus UI.
ssh -L 9090:10.96.254.107:9090 ANY_NODE
Visit http://127.0.0.1:9090 to explore the MKE metrics that Prometheus
is collecting.
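Besides the Prometheus UI, the forwarded port also exposes the Prometheus HTTP API, so metrics can be read programmatically. The following is a minimal sketch that assumes the tunnel on 127.0.0.1:9090 is open. The function names are hypothetical, and ucp_engine_disk_free_bytes is used only as an example metric name.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(metric: str, base: str = "http://127.0.0.1:9090") -> str:
    """Build the URL of a Prometheus instant query for one metric."""
    return f"{base}/api/v1/query?{urlencode({'query': metric})}"


def fetch_metric(metric: str) -> list:
    """Run the query through the tunnel and return the result list."""
    with urlopen(instant_query_url(metric)) as response:
        return json.load(response)["data"]["result"]


print(instant_query_url("ucp_engine_disk_free_bytes"))
```

Calling fetch_metric requires the SSH tunnel to be active, so it is defined here but not invoked.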
MKE uses native Kubernetes RBAC, which is active by default for Kubernetes
clusters. The YAML files of many ecosystem applications and integrations use
Kubernetes RBAC to access service accounts. Also, organizations looking to run
MKE both on-premises and in hosted cloud services want to run Kubernetes
applications in both environments without having to manually change RBAC in
their YAML file.
Note
Kubernetes and Swarm roles have separate views. Using the MKE web UI, you
can view all the roles for a particular cluster:
Click Access Control in the navigation menu at the left.
Click Roles.
Select the Kubernetes tab or the Swarm tab to view the specific roles for each.
You create Kubernetes roles either through the CLI using Kubernetes kubectl
tool or through the MKE web UI.
To create a Kubernetes role using the MKE web UI:
Log in to the MKE web UI.
In the navigation menu at the left, click Access Control to
display the available options.
Click Roles.
At the top of the details pane, click the Kubernetes tab.
Click Create to open the Create Kubernetes
Object page.
Click Namespace to select a namespace for the role from one of
the available options.
Provide the YAML file for the role. To do this, either enter it in the
Object YAML editor, or upload an existing .yml file using the
Click to upload a .yml file selection link at the right.
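The object that the role step expects is a standard Kubernetes Role manifest. The following sketch generates one and prints it as JSON, a format that kubectl accepts for manifests. Whether the Object YAML editor accepts JSON input is an assumption. The function name and the pod-reader role are hypothetical examples.

```python
import json


def role_manifest(name: str, namespace: str, resources: list, verbs: list) -> dict:
    """Build a namespaced Kubernetes RBAC Role for core API group resources."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": list(resources),
                "verbs": list(verbs),
            }
        ],
    }


# Example: a read-only role for pods in the default namespace.
manifest = role_manifest("pod-reader", "default", ["pods"], ["get", "list", "watch"])
print(json.dumps(manifest, indent=2))
```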
To create a grant for a Kubernetes role in the MKE web UI:
Log in to the MKE web UI.
In the navigation menu at the left, click Access Control to
display the available options.
Click the Grants option.
At the top of the details pane, click the Kubernetes tab. All
existing grants to Kubernetes roles are present in the details pane.
Click Create Role Binding to open the Create Role
Binding page.
Select the subject type at the top of the 1. Subject section
(Users, Organizations, or Service
Account).
Create a role binding for the selected subject type:
Users: Select a type from the User drop-down list.
Organizations: Select a type from the
Organization drop-down list. Optionally, you can also select
a team using the Team (optional) drop-down list, if any have
been established.
Service Account: Select a NAMESPACE from the
Namespace drop-down list, then a type from the
Service Account drop-down list.
Click Next to activate the 2. Resource Set section.
Select a resource set for the subject.
By default, the default namespace is indicated. To use a
different namespace, select the Select Namespace button
associated with the desired namespace.
For ClusterRoleBinding, slide the Apply Role Binding to
all namespace (Cluster Role Binding) selector to the right.
Click Next to activate the 3. Role section.
Select the role type.
Role
Cluster Role
Note
Cluster Role type is the only role type available if you enabled Apply Role Binding to all namespace (Cluster
Role Binding) in the 2. Resource Set section.
Audit logs are a chronological record of security-relevant activities by
individual users, administrators, or software components that have had an
effect on an MKE system. They focus on external user/agent actions and
security, rather than attempting to understand state or events of the system
itself.
Audit logs capture all HTTP actions (GET, PUT, POST, PATCH, DELETE) to all MKE
API, Swarm API, and Kubernetes API endpoints (with the exception of the ignored
list) that are invoked and sent to Mirantis Container Runtime via stdout.
The benefits that audit logs provide include:
Historical troubleshooting
You can use audit logs to determine a sequence of past events that can help
explain why an issue occurred.
Security analysis and auditing
A full record of all user interactions with the container infrastructure
can provide your security team with the visibility necessary to root out
questionable or unauthorized access attempts.
Chargeback
Use audit log data about resources to generate chargeback information.
Alerting
With a watch on an event stream or a notification the event creates, you can
build alerting features on top of event tools that generate alerts for ops
teams (PagerDuty, OpsGenie, Slack, or custom solutions).
The level setting supports the following values:
""
"metadata"
"request"
Caution
The support_dump_include_audit_logs flag specifies whether user
identification information from the ucp-controller container logs is
included in the support bundle. To prevent this information from being sent
with the support bundle, verify that support_dump_include_audit_logs
is set to false. When disabled, the support bundle collection tool
filters out any lines from the ucp-controller container logs that
contain the substring auditID.
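The filtering behavior described in the caution above amounts to a substring match on each log line. The following is a minimal sketch of that idea with a hypothetical function name and made-up log lines. It is not the code of the support bundle collection tool.

```python
def strip_audit_lines(lines: list) -> list:
    """Drop every log line that contains the substring 'auditID'."""
    return [line for line in lines if "auditID" not in line]


# Hypothetical log lines, for illustration only.
sample = [
    '{"level":"info","msg":"controller started"}',
    '{"auditID":"1234","user":"alice","verb":"GET"}',
]
print(strip_audit_lines(sample))
```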
For security reasons, audit logging ignores a number of MKE API endpoints
and redacts the information of others.
You can set MKE to automatically record and transmit data to Mirantis through
an encrypted channel for monitoring and analysis purposes. The data collected
provides the Mirantis Customer Success Organization with information that helps
us to better understand the operational use of MKE by our customers. It also
provides key feedback in the form of product usage statistics, which enable our
product teams to enhance Mirantis products and services.
Specifically, with MKE you can send hourly usage reports, as well as
information on API and UI usage.
Caution
To send the telemetry, verify that dockerd and the MKE application container
can resolve api.segment.io and create a TCP (HTTPS) connection on port
443.
To enable telemetry in MKE:
Log in to the MKE web UI as an administrator.
At the top of the navigation menu at the left, click the user name
drop-down to display the available options.
Click Admin Settings to display the available
options.
Click Usage to open the Usage Reporting screen.
Toggle the Enable API and UI tracking slider to the right.
(Optional) Enter a unique label to identify the cluster in the usage
reporting.
Security Assertion Markup Language (SAML) is an open standard for exchanging
authentication and authorization data between parties. It is commonly supported
by enterprise authentication systems. SAML-based single sign-on (SSO) gives you
access to MKE through a SAML 2.0-compliant identity provider.
MKE supports the Okta and ADFS
identity providers.
The SAML integration process is as follows.
Configure the Identity Provider (IdP).
Enable SAML and configure MKE as the Service Provider under Admin
Settings > Authentication and Authorization.
Create (Edit) Teams to link with the Group memberships. This updates
team membership information when a user signs in with SAML.
Note
If you enable LDAP integration, you cannot enable SAML for authentication.
Note, though, that this does not affect local MKE user account
authentication.
Identity providers require certain values to successfully integrate
with MKE. As these values vary depending on the identity provider,
consult your identity provider documentation for instructions on
how to best provide the needed information.
URL for MKE, qualified with /enzi/v0/saml/acs. For example,
https://111.111.111.111/enzi/v0/saml/acs.
Service provider audience URI
URL for MKE, qualified with /enzi/v0/saml/metadata. For example,
https://111.111.111.111/enzi/v0/saml/metadata.
NameID format
Select Unspecified.
Application user name
Email. For example, a custom ${f:substringBefore(user.email,"@")}
specifies the user name portion of the email address.
Attribute Statements
Name: fullname
Value: user.displayName
Group Attribute Statement
Name: member-of
Filter: (user defined) for associate group membership.
The group name is returned with the assertion.
Name: is-admin
Filter: (user defined) for identifying whether the user is an admin.
Okta configuration
When two or more group names are expected to return with the assertion,
use the regex filter. For example, use the value apple|orange to
return groups apple and orange.
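The effect of such a filter can be previewed locally. The following sketch assumes that the pattern must match the entire group name, which is an assumption about how the identity provider applies it. The function name and the group names are hypothetical.

```python
import re


def groups_matching(pattern: str, groups: list) -> list:
    """Return the group names that the regex filter would pass through.

    Assumption: the pattern has to match the whole group name.
    """
    regex = re.compile(pattern)
    return [group for group in groups if regex.fullmatch(group)]


print(groups_matching("apple|orange", ["apple", "orange", "pear", "pineapple"]))
```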
The service provider metadata URI value is the URL for
MKE, qualified with /enzi/v0/saml/metadata. For example,
https://111.111.111.111/enzi/v0/saml/metadata.
SAML configuration requires that you know the metadata URL for your chosen
identity provider, as well as the URL for the MKE host that contains the IP
address or domain of your MKE installation.
To configure SAML integration on MKE:
Log in to the MKE web UI.
In the navigation menu at the left, click the user name
drop-down to display the available options.
Click Admin Settings to display the available
options.
Click Authentication & Authorization.
In the Identity Provider section in the details pane, move the
slider next to SAML to enable the SAML settings.
In the SAML idP Server subsection, enter the URL for the
identity provider metadata in the IdP Metadata URL field.
Note
If the metadata URL is publicly certified, you can continue with the
default settings:
Skip TLS Verification unchecked
Root Certificates Bundle blank
Mirantis recommends TLS verification in production environments. If the
metadata URL cannot be certified by the default certificate authority
store, you must provide the certificates from the identity provider in
the Root Certificates Bundle field.
In the SAML Service Provider subsection, in the MKE
Host field, enter the URL that includes the IP address or
domain of your MKE installation.
The port number is optional. The current IP address or domain displays by
default.
(Optional) Customize the text of the sign-in button by entering the text for
the button in the Customize Sign In Button Text field. By
default, the button text is Sign in with SAML.
Copy the SERVICE PROVIDER METADATA URL, the
ASSERTION CONSUMER SERVICE (ACS) URL, and the SINGLE
LOGOUT (SLO) URL to paste into the identity provider workflow.
Click Save.
Note
To configure a service provider, enter the Identity Provider’s metadata
URL to obtain its metadata. To access the URL, you may need to provide the
CA certificate that can verify the remote server.
To link group membership with users, use the Edit or
Create team dialog to associate SAML group assertion with the
MKE team to synchronize user team membership when the user logs in.
From the MKE web UI you can download a client bundle with which you can access
MKE using the CLI and the API.
A client bundle is a group of certificates that enable command-line access and
API access to the software. It lets you authorize a remote Docker engine to
access specific user accounts that are managed in MKE, absorbing all associated
RBAC controls in the process. Once you obtain the client bundle, you can
execute Docker Swarm commands from your remote machine to take effect on the
remote cluster.
Previously-authorized client bundle users can still access MKE, regardless
of the newly configured SAML access controls.
Mirantis recommends that you take the following steps to ensure that access
from the client bundle is in sync with the identity provider, and to thus
prevent any previously-authorized users from accessing MKE through their
existing client bundle:
Remove the user account from MKE that grants the client bundle
access.
If group membership in the identity provider changes, replicate
the change in MKE.
Continue using LDAP to sync group membership.
To download the client bundle:
Log in to the MKE web UI.
In the navigation menu at the left, click the user name
drop-down to display the available options.
Click your account name to display the available options.
Click My Profile.
Click the New Client Bundle drop-down in the details pane and
select Generate Client Bundle.
(Optional) Enter a name for the bundle into the Label field.
System for Cross-domain Identity Management (SCIM) is a standard for automating
the exchange of user identity information between identity domains or IT
systems. It offers an LDAP alternative for provisioning and managing users
and groups in MKE, as well as for syncing users and groups with an upstream
identity provider. Using the SCIM schema and API, you can use single
sign-on (SSO) services across various tools.
Mirantis certifies the use of Okta 3.2.0. However, MKE offers the discovery
endpoints necessary to provide any system or application with the product
SCIM configuration.
In the SCIM configuration subsection, either enter the API
token in the API Token field or click Generate
to have MKE generate a UUID.
The base URL for all SCIM API calls is
https://<HostIP>/enzi/v0/scim/v2/. All SCIM methods are accessible as
API endpoints under this base URL.
Bearer Auth is the API authentication method. When configured, you access SCIM
API endpoints through the Bearer <token> HTTP Authorization request header.
Note
SCIM API endpoints are not accessible by any other user (or their
token), including the MKE administrator and MKE admin Bearer token.
The only SCIM authentication method that MKE supports is an HTTP
Authorization request header that contains a Bearer token.
Returns a list of SCIM users (by default, 200 users per page).
Use the startIndex and count query parameters to paginate long lists of
users. For example, to retrieve the first 20 users, set startIndex to 1
and count to 20, as in the following request:
GET <HostIP>/enzi/v0/scim/v2/Users?startIndex=1&count=20
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
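The same paginated request can also be assembled programmatically. The
following is a minimal Python sketch that assumes only what is stated above:
the SCIM base URL, the startIndex and count query parameters, and Bearer
authentication. The function name build_scim_users_request is hypothetical,
and the host and token are the placeholder values from the example. The sketch
only builds the request object and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_scim_users_request(host, token, start_index=1, count=20):
    """Build (but do not send) a paginated GET request for SCIM users.

    host and token are placeholders supplied by the caller; the path is the
    SCIM base URL given in this documentation.
    """
    query = urlencode({"startIndex": start_index, "count": count})
    url = f"https://{host}/enzi/v0/scim/v2/Users?{query}"
    headers = {
        "Accept": "application/scim+json",
        "Authorization": f"Bearer {token}",
    }
    return Request(url, headers=headers, method="GET")


request = build_scim_users_request("example.com", "h480djs93hd8")
print(request.get_method(), request.full_url)
```

To fetch the next page of 20 users, call the function again with
start_index=21.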
The response to the previous query returns paging metadata that is similar to
the following example:
Reactivate inactive users by specifying "active":true. To deactivate
active users, specify "active":false. The value of the {id} should be
the user’s ID.
All attribute values are overwritten, including attributes for which empty
values or no values have been provided. If a previously set attribute value is
left blank during a PUT operation, the value is updated with a blank value
in accordance with the attribute data type and storage provider. The value of
the {id} should be the user’s ID.
Updates an existing group resource, allowing the addition or removal of
individual (or groups of) users from the group with a single operation. Add
is the default operation.
To remove members from a group, set the operation attribute of a member object
to delete.
Updates an existing group resource, overwriting all values for a group
even if an attribute is empty or is not provided.
PUT replaces all members of a group with members that are provided by way
of the members attribute. If a previously set attribute is left blank
during a PUT operation, the new value is set to blank in accordance with
the data type of the attribute and the storage provider.
Returns a JSON structure that describes the SCIM specification features
that are available on a service provider using a schemas attribute of
urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig.
MKE integrates with LDAP directory services, so that you can
manage users and groups from your organization’s directory and have
that information automatically propagated to MKE and MSR.
If you enable LDAP, MKE uses a remote directory server to create users
automatically, and all logins are forwarded to the directory server.
When you switch from built-in authentication to LDAP authentication, all
manually created users whose usernames don’t match any LDAP search
results are still available.
When you enable LDAP authentication, you can choose whether MKE creates
user accounts only when users log in for the first time. Select the
Just-In-Time User Provisioning option to ensure that the only LDAP
accounts that exist in MKE are those that have had a user log in to MKE.
Note
If you enable SAML integration, you cannot enable LDAP for authentication.
This does not affect local MKE user account authentication.
You control how MKE integrates with LDAP by creating searches for users.
You can specify multiple search configurations, and you can specify
multiple LDAP servers to integrate with. Searches start with the
BaseDN, which is the distinguished name of the node in the LDAP
directory tree where the search starts looking for users.
Access LDAP settings by navigating to the Authentication &
Authorization page in the MKE web interface. There are two sections
for controlling LDAP searches and servers.
LDAP user search configurations: This is the section of the
Authentication & Authorization page where you specify search
parameters, like BaseDN, scope, filter, the username
attribute, and the fullname attribute. These searches are stored
in a list, and the ordering may be important, depending on your
search configuration.
LDAP server: This is the section where you specify the URL of an
LDAP server, TLS configuration, and credentials for doing the search
requests. Also, you provide a domain for all servers but the first
one. The first server is considered the default domain server. Any
others are associated with the domain that you specify in the page.
Here’s what happens when MKE synchronizes with LDAP:
MKE creates a set of search results by iterating over each of the
user search configs, in the order that you specify.
MKE chooses an LDAP server from the list of domain servers by
considering the BaseDN from the user search config and selecting
the domain server that has the longest domain suffix match.
If no domain server has a domain suffix that matches the BaseDN
from the search config, MKE uses the default domain server.
MKE combines the search results into a list of users and creates MKE
accounts for them. If the Just-In-Time User Provisioning option
is set, user accounts are created only when users first log in.
The domain server to use is determined by the BaseDN in each search
config. MKE doesn’t perform search requests against each of the domain
servers, only the one which has the longest matching domain suffix, or
the default if there’s no match.
Here’s an example. Let’s say we have three LDAP domain servers:
Domain                                 Server URL
default                                ldaps://ldap.example.com
dc=subsidiary1,dc=com                  ldaps://ldap.subsidiary1.com
dc=subsidiary2,dc=subsidiary1,dc=com   ldaps://ldap.subsidiary2.com
Here are three user search configs with the following BaseDNs:
baseDN=ou=people,dc=subsidiary1,dc=com
For this search config, dc=subsidiary1,dc=com is the only server
with a domain which is a suffix, so MKE uses the server
ldaps://ldap.subsidiary1.com for the search request.
For this search config, two of the domain servers have a domain which
is a suffix of this base DN, but
dc=subsidiary2,dc=subsidiary1,dc=com is the longer of the two, so
MKE uses the server ldaps://ldap.subsidiary2.com for the search
request.
baseDN=ou=eng,dc=example,dc=com
For this search config, there is no server with a domain specified
which is a suffix of this base DN, so MKE uses the default server,
ldaps://ldap.example.com, for the search request.
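The selection rule described above (longest matching domain suffix, otherwise
the default server) can be sketched in Python as follows. This is an
illustration, not the MKE implementation: the function names are hypothetical,
and the DN splitting is deliberately naive in that it does not handle escaped
commas inside a DN component.

```python
def split_dn(dn):
    """Split a DN into normalized components (naive: ignores escaped commas)."""
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def select_domain_server(base_dn, domain_servers, default_url):
    """Return the server whose domain is the longest suffix of base_dn.

    domain_servers maps a domain (as a DN suffix) to a server URL. When no
    domain is a suffix of base_dn, the default server is returned.
    """
    base = split_dn(base_dn)
    best_url, best_len = default_url, 0
    for domain, url in domain_servers.items():
        suffix = split_dn(domain)
        n = len(suffix)
        if 0 < n <= len(base) and base[-n:] == suffix and n > best_len:
            best_url, best_len = url, n
    return best_url


# The three domain servers from the example above.
DEFAULT_SERVER = "ldaps://ldap.example.com"
DOMAIN_SERVERS = {
    "dc=subsidiary1,dc=com": "ldaps://ldap.subsidiary1.com",
    "dc=subsidiary2,dc=subsidiary1,dc=com": "ldaps://ldap.subsidiary2.com",
}

for base_dn in ("ou=people,dc=subsidiary1,dc=com", "ou=eng,dc=example,dc=com"):
    print(base_dn, "->", select_domain_server(base_dn, DOMAIN_SERVERS, DEFAULT_SERVER))
```

Comparing whole DN components, rather than raw strings, avoids treating
dc=xsubsidiary1,dc=com as if it ended in dc=subsidiary1,dc=com.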
If there are username collisions for the search results between
domains, MKE uses only the first search result, so the ordering of the
user search configs may be important. For example, if both the first and
third user search configs result in a record with the username
jane.doe, the first has higher precedence and the third is ignored.
For this reason, it’s important to choose a username attribute
that’s unique for your users across all domains.
Because names may collide, it’s a good idea to use something unique to
the subsidiary, like the email address for each person. Users can log in
with the email address, for example, jane.doe@subsidiary1.com.
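The first-result-wins behavior for username collisions can be illustrated
with a short Python sketch. The record layout (dictionaries with username and
dn keys) and the function name are assumptions made for this example only.

```python
def merge_search_results(results_per_config):
    """Merge per-config search results, keeping the first record per username.

    results_per_config is a list of result lists, in the same order as the
    user search configs. Later records with an already seen username are
    ignored.
    """
    merged = {}
    for results in results_per_config:
        for user in results:
            merged.setdefault(user["username"], user)
    return list(merged.values())


first = [{"username": "jane.doe", "dn": "uid=jane.doe,ou=people,dc=subsidiary1,dc=com"}]
second = [{"username": "john.roe", "dn": "uid=john.roe,ou=people,dc=subsidiary1,dc=com"}]
third = [{"username": "jane.doe", "dn": "uid=jane.doe,ou=eng,dc=example,dc=com"}]

for user in merge_search_results([first, second, third]):
    print(user["username"], user["dn"])
```

With a unique username attribute, such as an email address, the two jane.doe
records would have different keys and both would be kept.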
To configure MKE to create and authenticate users by using an LDAP
directory, go to the MKE web interface, navigate to the Admin
Settings page, and click Authentication & Authorization to select
the method used to create and authenticate users.
In the LDAP Enabled section, click Yes. Now configure your LDAP
directory integration.
Use this setting to change the default permissions of new users.
Click the drop-down menu to select the permission level that MKE assigns
by default to the private collections of new users. For example, if you
change the value to ViewOnly, all users who log in for the first
time after the setting is changed have ViewOnly access to their
private collections, but permissions remain unchanged for all existing
users.
The distinguished name of the LDAP account used for searching entries in
the LDAP server. As a best practice, this should be an LDAP read-only
user.
Reader password
The password of the account used for searching entries in the LDAP
server.
Use Start TLS
Whether to authenticate/encrypt the connection after connecting to the
LDAP server over TCP. If you set the LDAP Server URL field with
ldaps://, this field is ignored.
Skip TLS verification
Whether to skip verification of the LDAP server certificate when using
TLS. If verification is skipped, the connection is still encrypted but
vulnerable to man-in-the-middle attacks.
No simple pagination
Select this option if your LDAP server doesn’t support pagination.
Just-In-Time User Provisioning
Whether to create user accounts only when users log in for the first
time. The default value of true is recommended. If you upgraded from
UCP 2.0.x, the default is false.
Note
LDAP connections using certificates created with TLS v1.2 do not
currently advertise support for sha512WithRSAEncryption in the TLS
handshake which leads to issues establishing connections with some
clients. Support for advertising sha512WithRSAEncryption will be
added in MKE 3.1.0.
Click Confirm to add your LDAP domain.
To integrate with more LDAP servers, click Add LDAP Domain.
The distinguished name of the node in the directory tree where the
search should start looking for users.
Username attribute
The LDAP attribute to use as username on MKE. Only user entries with a
valid username will be created. A valid username is no longer than 100
characters and does not contain any unprintable characters, whitespace
characters, or any of the following characters: /\[]:;|=,+*?<>'".
Full name attribute
The LDAP attribute to use as the user’s full name for display purposes.
If left empty, MKE will not create new users with a full name value.
Filter
The LDAP search filter used to find users. If you leave this field
empty, all directory entries in the search scope with valid username
attributes are created as users.
Search subtree instead of just one level
Whether to perform the LDAP search on a single level of the LDAP tree,
or search through the full LDAP tree starting at the Base DN.
Match Group Members
Whether to further filter users by selecting those who are also members
of a specific group on the directory server. This feature is helpful if
the LDAP server does not support memberOf search filters.
Iterate through group members
If SelectGroupMembers is selected, this option searches for users
by first iterating over the target group’s membership, making a separate
LDAP query for each member, as opposed to first querying for all users
which match the above search query and intersecting those with the set
of group members. This option can be more efficient in situations where
the number of members of the target group is significantly smaller than
the number of users which would match the above search filter, or if
your directory server does not support simple pagination of search
results.
Group DN
If SelectGroupMembers is selected, this specifies the
distinguished name of the group from which to select users.
Group Member Attribute
If SelectGroupMembers is selected, the value of this group
attribute corresponds to the distinguished names of the members of the
group.
To configure more user search queries, click Add LDAP User Search
Configuration again. This is useful in cases where users may be found
in multiple distinct subtrees of your organization’s directory. Any user
entry which matches at least one of the search configurations will be
synced as a user.
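The username validity rule given for the Username attribute field can be
expressed as a small check. The following Python sketch is illustrative only:
the function name is hypothetical, and treating an empty string as invalid is
an assumption that the text above does not state.

```python
# Characters that the documentation lists as not allowed in a username.
FORBIDDEN_CHARACTERS = set("/\\[]:;|=,+*?<>'\"")


def is_valid_username(name):
    """Check a candidate username against the documented constraints.

    A valid username is at most 100 characters long and contains no
    unprintable characters, no whitespace, and none of the forbidden
    characters. Rejecting the empty string is an assumption of this sketch.
    """
    if not name or len(name) > 100:
        return False
    return all(
        ch.isprintable() and not ch.isspace() and ch not in FORBIDDEN_CHARACTERS
        for ch in name
    )


for candidate in ("jane.doe@subsidiary1.com", "jane doe", "jane/doe"):
    print(repr(candidate), is_valid_username(candidate))
```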
An LDAP username for testing authentication to this application. This
value corresponds with the Username Attribute specified in the
LDAP user search configurations section.
Password
The user’s password used to authenticate (BIND) to the directory server.
Before you save the configuration changes, you should test that the
integration is correctly configured. You can do this by providing the
credentials of an LDAP user, and clicking the Test button.
The interval, in hours, to synchronize users between MKE and the LDAP
server. When the synchronization job runs, new users found in the LDAP
server are created in MKE with the default permission level. MKE users
that don’t exist in the LDAP server become inactive.
Enable sync of admin users
This option specifies that system admins should be synced directly with
members of a group in your organization’s LDAP directory. The admins
will be synced to match the membership of the group. The configured
recovery admin user will also remain a system admin.
Once you’ve configured the LDAP integration, MKE synchronizes users
based on the interval you’ve defined, starting at the top of the hour.
When the synchronization runs, MKE stores logs that can help you
troubleshoot when something goes wrong.
You can also manually synchronize users by clicking Sync Now.
When a user is removed from LDAP, the effect on the user’s MKE account
depends on the Just-In-Time User Provisioning setting:
Just-In-Time User Provisioning is false: Users deleted from
LDAP become inactive in MKE after the next LDAP synchronization runs.
Just-In-Time User Provisioning is true: Users deleted from
LDAP can’t authenticate, but their MKE accounts remain active. This
means that they can use their client bundles to run commands. To
prevent this, deactivate their MKE user accounts.
Data synced from your organization’s LDAP directory
MKE saves a minimum amount of user data required to operate. This
includes the value of the username and full name attributes that you
have specified in the configuration as well as the distinguished name of
each synced user. MKE does not store any additional data from the
directory server.
As of MKE 3.1.5, LDAP-specific GET and PUT API endpoints have
been added to the Config resource. Note that swarm mode has to be
enabled before you can hit the following endpoints:
GET /api/ucp/config/auth/ldap - Returns information on your
current system LDAP configuration.
PUT /api/ucp/config/auth/ldap - Lets you update your LDAP
configuration.
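As a rough illustration of how a client might address these two endpoints,
the following Python sketch builds, but does not send, the corresponding
requests. Only the endpoint path comes from this documentation. The function
name is hypothetical, sending the PUT body as JSON is an assumption, and
authentication (for example, with the certificates from a client bundle) is
intentionally left out.

```python
import json
from urllib.request import Request

# Endpoint path as listed in this documentation.
LDAP_CONFIG_PATH = "/api/ucp/config/auth/ldap"


def build_ldap_config_request(host, new_config=None):
    """Build a GET request, or a PUT request when new_config is given.

    The JSON encoding of the PUT body is an assumption of this sketch, and
    the shape of new_config is left entirely to the caller.
    """
    url = f"https://{host}{LDAP_CONFIG_PATH}"
    if new_config is None:
        return Request(url, method="GET")
    body = json.dumps(new_config).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


read_request = build_ldap_config_request("mke.example.com")
print(read_request.get_method(), read_request.full_url)
```

A reasonable workflow is to retrieve the current configuration with GET,
modify the returned document, and submit the modified document with PUT.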
You can configure MKE to allow users to deploy and run services in worker
nodes only, to ensure that all cluster management functionality remains
performant and to enhance cluster security.
Important
If for whatever reason a user deploys a malicious service that can affect
the node on which it is running, that service will not be able to strike any
other nodes in the cluster or have any impact on cluster management
functionality.
Restrict services deployment to Swarm worker nodes
To keep manager nodes performant, it is necessary at times to restrict
service deployment to Swarm worker nodes.
To restrict services deployment to Swarm worker nodes:
Log in to the MKE web UI with administrator credentials.
Click the user name at the top of the navigation menu.
Navigate to Admin Settings > Orchestration.
Under Container Scheduling, toggle all of the sliders to the
left to restrict the deployment only to worker nodes.
Note
Creating a grant with the Scheduler role against the / collection
takes precedence over any other grants with NodeSchedule on
subcollections.
Restrict services deployment to Kubernetes worker nodes
By default, MKE clusters use Kubernetes taints and tolerations
to prevent user workloads from deploying to MKE manager or MSR nodes.
Note
Workloads deployed by an administrator in the kube-system namespace do
not follow scheduling constraints. If an administrator deploys a
workload in the kube-system namespace, a toleration is applied to bypass
the taint, and the workload is scheduled on all node types.
Schedule services deployment on manager and MSR nodes
Log in to the MKE web UI with administrator credentials.
Click the user name at the top of the navigation menu.
Navigate to Admin Settings > Orchestration.
Select from the following options:
Under Container Scheduling, toggle to the right the slider
for Allow administrators to deploy containers on MKE managers
or nodes running MSR.
Under Container Scheduling, toggle to the right the slider
for Allow all authenticated users, including service accounts,
to schedule on all nodes, including MKE managers and MSR nodes.
Following any scheduling action, MKE applies a toleration to new workloads, to
allow the Pods to be scheduled on all node types. For existing workloads,
however, it is necessary to manually add the toleration to the Pod
specification.
Add a toleration to the Pod specification for existing workloads
Add the following toleration to the Pod specification, either through the
MKE web UI or using the kubectl edit <object> <workload> command:
A NoSchedule taint is present on MKE manager and MSR nodes. If you
disable scheduling on managers and/or workers, a toleration for that taint
is not applied to the deployments. As such, you should not schedule on
these nodes, except when the Kubernetes workload is deployed in the
kube-system namespace.
With MKE you can force applications to use only Docker images that are signed
by MKE users you trust. Every time a user attempts to deploy an application to
the cluster, MKE verifies that the application is using a trusted Docker
image. If a trusted Docker image is not in use, MKE halts the deployment.
By signing and verifying the Docker images, you ensure that the images in use
in your cluster are trusted and have not been altered, either in
the image registry or on their way from the image registry to your MKE cluster.
Example workflow
A developer makes changes to a service and pushes their changes to a
version control system.
A CI system creates a build, runs tests, and pushes an image to the Mirantis
Secure Registry (MSR) with the new changes.
The quality engineering team pulls the image, runs more tests, and signs
and pushes the image if the image is verified.
IT operations deploys the service, but only if the image in use is signed
by the QA team. Otherwise, MKE will not deploy.
To configure MKE to only allow running services that use Docker trusted
images:
Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display
the available options.
Click Admin Settings > Docker Content Trust to reveal the
Content Trust Settings page.
Enable Run only signed images.
Important
At this point, MKE allows the deployment of any signed image, regardless
of who signed it.
(Optional) Make it necessary for the image to be signed by a particular
team or group of teams:
Click Add Team+ to reveal the two-part tool.
From the drop-down at the left, select an organization.
From the drop-down at the right, select a team belonging to the
organization you selected.
Repeat the procedure to configure additional teams.
Note
If you specify multiple teams, the image must be signed by
a member of each team, or someone who is a member of all of the
teams.
Click Save.
MKE immediately begins enforcing the image trust policy. Existing services
continue to run and you can restart them as necessary. From this point,
however, MKE only allows the deployment of new services that use a
trusted image.
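The signing policy described in this procedure can be summarized as a simple
decision function. The following Python sketch is a model of the documented
rules, not the MKE implementation. The function name, the team names, and the
representation of signers and team membership as sets of user names are all
assumptions made for illustration.

```python
def deployment_allowed(signers, required_teams, team_members, run_only_signed=True):
    """Decide whether an image may be deployed under the trust policy.

    signers is the set of users who signed the image, required_teams lists
    the teams configured in the policy, and team_members maps a team name to
    the set of its members.
    """
    if not run_only_signed:
        return True  # policy disabled: any image may be deployed
    signers = set(signers)
    if not signers:
        return False  # unsigned images are rejected
    # With no teams configured, any signature is enough. Otherwise every
    # configured team must be represented by at least one signer.
    return all(signers & team_members.get(team, set()) for team in required_teams)


TEAMS = {"qa": {"alice", "bob"}, "security": {"bob", "carol"}}

print(deployment_allowed({"alice"}, ["qa", "security"], TEAMS))
print(deployment_allowed({"bob"}, ["qa", "security"], TEAMS))
```

In the example, alice alone does not satisfy a policy that requires both
teams, whereas bob does because he is a member of both.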
MKE enables the setting of various user session properties, such as
session timeout and the permitted number of concurrent sessions.
To configure MKE login session properties:
Log in to the MKE web UI.
In the left-side navigation menu, click the user name drop-down to display
the available options.
Click Admin Settings > Authentication & Authorization to reveal
the MKE login session controls.
The following table offers information on the MKE login session controls:
Field
Description
Lifetime Minutes
The set duration of a login session in minutes, starting from the moment
MKE generates the session. MKE invalidates the active session once this
period expires and the user must re-authenticate to establish a
new session.
Default: 60
Minimum: 10
Renewal Threshold Minutes
The period in minutes before session expiration within which use of an
active session causes MKE to extend it. MKE extends the session by the amount
specified in Lifetime Minutes. The threshold value cannot be
greater than that set in Lifetime Minutes.
To specify that sessions not be extended, set the threshold value
to 0. Be aware, though, that this may cause MKE web
UI users to be unexpectedly logged out.
Default: 20
Maximum: 5 minutes less than Lifetime Minutes
Per User Limit
The maximum number of sessions a user can have running
simultaneously. If the creation of a new session results in the
exceeding of this limit, MKE will delete the session least recently put
to use. Specifically, every time you use a session token, the server
marks it with the current time (lastUsed metadata). When the creation of
a new session exceeds the per-user limit, the session
with the oldest lastUsed time is deleted, which is not necessarily
the oldest session.
To disable the Per User Limit setting, set the value to 0.
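The interplay of the three session controls can be modeled in a few lines of
Python. This is a simplified sketch under stated assumptions, not the MKE
implementation: the class and method names are hypothetical, time is a plain
number of minutes supplied by the caller, and the default per-user limit of
10 is taken from the per_user_limit configuration parameter.

```python
from dataclasses import dataclass


@dataclass
class Session:
    token: str
    expires_at: float  # minutes on a caller-supplied clock
    last_used: float


class SessionTable:
    def __init__(self, lifetime=60, renewal_threshold=20, per_user_limit=10):
        self.lifetime = lifetime
        self.renewal_threshold = renewal_threshold
        self.per_user_limit = per_user_limit
        self._sessions = {}  # user name -> list of Session

    def create(self, user, token, now):
        sessions = self._sessions.setdefault(user, [])
        sessions.append(Session(token, now + self.lifetime, now))
        # A limit of 0 disables the check. Otherwise, delete the least
        # recently used session, which is not necessarily the oldest one.
        while self.per_user_limit > 0 and len(sessions) > self.per_user_limit:
            sessions.remove(min(sessions, key=lambda s: s.last_used))

    def use(self, user, token, now):
        """Return True if the session is still valid, renewing it if due."""
        sessions = self._sessions.get(user, [])
        for session in sessions:
            if session.token != token:
                continue
            if now >= session.expires_at:
                sessions.remove(session)
                return False
            session.last_used = now
            remaining = session.expires_at - now
            # A threshold of 0 disables session extension.
            if self.renewal_threshold > 0 and remaining <= self.renewal_threshold:
                session.expires_at = now + self.lifetime
            return True
        return False

    def tokens(self, user):
        return [s.token for s in self._sessions.get(user, [])]


table = SessionTable(lifetime=60, renewal_threshold=20, per_user_limit=2)
table.create("jane", "a", now=0)
table.create("jane", "b", now=1)
table.use("jane", "a", now=5)     # "a" is now more recently used than "b"
table.create("jane", "c", now=6)  # exceeds the limit, so "b" is deleted
print(table.tokens("jane"))
```

Note that session "a" survives even though it is the oldest session, because
it was used more recently than session "b".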
The MKE configuration file documentation is up-to-date for the latest MKE
3.3.x release. As such, if you are running an earlier version of MKE, you
may encounter detail for configuration options and parameters that are not
applicable to the version of MKE you are currently running.
Refer to the MKE Release Notes for specific
version-by-version information on MKE configuration file additions and
changes.
You configure an MKE cluster by applying a TOML file. You use this file,
the MKE configuration file, to import and export MKE configurations, both to
create new MKE instances and to modify existing ones.
Refer to example-config in the MKE CLI reference documentation
to learn how to download an example MKE configuration file.
Put the MKE configuration file to work for the following use cases:
Set the configuration file to run at the install time of new MKE clusters
Use the API to import the file back into the same cluster
Use the API to import the file into multiple clusters
To make use of an MKE configuration file, you edit the file using either the
MKE web UI or the command line interface (CLI). Using the CLI, you can either
export the existing configuration file for editing, or use the
example-config command to view and edit an example TOML MKE
configuration file.
Working as an MKE admin, use the config-toml API from within the directory
of your client certificate bundle to export the current MKE settings to a TOML
file.
As detailed herein, the command set exports the current configuration for the
MKE hostname MKE_HOST to a file named mke-config.toml:
To customize a new MKE instance using a configuration file, you must create the
file prior to installation. Then, once the new configuration file is ready, you
can configure MKE to import it during the installation process using Docker
Swarm.
To import a configuration file at installation:
Create a Docker Swarm Config object named com.docker.mke.config whose
value is the TOML content of your MKE configuration file.
When installing MKE on the cluster, specify the --existing-config flag
to force the installer to use the new Docker Swarm Config object for its
initial configuration.
Following the installation, delete the com.docker.mke.config object.
The length of time, in minutes, before the expiration of a session
within which, if the session is used, MKE extends it by the currently
configured lifetime, counted from the moment of use. A value of 0 disables
session extension.
Default: 20
per_user_limit
no
The maximum number of sessions that a user can have simultaneously
active. If creating a new session will put a user over this
limit, the least recently used session is deleted.
A value of 0 disables session limiting.
Default: 10
store_token_per_session
no
If set, the user token is stored in sessionStorage instead of
localStorage. Setting this option logs the user out and
requires that they log back in, as they are actively changing the manner
in which their authentication is stored.
An array of tables that specifies the MSR instances that are managed by the
current MKE instance.
Parameter
Required
Description
host_address
yes
Sets the address for connecting to the MSR instance tied to the MKE
cluster.
service_id
yes
Sets the MSR instance’s OpenID Connect Client ID, as registered with the
Docker authentication provider.
ca_bundle
no
Specifies the root CA bundle for the MSR instance if you are using a
custom certificate authority (CA). The value is a string with the
contents of a ca.pem file.
Specifies scheduling options and the default orchestrator for new nodes.
Note
If you run a kubectl command, such as kubectl describe
nodes, to view scheduling rules on Kubernetes nodes, the results that
present do not reflect the MKE admin settings configuration. MKE uses taints
to control container scheduling on nodes, which is thus unrelated to the
kubectl Unschedulable boolean flag.
Parameter
Required
Description
enable_admin_ucp_scheduling
no
Determines whether administrators can schedule containers on
manager nodes.
Valid values: true, false.
Default: false
You can also set the parameter using the MKE web UI:
Log in to the MKE web UI as an administrator.
Click the user name drop-down in the left-side navigation panel.
Click Admin Settings > Orchestration to view the
Orchestration screen.
Scroll down to the Container Scheduling section and
toggle on the Allow administrators to deploy containers on
MKE managers or nodes running MSR slider.
default_node_orchestrator
no
Sets the type of orchestrator to use for new nodes that join
the cluster.
Set to require the signing of images by content trust.
Valid values: true, false.
Default: false
You can also set the parameter using the MKE web UI:
Log in to the MKE web UI as an administrator.
Click the user name drop-down in the left-side navigation panel.
Click Admin Settings > Docker Content Trust to open the
Content Trust Settings screen.
Toggle on the Run only signed images slider.
require_signature_from
no
A string array that specifies which users or teams must sign images.
allow_repos
no
A string array that specifies repos that are to bypass content trust
check, for example, ["docker.io/mirantis/dtr-rethink","docker.io/mirantis/dtr-registry"....].
Configures the logging options for MKE components.
Parameter
Required
Description
protocol
no
The protocol to use for remote logging.
Valid values: tcp, udp.
Default: tcp
host
no
Specifies a remote syslog server to receive sent MKE controller logs. If
omitted, controller logs are sent through the default Docker daemon
logging driver from the ucp-controller container.
Set to enable attempted automatic license renewal when the
license nears expiration. If disabled, you must manually upload a renewed
license after expiration.
Included when you need to set custom API headers. You can repeat this
section multiple times to specify multiple separate headers. If you
include custom headers, you must specify both name and value.
[[custom_api_server_headers]]
Item
Description
name
Set to specify the name of the custom header with name =
"X-Custom-Header-Name".
value
Set to specify the value of the custom header with value = "Custom
Header Value".
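Because the section is repeated once per header, it can be convenient to
generate it from a list of name and value pairs. The following Python sketch
renders such pairs as repeated [[custom_api_server_headers]] sections and
enforces the rule that both name and value are specified. The function name
is hypothetical, and using json.dumps to quote the strings is a shortcut that
is assumed to be adequate for ordinary header names and values.

```python
import json


def render_custom_api_server_headers(headers):
    """Render (name, value) pairs as repeated TOML sections.

    Raises ValueError if a pair is missing its name or its value.
    """
    blocks = []
    for name, value in headers:
        if not name or not value:
            raise ValueError("each custom header needs both a name and a value")
        blocks.append(
            "[[custom_api_server_headers]]\n"
            f"name = {json.dumps(name, ensure_ascii=False)}\n"
            f"value = {json.dumps(value, ensure_ascii=False)}\n"
        )
    # Separate the sections with a blank line for readability.
    return "\n".join(blocks)


print(render_custom_api_server_headers([
    ("X-Custom-Header-Name", "Custom Header Value"),
]))
```

The rendered text can then be pasted into the MKE configuration file.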
Configures the cluster that the current MKE instance manages.
The dns, dns_opt, and dns_search settings configure the DNS
settings for MKE components. These values, when assigned, override the
settings in a container /etc/resolv.conf file.
Parameter
Required
Description
controller_port
yes
Sets the port that the ucp-controller monitors.
Default: 443
kube_apiserver_port
yes
Sets the port the Kubernetes API server monitors.
swarm_port
yes
Sets the port that the ucp-swarm-manager monitors.
Default: 2376
swarm_strategy
no
Sets placement strategy for container scheduling. Be aware that this
does not affect swarm-mode services.
Valid values: spread, binpack, random.
dns
yes
Array of IP addresses that serve as nameservers.
dns_opt
yes
Array of options in use by DNS resolvers.
dns_search
yes
Array of domain names to search whenever a bare unqualified host name is
used inside of a container.
profiling_enabled
no
Determines whether specialized debugging endpoints are enabled for
profiling MKE performance.
Valid values: true, false.
Default: false
authz_cache_timeout
no
Sets the timeout in seconds for the RBAC information cache of MKE
non-Kubernetes resource listing APIs. Setting changes take immediate
effect and do not require a restart of the MKE controller.
Default: 0 (cache is not enabled)
Once you enable the cache, the result of non-Kubernetes resource listing
APIs only reflects the latest RBAC changes for the user when the
cached RBAC info times out.
kv_timeout
no
Sets the key-value store timeout setting, in milliseconds.
Default: 5000
kv_snapshot_count
Required
Sets the key-value store snapshot count.
Default: 20000
external_service_lb
no
Specifies an optional external load balancer for default links to
services with exposed ports in the MKE web interface.
cni_installer_url
no
Specifies the URL of a Kubernetes YAML file to use to install a
CNI plugin. Only applicable during initial installation. If left empty,
the default CNI plugin is put to use.
metrics_retention_time
no
Sets the metrics retention time.
metrics_scrape_interval
no
Sets the interval for how frequently managers gather metrics from nodes
in the cluster.
metrics_disk_usage_interval
no
Sets the interval for the gathering of storage metrics, an
operation that can become expensive when large volumes are present.
nvidia_device_plugin (available since MKE 3.4.6)
no
Enables the nvidia-gpu-device-plugin, which is disabled by default.
rethinkdb_cache_size
no
Sets the size of the cache for MKE RethinkDB servers.
Default: 1GB
Leaving the field empty or specifying auto instructs
RethinkDB to automatically determine the cache size.
exclude_server_identity_headers
no
Determines whether the X-Server-Ip and X-Server-Name
headers are disabled.
Valid values: true, false.
Default: false
cloud_provider
no
Sets the cloud provider for the Kubernetes cluster.
pod_cidr
yes
Sets the subnet pool from which the CNI IPAM plugin should allocate the
IP for the Pod.
Default: 192.168.0.0/16
calico_mtu
no
Sets the maximum transmission unit (MTU) size for the
Calico plugin.
ipip_mtu
no
Sets the IPIP MTU size for the Calico IPIP tunnel interface.
azure_ip_count
yes
Sets the IP count for Azure allocator to allocate IPs per Azure virtual
machine.
service_cluster_ip_range
yes
Sets the subnet pool from which the IP for Services should be allocated.
Default: 10.96.0.0/16
nodeport_range
yes
Sets the port range within which Kubernetes services of the type
NodePort can be exposed.
Default: 32768-35535
custom_kube_api_server_flags
no
Sets the configuration options for the Kubernetes API server.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
custom_kube_controller_manager_flags
no
Sets the configuration options for the Kubernetes controller manager.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
custom_kubelet_flags
no
Sets the configuration options for kubelet.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
custom_kube_scheduler_flags
no
Sets the configuration options for the Kubernetes scheduler.
Be aware that this parameter function is only for development and
testing. Arbitrary Kubernetes configuration parameters are not tested
and supported under the MKE Software Support Agreement.
local_volume_collection_mapping
no
Set to store data about collections for volumes in the MKE local KV
store instead of on the volume labels. The parameter is used to enforce
access control on volumes.
manager_kube_reserved_resources
no
Reserves resources for MKE and Kubernetes components that are
running on manager nodes.
worker_kube_reserved_resources
no
Reserves resources for MKE and Kubernetes components that are
running on worker nodes.
kubelet_max_pods
yes
Sets the number of Pods that can run on a node.
Maximum: 250
Default: 110
kubelet_pods_per_core
no
Sets the maximum number of Pods per core.
0 indicates that there is no limit on the number of Pods per core.
The number cannot exceed the kubelet_max_pods setting.
Recommended: 10
Default: 0
secure_overlay
no
Enables IPSec network encryption in Kubernetes.
Valid values: true, false.
Default: false
image_scan_aggregation_enabled
no
Enables image scan result aggregation. The feature displays image
vulnerabilities in shared resource/containers and shared
resources/images pages.
Valid values: true, false.
Default: false
swarm_polling_disabled
no
Determines whether auto-refresh (which defaults to 15 seconds) is turned
off. If set to true, the Swarm API is only called once.
Valid values: true, false.
Default: false
oidc_client_id
no
Sets the OIDC client ID, using the eNZi service ID that is in the OIDC
authorization flow.
hide_swarm_ui
no
Determines whether the UI is hidden for all Swarm-only object types (has
no effect on Admin Settings).
Valid values: true, false.
Default: false
You can also set the parameter using the MKE web UI:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, click the user name
drop-down.
Click Admin Settings > Tuning to open the
Tuning screen.
Toggle on the Hide Swarm Navigation slider located under
the Configure MKE UI heading.
unmanaged_cni
yes
Sets Calico as the CNI provider, managed by MKE. Note that Calico is the
default CNI provider.
kube_proxy_mode
yes
Sets the operational mode for kube-proxy.
Valid values: iptables, ipvs, disabled.
Default: iptables
cipher_suites_for_kube_api_server
no
Sets the value for the kube-apiserver --tls-cipher-suites
parameter.
cipher_suites_for_kubelet
no
Sets the value for the kubelet --tls-cipher-suites parameter.
cipher_suites_for_etcd_server
no
Sets the value for the etcd server --cipher-suites
parameter.
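The interaction between kubelet_max_pods, kubelet_pods_per_core, and nodeport_range can be sketched in a few lines of Python. This is an illustrative check only, not MKE code: the function names are hypothetical, and the sketch assumes the usual kubelet behavior in which the per-core limit is capped by the node-wide maximum.

```python
def effective_pod_limit(cpu_cores, kubelet_max_pods=110, kubelet_pods_per_core=0):
    """Return the pod limit a node would end up with under the two settings.

    Assumption: the per-core limit is multiplied by the core count and then
    capped by kubelet_max_pods, as the kubelet does for its own flags.
    """
    if not 1 <= kubelet_max_pods <= 250:
        raise ValueError("kubelet_max_pods must be between 1 and 250")
    if kubelet_pods_per_core < 0:
        raise ValueError("kubelet_pods_per_core cannot be negative")
    if kubelet_pods_per_core > kubelet_max_pods:
        raise ValueError("kubelet_pods_per_core cannot exceed kubelet_max_pods")
    if kubelet_pods_per_core == 0:
        # 0 means there is no per-core limit, so only kubelet_max_pods applies.
        return kubelet_max_pods
    return min(kubelet_max_pods, kubelet_pods_per_core * cpu_cores)


def port_in_nodeport_range(port, nodeport_range="32768-35535"):
    """Check whether a port falls inside a 'low-high' nodeport_range value."""
    low, high = (int(part) for part in nodeport_range.split("-"))
    return low <= port <= high
```

For example, a 4-core node with the recommended kubelet_pods_per_core value of 10 would be limited to 40 Pods, while a 16-core node would still be capped at the default kubelet_max_pods value of 110.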
For detail on how to use the MKE web UI to scale your cluster, refer to
Join Linux nodes or Join Windows worker nodes, depending on which
operating system you use. In particular, these topics offer information on
adding nodes to a cluster and configuring node availability.
You can also use the command line to perform all scaling operations.
Scale operation
Command
Obtain the join token
Run the following command on a manager node to obtain the join token
that is required for cluster scaling. Use either worker or manager for the
<node-type>:
docker swarm join-token <node-type>
Configure a custom listen address
Specify the address and port where the new node listens for inbound
cluster management traffic:
Mirantis Kubernetes Engine (MKE) 3.2.5 adds support for a Key
Management Service (KMS) plugin to allow access to third-party secrets
management solutions, such as Vault. This plugin is used by MKE for
access from Kubernetes clusters.
KMS must be deployed before a machine becomes an MKE manager or it may be
considered unhealthy. MKE will not health check, clean up, or otherwise
manage the KMS plugin.
KMS plugin configuration should be done through MKE. MKE will maintain
ownership of the Kubernetes EncryptionConfig file, where the KMS plugin
is configured for Kubernetes. MKE does not currently check this file’s
contents after deployment.
MKE adds new configuration options to the cluster configuration table.
These options are not exposed through the web UI, but can be configured
via the API.
The following table shows the configuration options for the KMS plugin.
These options are not required.
Parameter
Type
Description
kms_enabled
bool
Determines if MKE should configure a KMS plugin.
kms_name
string
Name of the KMS plugin resource (for example, “vault”).
kms_endpoint
string
Path of the KMS plugin socket. This path must refer to a UNIX socket on
the host (for example, “/tmp/socketfile.sock”). MKE will bind mount this
file to make it accessible to the API server.
kms_cachesize
int
Number of data encryption keys (DEKs) to be cached in the clear.
Mirantis Kubernetes Engine (MKE) can use your local networking drivers to
orchestrate your cluster. You can create a config network with a
driver such as MAC VLAN and use it like any other named network in
MKE. If the network is set up as attachable, you can attach containers to it.
Warning
Encrypting communication between containers on different nodes works
only on overlay networks.
Always use MKE to create node-specific networks. You can use the MKE web
UI or the CLI (with an admin bundle). If you create the networks without
MKE, the networks won’t have the right access labels and won’t be
available in MKE.
In the Macvlan Configure section, select the configuration
option. Create all of the config-only networks before you create the
config-from network.
Config Only: Prefix the config-only network name with a
node hostname prefix, like node1/my-cfg-network,
node2/my-cfg-network, etc. This is necessary to ensure that
the access labels are applied consistently to all of the back-end
config-only networks. MKE routes the config-only network creation
to the appropriate node based on the node hostname prefix. All
config-only networks with the same name must belong in the same
collection, or MKE returns an error. Leaving the access label
empty puts the network in the admin’s default collection, which is
/ in a new MKE installation.
Config From: Create the network from a Docker config. Don’t
set up an access label for the config-from network. The labels of
the network and its collection placement are inherited from the
related config-only networks.
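The naming and collection rules for config-only networks can be sketched as follows. The helper names are hypothetical and the code only illustrates the conventions described above; it does not call MKE.

```python
def config_only_network_names(node_hostnames, network_name):
    """Prefix the shared network name with each node hostname."""
    return [f"{hostname}/{network_name}" for hostname in node_hostnames]


def shared_collection(collection_by_network):
    """Return the single collection that all config-only networks share.

    Raises ValueError when the networks are spread over several collections,
    which mirrors the case in which MKE returns an error.
    """
    collections = set(collection_by_network.values())
    if len(collections) != 1:
        raise ValueError(f"networks span several collections: {sorted(collections)}")
    return collections.pop()
```

Calling config_only_network_names(["node1", "node2"], "my-cfg-network") yields the node1/my-cfg-network and node2/my-cfg-network names used in the example above.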
To ensure all communications between clients and MKE are encrypted, all MKE
services are exposed using HTTPS. By default, this is done using
self-signed TLS certificates that are not trusted by client tools such as
web browsers. Thus, when you try to access MKE, your browser warns that it
does not trust MKE or that MKE has an invalid certificate.
You can configure MKE to use your own TLS certificates. As a result, your
browser and other client tools will trust your MKE installation.
Mirantis recommends that you make this change outside of peak business hours.
Your applications will continue to run normally, but existing MKE client
certificates will become invalid, and thus users will have to download new
certificates to access MKE from the CLI.
To configure MKE to use your own TLS certificates and keys:
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to
<user name> > Admin Settings > Certificates.
Upload your certificates and keys based on the following table.
Note
All keys and certificates must be uploaded in PEM format.
Type
Description
Private key
The unencrypted private key for MKE. This key must correspond to the
public key used in the server certificate. This key does not use a
password.
Click Upload Key to upload a PEM file.
Server certificate
The MKE public key certificate, which establishes a chain of trust up
to the root CA certificate. It is followed by the certificates of any
intermediate certificate authorities.
Click Upload Certificate to upload a PEM file.
CA certificate
The public key certificate of the root certificate authority that
issued the MKE server certificate. If you do not have a CA
certificate, use the top-most intermediate certificate instead.
Click Upload CA Certificate to upload a PEM file.
Client CA
This field may contain one or more Root CA certificates that the MKE
controller uses to verify that client certificates are issued by a
trusted entity.
Click Upload CA Certificate to upload a PEM file.
Click Download MKE Server CA Certificate to download the
certificate as a PEM file.
Note
MKE is automatically configured to trust its internal CAs, which
issue client certificates as part of generated client bundles.
However, you may supply MKE with additional custom root CA
certificates using this field to enable MKE to trust the client
certificates issued by your corporate or trusted third-party
certificate authorities. Note that your custom root certificates
will be appended to MKE internal root CA certificates.
Click Save.
After replacing the TLS certificates, your users will not be able to
authenticate with their old client certificate bundles. Ask your users
to access the MKE web UI and download new client certificate
bundles.
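Before uploading, it can help to verify that the files are in PEM format and that the private key is not password protected. The following Python sketch is a pre-flight check under stated assumptions: the function names are hypothetical, and the check only inspects PEM headers. It does not verify that the key matches the certificate or that the chain is valid.

```python
import re

# Matches one PEM block and captures its label and body.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n(.*?)-----END \1-----", re.DOTALL
)


def pem_blocks(text):
    """Return (label, body) pairs for every PEM block found in text."""
    return [(m.group(1), m.group(2)) for m in PEM_BLOCK.finditer(text)]


def count_certificates(text):
    """Count the CERTIFICATE blocks, for example in a server certificate chain."""
    return sum(1 for label, _ in pem_blocks(text) if label == "CERTIFICATE")


def private_key_is_unencrypted(text):
    """Return True if text holds exactly one private key with no encryption marker."""
    keys = [(label, body) for label, body in pem_blocks(text)
            if label.endswith("PRIVATE KEY")]
    if len(keys) != 1:
        return False
    label, body = keys[0]
    if label == "ENCRYPTED PRIVATE KEY":
        # PKCS#8 keys protected by a password use this label.
        return False
    # Legacy password-protected keys carry a Proc-Type header inside the block.
    return re.search(r"^Proc-Type:.*ENCRYPTED", body, re.MULTILINE) is None
```

A server certificate file that contains the MKE certificate followed by one intermediate certificate would return 2 from count_certificates.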
Mirantis offers its own image registry, Mirantis Secure Registry (MSR), which
you can use to store and manage the images that you deploy to your cluster.
This topic describes how to use MKE to push the official WordPress image to MSR
and later deploy that image to your cluster.
To create an MSR image repository:
Log in to the MKE web UI.
From the left-side navigation panel, navigate to
<user name> > Admin Settings > Mirantis Secure Registry.
In the Installed MSRs section, capture the MSR URL for your
cluster.
In a new browser tab, navigate to the MSR URL captured in the previous step.
From the left-side navigation panel, click Repositories.
Click New repository.
In the namespace field under New Repository, select
the required namespace. The default namespace is your user name.
In the name field under New Repository, enter the
name wordpress.
To create the repository, click Save.
To push an image to MSR:
In this example, you will pull the official WordPress image from Docker Hub,
tag it, and push it to MSR. Once pushed to MSR, only authorized users will
be able to make changes to the image. Pushing to MSR requires CLI access to
a licensed MSR installation.
Pull the public WordPress image from Docker Hub:
docker pull wordpress
Tag the image, using the IP address or DNS name of your MSR instance. For
example:
The Deployment object YAML specifies your MSR image in the Pod
template spec: image: <msr-url>:<port>/admin/wordpress:latest. Also,
the YAML file defines a NodePort service that exposes the WordPress
application so that it is accessible from outside the cluster.
Click Create. Creating the new Kubernetes objects will open the
Controllers page.
After a few seconds, verify that wordpress-deployment has a
green status icon and is thus successfully deployed.
When you add a node to your cluster, by default its workloads are managed
by Swarm. Changing the default orchestrator does not affect existing nodes
in the cluster. You can also change the orchestrator type for individual
nodes in the cluster.
The workloads on your cluster can be scheduled by Kubernetes, Swarm, or a
combination of the two. If you choose to run a mixed cluster, be aware that
different orchestrators are not aware of each other, and thus there is no
coordination between them.
Mirantis recommends that you decide which orchestrator you will use when
initially setting up your cluster. Once you start deploying workloads, avoid
changing the orchestrator setting. If you do change the node orchestrator,
your workloads will be evicted and you will need to deploy them again using the
new orchestrator.
Caution
When you promote a worker node to be a manager, its orchestrator type
automatically changes to Mixed. If you later demote that node to be
a worker, its orchestrator type remains as Mixed.
Note
The default behavior for Mirantis Secure Registry (MSR) nodes is to run in
the Mixed orchestration mode. If you change the MSR orchestrator type to
Swarm or Kubernetes only, reconciliation will revert the node back to the
Mixed mode.
When you change the node orchestrator, existing workloads are
evicted and they are not automatically migrated to the new orchestrator.
You must manually migrate them to the new orchestrator. For example, if you
deploy WordPress on Swarm, and you change the node orchestrator to
Kubernetes, MKE does not migrate the workload, and WordPress continues
running on Swarm. You must manually migrate your WordPress deployment to
Kubernetes.
The following table summarizes the results of changing a node
orchestrator.
Workload
Orchestrator-related change
Containers
Containers continue running on the node.
Docker service
The node is drained and tasks are rescheduled to another node.
Pods and other imperative resources
Imperative resources continue running on the node.
Deployments and other declarative resources
New declarative resources will not be scheduled on the node and
existing ones will be rescheduled at a time that can vary based on
resource details.
If a node is running containers and you change the node to Kubernetes,
the containers will continue running and Kubernetes will not be aware of
them. This is functionally the same as running the node in the Mixed mode.
Warning
The Mixed mode is not intended for production use and it may impact
the existing workloads on the node.
This is because the two orchestrator types have different views of
the node resources and they are not aware of the other orchestrator
resources. One orchestrator can schedule a workload without knowing
that the node resources are already committed to another workload
that was scheduled by the other orchestrator. When this happens, the
node can run out of memory or other resources.
Mirantis strongly recommends against using the Mixed mode in production
environments.
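The resource problem described in the warning can be illustrated with a toy model. The following Python sketch is not how Swarm or Kubernetes schedule workloads. It is a simplified assumption in which each orchestrator tracks only the memory that it committed itself.

```python
class Scheduler:
    """Toy scheduler that knows only about the workloads it placed itself."""

    def __init__(self, name, node_memory_mb):
        self.name = name
        self.node_memory_mb = node_memory_mb
        self.committed_mb = 0  # memory committed by this scheduler only

    def try_schedule(self, request_mb):
        """Accept the workload if it fits into this scheduler's view of the node."""
        if self.committed_mb + request_mb > self.node_memory_mb:
            return False
        self.committed_mb += request_mb
        return True


def total_committed(schedulers):
    """Memory committed on the node across all schedulers."""
    return sum(s.committed_mb for s in schedulers)
```

On a node with 8192 MB of memory, both schedulers would accept a 6144 MB workload, because neither sees the commitment of the other. The node is then committed to 12288 MB and can run out of memory.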
To change the node orchestrator using the MKE web UI:
Log in to the MKE web UI as an administrator.
From the left-side navigation panel, navigate to
Shared Resources > Nodes.
Click the node that you want to assign to a different orchestrator.
In the upper right, click the Edit Node icon.
In the Details pane, in the Role section under
ORCHESTRATOR TYPE, select either Swarm,
Kubernetes, or Mixed.
Warning
Mirantis strongly recommends against using the Mixed mode in
production environments.
Click Save to assign the node to the selected orchestrator.
To change the node orchestrator using the CLI:
Set the orchestrator on a node by assigning the orchestrator labels,
com.docker.ucp.orchestrator.swarm or
com.docker.ucp.orchestrator.kubernetes to true.
Change the node orchestrator. Select from the following options:
MKE administrators can filter the view of Kubernetes objects by the
namespace that the objects are assigned to, specifying a single namespace
or all available namespaces. This topic describes how to deploy services to two
newly created namespaces and then view those services, filtered by namespace.
To create two namespaces:
Log in to the MKE web UI as an administrator.
From the left-side navigation panel, click Kubernetes.
Click Create to open the Create Kubernetes Object
page.
Leave the Namespace drop-down blank.
In the Object YAML editor, paste the following YAML code:
Click Create to deploy the service in the green
namespace.
To view the newly created services:
In the left-side navigation panel, click Namespaces.
In the upper-right corner, click the Set context for all
namespaces toggle. The indicator in the left-side navigation panel under
Namespaces changes to All Namespaces.
Click Services to view your services.
Filter the view by namespace:
In the left-side navigation panel, click Namespaces.
Hover over the blue namespace and click Set Context.
The indicator in the left-side navigation panel under
Namespaces changes to blue.
Click Services to view the app-service-blue service.
Note that the app-service-green service does not display.
Perform the foregoing steps on the green namespace to view only the
services deployed in the green namespace.
MKE is designed to facilitate high availability (HA). You can join multiple
manager nodes to the cluster, so that if one manager node fails, another one
can automatically take its place without impacting the cluster.
Including multiple manager nodes in your cluster allows you to handle manager
node failures and load-balance user requests across all manager nodes.
The following table exhibits the relationship between the number of manager
nodes used and the number of faults that your cluster can tolerate:
Manager nodes    Failures tolerated
1                0
3                1
5                2
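The values in the table follow from majority quorum: the cluster stays available as long as more than half of the manager nodes are healthy. A minimal Python sketch, assuming the Raft majority rule that Swarm managers use (the function names are illustrative):

```python
def quorum(managers):
    """Smallest number of managers that forms a majority."""
    return managers // 2 + 1


def failures_tolerated(managers):
    """Number of managers that can fail while a majority remains."""
    return managers - quorum(managers)
```

Note that an even number of managers adds no fault tolerance: under this rule, four managers tolerate one failure, the same as three.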
For deployment into production environments, follow these best practices:
For HA with minimal network overhead, Mirantis recommends using three manager
nodes and a maximum of five. Adding more manager nodes than this can lead to
performance degradation, as configuration changes must be replicated across
all manager nodes.
You should bring failed manager nodes back online as soon as possible, as
each failed manager node decreases the number of failures that your cluster
can tolerate.
You should distribute your manager nodes across different availability
zones. This way your cluster can continue working even if an entire
availability zone goes down.
MKE allows you to add or remove nodes from your cluster as your needs change
over time.
Because MKE leverages the clustering functionality provided by
Mirantis Container Runtime (MCR), you use the docker swarm join
command to add more nodes to your cluster. When you join a new node, MCR
services start running on the node automatically.
You can add both Linux manager and
worker nodes to your cluster.
You can promote worker nodes to managers to make MKE fault tolerant. You can
also demote a manager node into a worker node.
Log in to the MKE web UI.
In the left-side navigation panel, navigate to
Shared Resources > Nodes and select the required node.
In the upper right, select the Edit Node icon.
In the Role section, click Manager or
Worker.
Click Save and wait until the operation completes.
Navigate to Shared Resources > Nodes and verify the new node
role.
Note
If you are load balancing user requests to MKE across multiple manager
nodes, you must remove these nodes from the load-balancing pool when
demoting them to workers.
MKE allows you to add or remove nodes from your cluster as your needs change
over time.
Because MKE leverages the clustering functionality provided by
Mirantis Container Runtime (MCR), you use the docker swarm join
command to add more nodes to your cluster. When you join a new node, MCR
services start running on the node automatically.
The following features are not yet supported using Windows Server:
Category
Feature
Networking
Encrypted networks are not supported. If you have upgraded from a
previous version of MKE, you will need to recreate an unencrypted
version of the ucp-hrm network.
Secrets
When using secrets with Windows services, Windows stores temporary
secret files on your disk. You can use BitLocker on the volume
containing the Docker root directory to encrypt the secret data at
rest.
When creating a service that uses Windows containers, the options
to specify UID, GID, and mode are not supported for secrets.
Secrets are only accessible by administrators and users with system
access within the container.
Mounts
On Windows, Docker cannot listen on a Unix socket. Use TCP or a
named pipe instead.
If the cluster is deployed in a site that is offline, sideload MKE images
onto the Windows Server nodes. For more information, refer to
Install MKE offline.
On a manager node, list the images that are required on Windows nodes:
After joining multiple manager nodes for high availability (HA), you can
configure your own load balancer to balance user requests across all
manager nodes.
Use of a load balancer allows users to access MKE using a centralized domain
name. The load balancer can detect when a manager node fails and stop
forwarding requests to that node, so that users are unaffected by the failure.
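The failover behavior described above can be sketched as a round-robin pool that skips unhealthy managers. This is a conceptual model only. The class and method names are hypothetical, and a real load balancer would determine node health through its own health checks.

```python
class ManagerPool:
    """Round-robin selection over manager nodes, skipping failed ones."""

    def __init__(self, managers):
        self.managers = list(managers)
        self.healthy = set(self.managers)
        self._next = 0  # index of the next candidate in round-robin order

    def mark(self, manager, healthy):
        """Record the result of a health check for one manager."""
        if healthy:
            self.healthy.add(manager)
        else:
            self.healthy.discard(manager)

    def pick(self):
        """Return the next healthy manager, or raise if none is left."""
        for _ in range(len(self.managers)):
            candidate = self.managers[self._next % len(self.managers)]
            self._next += 1
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy manager nodes available")
```

With three managers, requests rotate across all of them. Once one manager is marked as failed, requests alternate between the remaining two.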
By default, both MKE and Mirantis Secure Registry (MSR) use port 443. If you
plan to deploy MKE and MSR together, your load balancer must
distinguish traffic between the two by IP address or port number.
If you want MKE and MSR both to use port 443, then you must either use separate
load balancers for each or use two virtual IPs. Otherwise, you must configure
your load balancer to expose MKE or MSR on a port other than 443.
Improve network performance with Route Reflectors
MKE uses Calico as the default Kubernetes networking solution,
configured to create a BGP mesh between all nodes in the cluster.
Note
MKE deployments running on Microsoft Azure use Azure SDN for multi-host
networking, rather than Calico.
Networking performance decreases as you add more nodes to the cluster. If
your cluster has more than 100 nodes, you should reconfigure Calico to use
Route Reflectors rather than a node-to-node mesh.
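To see why a full mesh scales poorly, compare the number of BGP sessions in both topologies. The following Python sketch assumes that every non-reflector node peers with every Route Reflector and that the Route Reflectors peer with each other. The function names are illustrative.

```python
def full_mesh_sessions(nodes):
    """BGP sessions when every node peers with every other node."""
    return nodes * (nodes - 1) // 2


def route_reflector_sessions(nodes, reflectors):
    """BGP sessions when nodes peer only with the Route Reflectors.

    nodes is the total node count, including the reflector nodes.
    """
    clients = nodes - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)
```

For 100 nodes, a full mesh requires 4950 sessions, whereas two Route Reflectors reduce this to 197 under the stated assumptions.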
For production-grade systems, deploy at least two Route Reflectors, with each
running on a dedicated node that is not running any other workloads.
If Route Reflectors are running on the same node as other workloads, Swarm
ingress and NodePorts may not work in these workloads. As such, Mirantis
suggests that you taint the nodes to ensure that they do not run other
workloads.
Create a calico-rr.yaml file with the following content:
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-rr
  namespace: kube-system
  labels:
    app: calico-rr
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      k8s-app: calico-rr
  template:
    metadata:
      labels:
        k8s-app: calico-rr
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
        - key: com.docker.ucp.kubernetes.calico/route-reflector
          value: "true"
          effect: NoSchedule
      hostNetwork: true
      containers:
        - name: calico-rr
          image: calico/routereflector:v0.6.1
          env:
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            - name: IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /calico-secrets
              name: etcd-certs
          securityContext:
            privileged: true
      nodeSelector:
        com.docker.ucp.kubernetes.calico/route-reflector: "true"
      volumes:
        # Mount in the etcd TLS secrets.
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
To reconfigure Calico to use Route Reflectors rather than a node-to-node mesh,
you must direct calicoctl to the etcd key-value store that is
managed by MKE. Use the CLI with an MKE client bundle to create a shell alias
to start calicoctl using the mirantis/ucp-dsinfo
image:
To ensure that there are no instances in which pods and Route Reflectors are
running on the same node, use your MKE client bundle to manually delete any
calico-node pods that are running on nodes dedicated to Route Reflectors.
Two-factor authentication (2FA) adds an extra layer of security when logging
in to the MKE web UI. Once enabled, 2FA requires the user to submit an
additional authentication code generated on a separate mobile device along
with their user name and password at login.
MKE 2FA requires the use of a time-based one-time password (TOTP)
application installed on a mobile device to generate a time-based
authentication code for each login to the MKE web UI. Examples of such
applications include 1Password,
Authy, and
LastPass Authenticator.
To configure 2FA:
Install a TOTP application to your mobile device.
In the MKE web UI, navigate to My Profile > Security.
Toggle the Two-factor authentication control to
enabled.
Open the TOTP application and scan the offered QR code. The device will
display a six-digit code.
Enter the six-digit code in the offered field and click
Register. The TOTP application will save your MKE account.
Important
A set of recovery codes displays in the MKE web UI when two-factor
authentication is enabled. Save these codes in a safe location, as they
can be used to access the MKE web UI if for any reason the
configured mobile device becomes unavailable. Refer to
Recover 2FA for details.
Once 2FA is enabled, you will need to provide an authentication code each time
you log in to the MKE web UI. Typically, the TOTP application installed on your
mobile device generates the code and refreshes it every 30 seconds.
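For background on how such codes are derived, the following Python sketch implements the standard TOTP algorithm from RFC 6238 using only the standard library. The 30-second step, six digits, and HMAC-SHA-1 are the RFC defaults and are assumed here. This documentation does not state the exact parameters that MKE uses, and the function name is illustrative.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_base32, at_time=None, step=30, digits=6):
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA-1)."""
    key = base64.b32decode(secret_base32, casefold=True)
    now = time.time() if at_time is None else at_time
    # The moving factor is the number of whole time steps since the Unix epoch.
    counter = int(now // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte is the offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Because the code depends on the current time step, the device clock must be reasonably accurate for the generated code to match the one that the server expects.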
Access the MKE web UI with 2FA enabled:
In the MKE web UI, click Sign in. The Sign in page
will display.
Enter a valid user name and password.
Access the MKE code in the TOTP application on your mobile device.
Enter the current code in the 2FA Code field in the MKE web UI.
Note
Multiple authentication failures may indicate that the mobile device clock is not synchronized with the mobile provider.
If the mobile device with authentication codes is unavailable, you can
re-access MKE using any of the recovery codes that display in the MKE web UI
when 2FA is first enabled.
To recover 2FA:
Enter one of the recovery codes when prompted for the two-factor
authentication code upon login to the MKE web UI.
Navigate to My Profile > Security.
Disable 2FA and then re-enable it.
Open the TOTP application and scan the offered QR code. The device will
display a six-digit code.
Enter the six-digit code in the offered field and click
Register. The TOTP application will save your MKE account.
If there are no recovery codes to draw from, ask your system administrator to
disable 2FA in order to regain access to the MKE web UI. Once done, repeat the
Configure 2FA procedure to reinstate 2FA
protection.
MKE administrators are not able to re-enable 2FA for users.
When migrating manager nodes, Mirantis recommends that you replace one manager
node at a time, to preserve fault tolerance and minimize performance impact.
MKE allows administrators to authorize users to view, edit, and use cluster
resources by granting role-based permissions for specific resource sets. This
section describes how to configure all the relevant components of role-based
access control (RBAC).
Mirantis Kubernetes Engine (MKE) lets you authorize users to view, edit, and
use cluster resources by granting role-based permissions against resource sets.
To authorize access to cluster resources across your organization, MKE
administrators might take the following high-level steps:
Add and configure subjects (users, teams, and service accounts).
Define custom roles (or use defaults) by adding permitted
operations per type of resource.
Group cluster resources into resource sets of Swarm collections
or Kubernetes namespaces.
Create grants by combining subject + role + resource set.
A subject represents a user, team, organization, or a service account. A
subject can be granted a role that defines permitted operations against
one or more resource sets.
User: A person authenticated by the authentication backend. Users
can belong to one or more teams and one or more organizations.
Team: A group of users that share permissions defined at the team
level. A team can be in one organization only.
Organization: A group of teams that share a specific set of
permissions, defined by the roles of the organization.
Service account: A Kubernetes object that enables a workload to
access cluster resources which are assigned to a namespace.
Roles define what operations can be done by whom. A role is a set of
permitted operations against a type of resource, like a container or
volume, which is assigned to a user or a team with a grant.
For example, the built-in role, Restricted Control, includes
permissions to view and schedule nodes but not to update nodes. A custom
DBA role might include permissions to r-w-x (read, write, and
execute) volumes and secrets.
Most organizations use multiple roles to fine-tune the appropriate
access. A given team or user may have different roles provided to them
depending on what resource they are accessing.
To control user access, cluster resources are grouped into Docker Swarm
collections or Kubernetes namespaces.
Swarm collections: A collection has a directory-like structure
that holds Swarm resources. You can create collections in MKE by
defining a directory path and moving resources into it. Also, you can
create the path in MKE and use labels in your YAML file to assign
application resources to the path. Resource types that users can
access in a Swarm collection include containers, networks, nodes,
services, secrets, and volumes.
Kubernetes namespaces: A
namespace is a logical area for a Kubernetes cluster. Kubernetes comes with
a default namespace for your cluster objects, plus two more
namespaces for system and public resources. You can create custom
namespaces, but unlike Swarm collections, namespaces cannot be nested.
Resource types that users can access in a Kubernetes namespace include pods,
deployments, network policies, nodes, services, secrets, and many more.
Together, collections and namespaces are named resource sets.
A grant is made up of a subject, a role, and a resource set.
Grants define which users can access what resources in what way. Grants
are effectively Access Control Lists (ACLs) which provide
comprehensive access policies for an entire organization when grouped
together.
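The subject + role + resource set model can be sketched as a small data structure. The following Python code is a conceptual illustration only, not the MKE authorization engine. The role definitions are hypothetical, and the sketch assumes that a grant on a Swarm collection also covers the collections nested below it.

```python
from dataclasses import dataclass

# Hypothetical roles: resource type -> permitted operations.
ROLES = {
    "dba": {
        "volume": {"read", "write", "execute"},
        "secret": {"read", "write", "execute"},
    },
    "viewer": {"container": {"read"}},
}


@dataclass(frozen=True)
class Grant:
    subject: str       # user, team, organization, or service account
    role: str          # key into ROLES
    resource_set: str  # Swarm collection path such as "/prod"


def in_resource_set(collection, granted_collection):
    """True if collection equals the granted one or is nested below it."""
    prefix = granted_collection.rstrip("/") + "/"
    return collection == granted_collection or collection.startswith(prefix)


def is_allowed(grants, subject, operation, resource_type, collection):
    """Check whether any grant permits the operation for the subject."""
    for grant in grants:
        if grant.subject != subject:
            continue
        if not in_resource_set(collection, grant.resource_set):
            continue
        if operation in ROLES.get(grant.role, {}).get(resource_type, set()):
            return True
    return False
```

With a single grant that combines the db-team subject, the dba role, and the /prod collection, the team can write volumes in /prod/db but cannot touch containers or resources in an unrelated collection such as /production.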
Only an administrator can manage grants, subjects, roles, and access to
resources.
Note
An administrator is a user who creates subjects, groups resources by
moving them into collections or namespaces, defines roles by
selecting allowable operations, and applies grants to users and
teams.
For cluster security, only MKE admin users and service accounts that are
granted the cluster-admin ClusterRole for all Kubernetes namespaces
via a ClusterRoleBinding can deploy pods with privileged options. This
prevents a platform user from being able to bypass the Universal Control
Plane Security Model.
These privileged options include:
Pods with any of the following defined in the Pod Specification:
PodSpec.hostIPC - Prevents a user from deploying a pod in the
host’s IPC Namespace.
PodSpec.hostNetwork - Prevents a user from deploying a pod in the
host’s Network Namespace.
PodSpec.hostPID - Prevents a user from deploying a pod in the
host’s PID Namespace.
SecurityContext.allowPrivilegeEscalation - Prevents a child
process of a container from gaining more privileges than its parent.
SecurityContext.capabilities - Prevents additional Linux
Capabilities from being added to a pod.
SecurityContext.privileged - Prevents a user from deploying a
Privileged Container.
Volume.hostPath - Prevents a user from mounting a path from the
host into the container. This could be a file, a directory, or even
the Docker Socket.
Persistent Volumes using the following storage classes:
Local - Prevents a user from creating a persistent volume with
the Local Storage Class. The Local storage class allows a user to mount
directories from the host into a pod. This could be a file, a directory, or
even the Docker Socket.
Note
If an admin has created a persistent volume with the local storage
class, a non-admin could consume this via a persistent volume claim.
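The pod-level options listed above can be detected in a manifest before it is submitted. The following Python sketch scans a pod specification that has already been loaded into a dictionary. It is a hypothetical pre-flight helper, not the check that MKE performs, and it flags only the fields named in this section.

```python
def privileged_options(pod_spec):
    """Return the privileged options that a pod spec dictionary uses."""
    found = []
    # Host namespace sharing is set directly on the pod spec.
    for field in ("hostIPC", "hostNetwork", "hostPID"):
        if pod_spec.get(field):
            found.append(f"PodSpec.{field}")
    # Security context settings are inspected per container.
    for container in pod_spec.get("containers", []):
        context = container.get("securityContext") or {}
        if context.get("allowPrivilegeEscalation"):
            found.append("SecurityContext.allowPrivilegeEscalation")
        if (context.get("capabilities") or {}).get("add"):
            found.append("SecurityContext.capabilities")
        if context.get("privileged"):
            found.append("SecurityContext.privileged")
    # hostPath volumes mount a path from the host into the container.
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            found.append("Volume.hostPath")
    return found
```

An empty result means that the pod spec uses none of the listed options. Any non-empty result indicates that the deployment is likely to be rejected for a user without the cluster admin role.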
If a user without a cluster admin role tries to deploy a pod with any of
these privileged options, an error similar to the following example is
displayed:
This topic describes how to create organizations, teams, and users.
Note
Individual users can belong to multiple teams but a team can belong to
only one organization.
New users have a default permission level that you can extend by adding
the user to a team and creating grants. Alternatively, you can make the
user an administrator to extend their permission level.
All users are authenticated on the back end. MKE provides built-in
authentication and also integrates with LDAP directory services. To use
MKE built-in authentication, you must create users manually.
Log in to the MKE web UI and perform the following steps:
To create an organization:
Navigate to Access Control > Orgs & Teams > Create.
Enter an organization name and click Create.
To create a team in the organization:
Navigate to the required organization and click the plus sign in the top
right corner.
Enter a team name and description and click Create.
To create a user:
Navigate to Access Control > Users > Create.
Enter a user name, password, and the user’s full name.
Optional. Select IS A MIRANTIS KUBERNETES ENGINE ADMIN to give
the user administrator privileges.
Click Create.
To add an existing user to a team:
Navigate to the required team and click the plus sign in the top right
corner.
Select the users you want to include and click Add Users.
Choose between the following two methods for matching group members from an
LDAP directory. Refer to the table below for more information.
Select LDAP MATCH METHOD to change the method for
matching group members in the LDAP directory from
Match Search Results (default) to
Match Group Members. Fill out Group DN and
Group Member Attribute as required.
Keep the default Match Search Results method and fill out
Search Base DN, Search filter, and
Search subtree instead of just one level as required.
Optional. Select Immediately Sync Team Members to run an LDAP
sync operation immediately after saving the configuration for the team.
Click Create.
There are two methods for matching group members from an LDAP directory:
Bind method - Description
Match Group Members (direct bind) - Specifies that team members are synced
    directly with members of a group in the LDAP directory of your
    organization. The team membership is synced to match the membership of
    the group.
    Group DN - The distinguished name of the group from which you select
        users.
    Group Member Attribute - The value of this group attribute corresponds to
        the distinguished names of the members of the group.
Match Search Results (search bind) - Specifies that team members are synced
    using a search query against the LDAP directory of your organization. The
    team membership is synced to match the users in the search results.
    Search Base DN - The distinguished name of the node in the directory tree
        where the search starts looking for users.
    Search filter - Filter to find users. If empty, existing users in the
        search scope are added as members of the team.
    Search subtree instead of just one level - Searches the full LDAP tree,
        not just one level, starting at the base DN.
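The difference between the two methods can be illustrated with a small
in-memory model. This is a simplified sketch, not an LDAP client and not MKE
code: the DIRECTORY contents, the example.com DNs, and both helper functions
are hypothetical, and the search filter is modeled as a Python function rather
than as an LDAP filter string.

```python
# Hypothetical directory: DN -> attributes. Real directories are queried
# over LDAP; a dictionary is used here only to keep the sketch runnable.
DIRECTORY = {
    "cn=devs,ou=groups,dc=example,dc=com": {
        "member": ["cn=alice,ou=eng,dc=example,dc=com"],
    },
    "cn=alice,ou=eng,dc=example,dc=com": {"title": "engineer"},
    "cn=bob,ou=contractors,ou=eng,dc=example,dc=com": {"title": "engineer"},
}


def match_group_members(group_dn, member_attribute):
    """Direct bind: read member DNs from one attribute of one group entry."""
    return sorted(DIRECTORY[group_dn].get(member_attribute, []))


def match_search_results(base_dn, matches, subtree):
    """Search bind: return entries under base_dn that satisfy the filter."""
    suffix = "," + base_dn
    found = []
    for dn, attrs in DIRECTORY.items():
        if not dn.endswith(suffix):
            continue  # entry is outside the search base
        relative = dn[: -len(suffix)]
        one_level = "," not in relative  # direct child of the base DN
        if (subtree or one_level) and matches(attrs):
            found.append(dn)
    return sorted(found)


def is_engineer(attrs):
    return attrs.get("title") == "engineer"


print(match_group_members("cn=devs,ou=groups,dc=example,dc=com", "member"))
print(match_search_results("ou=eng,dc=example,dc=com", is_engineer, subtree=True))
print(match_search_results("ou=eng,dc=example,dc=com", is_engineer, subtree=False))
```

In this model, the group lookup returns only the DNs listed on the group
entry, whereas the search returns every matching entry under the base DN, and
the subtree option decides whether nested entries are included.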
You can define custom roles or use the following built-in roles:
Role - Description
None - Users have no access to Swarm or Kubernetes resources. Maps to the
    NoAccess role in UCP 2.1.x.
ViewOnly - Users can view resources but can’t create them.
RestrictedControl - Users can view and edit resources but can’t run a service
    or container in a way that affects the node where it’s running. Users
    cannot mount a node directory, exec into containers, or run containers in
    privileged mode or with additional kernel capabilities.
Scheduler - Users can view nodes (worker and manager) and schedule (not view)
    workloads on these nodes. By default, all users are granted the
    Scheduler role against the /Shared collection. (To view workloads, users
    need permissions such as ContainerView).
FullControl - Users can view and edit all granted resources. They can create
    containers without any restriction, but can’t see the containers of
    other users.
When creating custom roles to use with Swarm, the Roles page lists
all default and custom roles applicable in the organization.
You can give a role a global name, such as “Remove Images”, which might
enable the Remove and Force Remove operations for images. You
can apply a role with the same name to different resource sets.
Click Roles under Access Control.
Click Create Role.
Enter the role name on the Details page.
Click Operations. All available API operations are displayed.
Select the permitted operations per resource type.
This section describes the set of operations (calls) that can be
executed to the Swarm resources. Be aware that each permission
corresponds to a CLI command and enables the user to execute that
command.
Operation - Command - Description
Config - docker config - Manage Docker configurations. See child commands for
    specific examples.
Container - docker container - Manage Docker containers. See child commands
    for specific examples.
Container - docker container create - Create a new container. See extended
    description and examples for more information.
Container - docker create [OPTIONS] IMAGE [COMMAND] [ARG...] - Create new
    containers. See extended description and examples for more information.
Container - docker update [OPTIONS] CONTAINER [CONTAINER...] - Update
    configuration of one or more containers. Using this command can also
    prevent containers from consuming too many resources from their Docker
    host. See extended description and examples for more information.
Container - docker rm [OPTIONS] CONTAINER [CONTAINER...] - Remove one or more
    containers. See options and examples for more information.
Image - docker image COMMAND - Manage images. See child commands for specific
    examples.
Image - docker image remove - Remove one or more images. See child commands
    for examples.
Network - docker network - Manage networks. You can use child commands to
    create, inspect, list, remove, prune, connect, and disconnect networks.
Node - docker node COMMAND - Manage Swarm nodes. See child commands for
    examples.
Secret - docker secret COMMAND - Manage Docker secrets. See child commands
    for sample usage and options.
Service - docker service COMMAND - Manage services. See child commands for
    sample usage and options.
Volume - docker volume create [OPTIONS] [VOLUME] - Create a new volume that
    containers can consume and store data in. See examples for more
    information.
Volume - docker volume rm [OPTIONS] VOLUME [VOLUME...] - Remove one or more
    volumes. Users cannot remove a volume that is in use by a container. See
    related commands for more information.
MKE enables access control to cluster resources by grouping them into two types
of resource sets: Swarm collections (for Swarm workloads) and Kubernetes
namespaces (for Kubernetes workloads). Refer to Access control model for
a description of the difference between Swarm collections and Kubernetes
namespaces. Administrators use grants to combine resource sets, giving users
permission to access specific cluster resources.
Users assign resources to collections with labels. The following resource types
have editable labels, and thus you can assign them to collections: services,
nodes, secrets, and configs. For these resource types, change
com.docker.ucp.access.label to move a resource to a different collection.
Collections have generic names by default, but you can assign them meaningful
names as required (such as dev, test, and prod).
Note
The following resource types do not have editable labels and thus you cannot
assign them to collections: containers, networks, and volumes.
Groups of resources identified by a shared label are called stacks. You can
place one stack of resources in multiple collections. MKE automatically places
resources in the default collection. Users can change this using a specific
com.docker.ucp.access.label in the stack/compose file.
The system uses com.docker.ucp.collection.* to enable efficient
resource lookup. You do not need to manage these labels, as MKE controls them
automatically. Nodes have the following labels set to true by
default:
Each user has a default collection, which can be changed in the MKE
preferences.
To deploy resources, they must belong to a collection. When a user deploys
a resource without using an access label to specify its collection, MKE
automatically places the resource in the default collection.
Default collections are useful for the following types of users:
Users who work only on a well-defined portion of the system
Users who deploy stacks but do not want to edit the contents of their
compose files
Custom collections are appropriate for users with more complex roles in the
system, such as administrators.
Note
For those using Docker Compose, the system applies default collection labels
across all resources in the stack unless you explicitly set
com.docker.ucp.access.label.
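The placement rule for default collections can be summarized in a short
sketch. This is an illustration of the rule as described here, not MKE code:
the target_collection function and the /Shared/dev and /prod collection paths
are hypothetical, and only the com.docker.ucp.access.label key comes from this
documentation.

```python
ACCESS_LABEL = "com.docker.ucp.access.label"


def target_collection(labels: dict, default_collection: str) -> str:
    """Return the access label if it is set, else the default collection."""
    return labels.get(ACCESS_LABEL) or default_collection


# No access label: the resource lands in the user's default collection.
print(target_collection({}, "/Shared/dev"))

# Explicit access label: the label overrides the default collection.
print(target_collection({ACCESS_LABEL: "/prod"}, "/Shared/dev"))
```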
MKE administrators create grants to control how users and organizations access
resource sets. A grant defines user permissions to access resources. Each grant
associates one subject with one role and one resource set. For example, you can
grant the Prod Team the Restricted Control role over services in the
/Production collection.
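The structure of a grant can be sketched as a small data model. The following
Python example is a minimal illustration, not the MKE implementation: the
Grant class and the covers and roles_for helpers are hypothetical, and the
sketch assumes that a grant on a collection also applies to the collections
nested under it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One grant: exactly one subject, one role, and one resource set."""
    subject: str       # for example, a team name
    role: str          # for example, a built-in role name
    resource_set: str  # a Swarm collection path or a Kubernetes namespace


def covers(resource_set: str, collection: str) -> bool:
    """Return True if collection is resource_set itself or nested under it."""
    base = resource_set.rstrip("/")
    return collection == resource_set or collection.startswith(base + "/")


def roles_for(grants, subject: str, collection: str) -> set:
    """Collect the roles that the subject holds on the given collection."""
    return {
        g.role
        for g in grants
        if g.subject == subject and covers(g.resource_set, collection)
    }


grants = [Grant("Prod Team", "Restricted Control", "/Production")]
print(roles_for(grants, "Prod Team", "/Production/web"))
print(roles_for(grants, "Dev Team", "/Production"))
```

Because each grant holds exactly one subject, one role, and one resource set,
giving a team a second role or a second collection means creating a second
grant rather than editing the first one.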
The following is a common workflow for creating grants:
Create and configure subjects (organizations, teams, and users).
Define custom roles (or use defaults) by adding
permitted API operations per type of resource.
Create grants by combining subject, role, and resource set.
Note
This section assumes that you have created the relevant objects for the
grant, including the subject, role, and resource set (Kubernetes namespace
or Swarm collection).
To create a Kubernetes grant:
Log in to the MKE web UI.
Navigate to Access Control > Grants.
Select the Kubernetes tab and click
Create Role Binding.
Under Subject, select Users,
Organizations, or Service Account.
For Users, select the user from the pull-down menu.
For Organizations, select the organization and, optionally,
the team from the pull-down menu.
For Service Account, select the namespace and service account
from the pull-down menu.
Click Next to save your selections.
Under Resource Set, toggle the switch labeled Apply
Role Binding to all namespaces (Cluster Role Binding).
Click Next.
Under Role, select a cluster role.
Click Create.
To create a Swarm grant:
Log in to the MKE web UI.
Navigate to Access Control > Grants.
Select the Swarm tab and click Create Grant.
Under Subject, select Users or
Organizations.
For Users, select a user from the pull-down menu.
For Organizations, select the organization and, optionally,
the team from the pull-down menu.
Click Next to save your selections.
Under Resource Set, click View Children until the
required collection displays.
Click Select Collection next to the required collection.
Click Next.
Under Role, select a role type from the drop-down menu.
Click Create.
Note
MKE places new users in the docker-datacenter organization by default.
To apply permissions to all MKE users, create a grant with the
docker-datacenter organization as a subject.
By default, only administrators can pull images into a cluster managed by
MKE. This topic describes how to give non-administrator users permission
to pull images.
Images are always in the swarm collection, as they are a shared resource.
Grant users the ImageCreate permission for the Swarm collection to
allow them to pull images.
To grant a user permission to pull images:
Log in to the MKE web UI as an administrator.
Navigate to Access Control > Roles.
Select the Swarm tab and click Create.
On the Details tab, enter Pull images for the role name.
On the Operations tab, select Image Create from the
IMAGE OPERATIONS drop-down.
Click Create.
Navigate to Access Control > Grants.
Select the Swarm tab and click Create Grant.
Under Subject, click Users and select the required
user from the drop-down.
Click Next.
Under Resource Set, select the Swarm collection and
click Next.
Under Role, select Pull images from the drop-down.
This topic describes how to reset passwords for users and administrators.
To change a user password in MKE:
Log in to the MKE web UI with administrator credentials.
Click Access Control > Users.
Select the user whose password you want to change.
Click the gear icon in the top right corner.
Select Security from the left navigation.
Enter the new password, confirm that it is correct, and click
Update Password.
Note
For users managed with an LDAP service, you must change user passwords on
the LDAP server.
To change an administrator password in MKE:
SSH to an MKE manager node and run:
docker run --net=host -v ucp-auth-api-certs:/tls -it \
  "$(docker inspect --format \
  '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' \
  ucp-auth-api)" \
  "$(docker inspect --format \
  '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' \
  ucp-auth-api)" \
  passwd -i
Optional. If you have DEBUG set as your global log level within MKE,
running $(docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}')
returns --debug instead of --db-addr.
Pass Args 1 to docker inspect instead to reset your administrator
password:
docker run --net=host -v ucp-auth-api-certs:/tls -it \
  "$(docker inspect --format \
  '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' \
  ucp-auth-api)" \
  "$(docker inspect --format \
  '{{ index .Spec.TaskTemplate.ContainerSpec.Args 1 }}' \
  ucp-auth-api)" \
  passwd -i
Note
Alternatively, ask another administrator to change your password.
This topic describes how to grant two teams access to separate volumes in two
different resource collections such that neither team can see the volumes of
the other team. MKE allows you to do this even if the volumes are on the same
nodes.
To create two teams:
Log in to the MKE web UI.
Navigate to Orgs & Teams.
Create two teams in the engineering
organization named Dev and Prod.
To create grants for controlling access to the new volumes:
Create a grant for the Dev team to access
the dev-volumes collection with the
Restricted Control built-in role.
Create a grant for the Prod team to
access the prod-volumes collection with the
Restricted Control built-in role.
To create a volume as a team member:
Log in as one of the users on the Dev team.
Navigate to Swarm > Volumes and click Create.
On the Details tab, name the new volume dev-data.
On the Collection tab, navigate to the dev-volumes
collection and click Create.
Log in as one of the users on the Prod team.
Navigate to Swarm > Volumes and click Create.
On the Details tab, name the new volume prod-data.
On the Collection tab, navigate to the prod-volumes
collection and click Create.
As a result, the user on the Prod team cannot see the
Dev team volumes, and the user on the Dev team cannot
see the Prod team volumes. MKE administrators can see all of the
volumes created by either team.
You can use MKE to physically isolate resources by organizing nodes into
collections and granting Scheduler access for different users. Control
access to nodes by moving them to dedicated collections where you can grant
access to specific users, teams, and organizations.
The following tutorials explain how to isolate nodes using Swarm and
Kubernetes.
This tutorial explains how to give a team access to a node collection and a
resource collection. MKE access control ensures that team members cannot view
or use Swarm resources that are not in their collection.
Note
You need an MKE license and at least two worker nodes to complete this
tutorial.
The following is a high-level overview of the steps you will take to isolate
cluster nodes:
Create an Ops team and assign a user to it.
Create a Prod collection for the team node.
Assign a worker node to the Prod collection.
Grant the Ops team access to its collection.
To create a team:
Log in to the MKE web UI.
Create a team named Ops in your organization.
Add a user to the team who is not an administrator.
To create the team collections:
In this example, the Ops team uses a collection for its assigned
nodes and another for its resources.
The Prod collection is for the worker nodes and the
Webserver sub-collection is for an application that you will deploy
on the corresponding worker nodes.
To move a worker node to a different collection:
Note
MKE places worker nodes in the Shared collection by default, and
it places those running MSR in the System collection.
Navigate to Shared Resources > Nodes to view all of the nodes in
the swarm.
Find a node located in the Shared collection. You cannot move
worker nodes that are assigned to the System collection.
Click the gear icon on the node details page.
In the Labels section on the Details tab, change
com.docker.ucp.access.label from /Shared to /Prod.
Click Save to move the node to the Prod collection.
To create two grants for team access to the two collections:
Create a grant for the Ops team to access
the Webserver collection with the built-in
Restricted Control role.
Create a grant for the Ops team to access
the Prod collection with the built-in Scheduler
role.
The cluster is now set up for node isolation. Users with access to nodes in the
Prod collection can deploy Swarm services and Kubernetes apps. They
cannot, however, schedule workloads on nodes that are not in the collection.
To deploy a Swarm service as a team member:
When a user deploys a Swarm service, MKE assigns its resources to the
default collection. As a user on the Ops team, set
Webserver to be your default collection.
Note
From the resource target collection, MKE walks up the ancestor
collections until it finds the highest ancestor that the user has
Scheduler access to. MKE schedules tasks on any nodes in the
tree below this ancestor. In this example, MKE assigns the user service to
the Webserver collection and schedules tasks on nodes in the
Prod collection.
Log in as a user on the Ops team.
Navigate to Shared Resources > Collections.
Navigate to the Webserver collection.
Under the vertical ellipsis menu, select Set to default.
Navigate to Swarm > Services and click Create to
create a Swarm service.
Name the service NGINX, enter nginx:latest in the Image*
field, and click Create.
Click the NGINX service when it turns green.
Scroll down to TASKS, click the NGINX container,
and confirm that it is in the Webserver collection.
Navigate to the Metrics tab on the container page, select the
node, and confirm that it is in the Prod collection.
Note
An alternative approach is to use a grant instead of changing the
default collection. An administrator can create a grant for a role that has
the Service Create permission for the Webserver
collection or a child collection. In this case, the user sets the value of
com.docker.ucp.access.label to the new collection or one of its
children that has a Service Create grant for the required user.
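The scheduling behavior described in the note on ancestor collections earlier
in this procedure can be sketched as follows. This is an illustrative model,
not MKE code: the function names and the worker-1 and worker-2 node names are
hypothetical, and the sketch assumes that Webserver is nested under Prod as
/Prod/Webserver.

```python
def ancestors(collection: str):
    """Yield the collection and each of its ancestors, nearest first."""
    parts = collection.strip("/").split("/")
    for end in range(len(parts), 0, -1):
        yield "/" + "/".join(parts[:end])


def scheduling_root(target: str, scheduler_collections: set):
    """Return the highest ancestor of target with Scheduler access, or None."""
    root = None
    for candidate in ancestors(target):
        if candidate in scheduler_collections:
            root = candidate  # keep walking up: a higher match replaces this
    return root


def eligible_nodes(root, node_collections: dict) -> list:
    """List the nodes located in the root collection or anywhere below it."""
    if root is None:
        return []
    prefix = root.rstrip("/") + "/"
    return sorted(
        node
        for node, collection in node_collections.items()
        if collection == root or collection.startswith(prefix)
    )


nodes = {"worker-1": "/Prod", "worker-2": "/Shared"}
root = scheduling_root("/Prod/Webserver", {"/Prod"})
print(root)
print(eligible_nodes(root, nodes))
```

In this model, the service belongs to /Prod/Webserver, the walk stops at
/Prod because that is the highest collection with Scheduler access, and only
the node in /Prod is eligible to run tasks.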
This topic describes how to use a Kubernetes namespace to deploy a Kubernetes
workload to worker nodes using the MKE web UI.
MKE uses the scheduler.alpha.kubernetes.io/node-selector annotation key to
assign node selectors to namespaces. Assigning the name of the node selector
to this annotation pins all applications deployed in the namespace to the nodes
that have the given node selector specified.
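As an illustration of the annotation, the following Python sketch builds a
namespace definition and prints it as JSON, which Kubernetes tooling generally
accepts alongside YAML. The namespace name ops-nodes and the selector value
zone=example-zone are hypothetical placeholders; only the annotation key comes
from this documentation, and the key=value form of the selector is an
assumption about how your nodes are labeled.

```python
import json

NODE_SELECTOR_ANNOTATION = "scheduler.alpha.kubernetes.io/node-selector"


def namespace_manifest(name: str, node_selector: str) -> dict:
    """Build a Namespace object that carries the node-selector annotation."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "annotations": {NODE_SELECTOR_ANNOTATION: node_selector},
        },
    }


# Hypothetical values: replace them with your namespace name and node label.
manifest = namespace_manifest("ops-nodes", "zone=example-zone")
print(json.dumps(manifest, indent=2))
```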
To isolate cluster nodes with Kubernetes:
Create a Kubernetes namespace.
Note
You can also associate nodes with a namespace by providing the namespace
definition information in a configuration file.
Log in to the MKE web UI as an administrator.
In the left-side navigation panel, navigate to Kubernetes and
click Create to open the Create Kubernetes Object
page.
Collections and grants are strong tools that can be used to control access and
visibility to resources in MKE.
This tutorial describes a fictitious company named OrcaBank that needs to
configure an architecture in MKE with role-based access control (RBAC) for
their application engineering group.
OrcaBank reorganized their application teams by product with each team
providing shared services as necessary. Developers at OrcaBank do their
own DevOps and deploy and manage the lifecycle of their applications.
OrcaBank has four teams with the following resource needs:
security should have view-only access to all applications in the
cluster.
db should have full access to all database applications and
resources.
mobile should have full access to their mobile applications and
limited access to shared db services.
payments should have full access to their payments applications
and limited access to shared db services.
To assign the proper access, OrcaBank is employing a combination of
default and custom roles:
ViewOnly (default role) allows users to see all resources (but
not edit or use).
Ops (custom role) allows users to perform all operations against
configs, containers, images, networks, nodes, secrets, services, and
volumes.
View & Use Networks + Secrets (custom role) enables users to
view/connect to networks and view/use secrets used by db
containers, but prevents them from seeing or impacting the db
applications themselves.
OrcaBank is also creating collections of resources to mirror their team
structure.
Currently, all OrcaBank applications share the same physical resources,
so all nodes and applications are being configured in collections that
nest under the built-in collection, /Shared.
Other collections are also being created to enable shared db
applications.
/Shared/mobile hosts all Mobile applications and resources.
/Shared/payments hosts all Payments applications and resources.
/Shared/db is a top-level collection for all db resources.
/Shared/db/payments is a collection of db resources for
Payments applications.
/Shared/db/mobile is a collection of db resources for Mobile
applications.
The collection architecture has the following tree representation:
/
├── System
└── Shared
├── mobile
├── payments
└── db
├── mobile
└── payments
OrcaBank’s grant composition ensures that their collection architecture gives
the db team access to all db resources and restricts app teams to
shared db resources.
OrcaBank has standardized on LDAP for centralized authentication to help
their identity team scale across all the platforms they manage.
To implement LDAP authentication in MKE, OrcaBank is using MKE’s native
LDAP/AD integration to map LDAP groups directly to MKE teams. Users can
be added to or removed from MKE teams via LDAP which can be managed
centrally by OrcaBank’s identity team.
The following grant composition shows how LDAP groups are mapped to MKE
teams.
OrcaBank is taking advantage of the flexibility in MKE’s grant model by
applying two grants to each application team. One grant allows each team
to fully manage the apps in their own collection, and the second grant
gives them the (limited) access they need to networks and secrets within
the db collection.
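The two-grants-per-team pattern can be sketched as data. The following Python
example is one possible reading of the OrcaBank design, not a definitive
configuration: the exact pairing of teams, roles, and collections in GRANTS is
inferred from the team needs and collections listed above, the roles_on helper
is hypothetical, and the sketch assumes that a grant on a collection also
applies to its child collections.

```python
# (team, role, collection): an assumed mapping based on the stated team needs.
GRANTS = [
    ("security", "ViewOnly", "/Shared"),
    ("db", "Ops", "/Shared/db"),
    ("mobile", "Ops", "/Shared/mobile"),
    ("mobile", "View & Use Networks + Secrets", "/Shared/db/mobile"),
    ("payments", "Ops", "/Shared/payments"),
    ("payments", "View & Use Networks + Secrets", "/Shared/db/payments"),
]


def roles_on(team: str, collection: str) -> list:
    """List the roles a team holds on a collection, including inherited ones."""
    return sorted(
        role
        for subject, role, granted in GRANTS
        if subject == team
        and (collection == granted or collection.startswith(granted + "/"))
    )


for team in ("security", "db", "mobile", "payments"):
    print(team, roles_on(team, "/Shared/db/mobile"))
```

Under these assumptions, the mobile team holds only the limited role on
/Shared/db/mobile, the db team holds Ops there through its grant on
/Shared/db, and the payments team holds no role on that collection.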
OrcaBank’s resulting access architecture shows applications connecting
across collection boundaries. By assigning multiple grants per team, the
Mobile and Payments applications teams can connect to dedicated Database
resources through a secure and controlled interface, leveraging Database
networks and secrets.
Note
In MKE, all resources are deployed across the same
group of MKE worker nodes. Node segmentation is provided in Docker
Enterprise.
The db team is responsible for deploying and managing the full
lifecycle of the databases used by the application teams. They can
execute the full set of operations against all database resources.
In the first tutorial, the fictional company, OrcaBank, designed an
architecture with role-based access control (RBAC) to meet their
organization’s security needs. They assigned multiple grants to
fine-tune access to resources across collection boundaries on a single
platform.
In this tutorial, OrcaBank implements new and more stringent security
requirements for production applications:
First, OrcaBank adds a staging zone to their deployment model. They
will no longer move developed applications directly into production.
Instead, they will deploy apps from their dev cluster to staging for
testing, and then to production.
Second, production applications are no longer permitted to share any
physical infrastructure with non-production infrastructure. OrcaBank
segments the scheduling and access of applications with Node Access
Control.
Note
Node Access Control is a feature of MKE and provides secure
multi-tenancy with node-based isolation. Nodes can be placed in different
collections so that resources can be scheduled and isolated on disparate
physical or virtual hardware resources.
OrcaBank still has three application teams, payments, mobile,
and db with varying levels of segmentation between them.
Their RBAC redesign is going to organize their MKE cluster into two
top-level collections, staging and production, which are completely
separate security zones on separate physical infrastructure.
OrcaBank’s four teams now have different needs in production and
staging:
security should have view-only access to all applications in
production (but not staging).
db should have full access to all database applications and
resources in production (but not staging).
mobile should have full access to their Mobile applications in
both production and staging and limited access to shared db
services.
payments should have full access to their Payments applications
in both production and staging and limited access to shared db
services.
OrcaBank has decided to replace their custom Ops role with the
built-in FullControl role.
ViewOnly (default role) allows users to see but not edit all
cluster resources.
FullControl (default role) allows users complete control of all
collections granted to them. They can also create containers without
restriction but cannot see the containers of other users.
View & Use Networks + Secrets (custom role) enables users to
view/connect to networks and view/use secrets used by db
containers, but prevents them from seeing or impacting the db
applications themselves.
OrcaBank must now diversify their grants further to ensure the proper
division of access.
The payments and mobile application teams will have three grants
each: one for deploying to production, one for deploying to staging, and
the same grant to access shared db networks and secrets.
The resulting access architecture, designed with MKE,
provides physical segmentation between production and staging using node
access control.
Applications are scheduled only on MKE worker nodes in the dedicated
application collection. Applications also use shared resources across
collection boundaries to access the databases in the /prod/db
collection.
The OrcaBank db team is responsible for deploying and managing the
full lifecycle of the databases that are in production. They can execute the
full set of operations against all database resources.
The mobile team is responsible for deploying their full application
stack in staging. In production they deploy their own applications but
use the databases that are provided by the db team.
Prior to upgrading MKE, review the MKE release notes
for information that may be relevant to the upgrade process.
In line with your MKE upgrade, you should plan to upgrade the Mirantis
Container Runtime (MCR) instance on each cluster node to version 19.03.08 or
later. Mirantis recommends that you schedule the upgrade for non-business hours
to ensure minimal user impact.
Do not make changes to your MKE configuration while upgrading, as doing so can
cause misconfigurations that are difficult to troubleshoot.
MKE uses semantic versioning. While downgrades are not supported, Mirantis
supports upgrades according to the following rules:
When you upgrade from one patch version to another, you can skip patch
versions as no data migration takes place between patch versions.
When you upgrade between minor releases, you cannot skip releases. You can,
however, upgrade from any patch version from the previous minor release to
any patch version of the subsequent minor release.
When you upgrade between major releases, you cannot skip releases.
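The version rules above can be expressed as a small check. The following
Python sketch is illustrative only: the function names are hypothetical,
versions are assumed to be plain MAJOR.MINOR.PATCH strings, and the sketch
does not model which minor releases are valid endpoints for an upgrade between
major releases, because the rules above do not specify that.

```python
def parse(version: str) -> tuple:
    """Split a MAJOR.MINOR.PATCH string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_supported_upgrade(source: str, target: str) -> bool:
    """Apply the upgrade rules: no downgrades, no skipped minor or major."""
    src, dst = parse(source), parse(target)
    if dst <= src:
        return False  # a downgrade, or the same version
    if dst[0] == src[0]:
        # Same major release: patch versions can be skipped freely, but the
        # minor release can advance by at most one.
        return dst[1] - src[1] in (0, 1)
    # Different major release: only the next major release is allowed.
    return dst[0] - src[0] == 1


print(is_supported_upgrade("3.3.1", "3.3.9"))  # patch to patch, skipping
print(is_supported_upgrade("3.3.1", "3.4.4"))  # next minor, any patch
print(is_supported_upgrade("3.2.5", "3.4.0"))  # skips a minor release
print(is_supported_upgrade("3.4.4", "3.4.2"))  # downgrade
```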
Warning
Upgrading from one MKE minor version to another minor version can result in
a downgrading of MKE middleware components. For more information, refer to
the component listings in the release notes of both the source and target
MKE versions.
Azure installations have additional prerequisites. Refer to
Install MKE on Azure for more information.
To perform storage verifications:
Verify that no more than 70% of /var/ storage is used. If more than 70%
is used, allocate enough storage to meet this requirement.
Verify whether any node local file systems have disk storage issues,
including MSR back-end storage, for example, NFS.
Verify that you are using Overlay2 storage drivers, as they are more stable.
If you are not, you should transition to Overlay2 at this time.
Transitioning from device mapper to Overlay2 is a destructive rebuild.
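The 70% storage threshold from the first verification step can be checked with
a short script. This is a minimal sketch, not a Mirantis tool: the
within_limit and check_path helpers are hypothetical, and the value that
Python reports can differ slightly from what df shows because the two account
for reserved blocks differently.

```python
import shutil


def within_limit(used: int, total: int, limit: float = 0.70) -> bool:
    """Return True if used/total does not exceed the limit (70% by default)."""
    return used / total <= limit


def check_path(path: str = "/var", limit: float = 0.70) -> bool:
    """Check the file system that holds path against the limit."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return within_limit(usage.used, usage.total, limit)


# Fixed numbers keep the example reproducible on any machine.
print(within_limit(used=65, total=100))
print(within_limit(used=85, total=100))
```

On a cluster node, calling check_path("/var") applies the same comparison to
the file system that holds /var.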
To perform operating system verifications:
Patch all relevant packages to the most recent cluster node operating
system version, including the kernel.
Perform a rolling restart of each node to confirm that the in-memory settings
are the same as those in the startup scripts.
After performing the rolling restarts, run check-config.sh on each cluster
node to check for kernel compatibility issues.
To perform procedural verifications:
Perform Swarm, MKE, and MSR backups.
Gather Compose, service, and stack files.
Generate an MKE support bundle for this specific point in time.
Preinstall MKE, MSR, and MCR images. If your cluster does not have an
Internet connection, Mirantis provides tarballs containing all the
required container images. If your cluster does have an Internet connection,
pull the required container images onto your nodes:
Load troubleshooting packages, for example, netshoot.
To upgrade MCR:
The MKE upgrade requires MCR 19.03.08 or later to be running on every cluster
node. If it is not, perform the following steps first on manager and then on
worker nodes:
Log in to the node using SSH.
Upgrade MCR to version 19.03.08 or later.
Using the MKE web UI, verify that the node is in a healthy state:
Log in to the MKE web UI.
Navigate to Shared Resources > Nodes.
Verify that the node is healthy and a part of the cluster.
Caution
Mirantis recommends upgrading in the following order: MCR, MKE, MSR. This
topic is limited to the upgrade instructions for MKE.
To perform cluster verifications:
Verify that your cluster is in a healthy state, as it will be easier to
troubleshoot should a problem occur.
Create a backup of your cluster, thus allowing you to recover should
something go wrong during the upgrade process.
Note
You cannot use the backup archive during the upgrade process, as it is
version specific. For example, if you create a backup archive for
an MKE 3.4.2 cluster, you cannot use the archive file after you upgrade
to MKE 3.4.4.
To upgrade MKE on machines that are not connected to the Internet, refer to
Install MKE offline to learn how to download the MKE package for offline
installation.
In all three methods, manager nodes are automatically upgraded in place. You
cannot control the order of manager node upgrades. For each worker node that
requires an upgrade, you can upgrade that node in place or you can replace the
node with a new worker node. The type of upgrade you perform depends on what is
needed for each node.
Consult the following table to determine which method is right for you:
Automatically upgrades manager nodes and allows you to control the
upgrade order of worker nodes. This type of upgrade is more advanced
than the automated in-place cluster upgrade.
This type of upgrade allows you to stand up a new cluster in parallel to
the current one and switch over when the upgrade is complete. It
requires that you join new worker nodes, schedule workloads to run on
them, pause, drain, and remove old worker nodes in batches (rather than
one at a time), and shut down servers to remove worker nodes. This is
the most advanced upgrade method.
This is the standard method of upgrading MKE. It updates all MKE components on
all nodes within the MKE cluster one-by-one until the upgrade is complete, and
is thus not ideal for those needing to upgrade their worker nodes in a
particular order.
Verify that all MCR instances have been upgraded to the
corresponding new version.
SSH into one MKE manager node and run the following command (do not run this
command on a workstation with a client bundle):
This method allows granular control of the MKE upgrade process by first
upgrading a manager node and then allowing you to upgrade worker nodes manually
in the order that you select. This allows you to migrate workloads and control
traffic while upgrading. You can temporarily run MKE worker nodes with
different versions of MKE and MCR.
This method allows you to handle failover by adding additional worker node
capacity during an upgrade. You can add worker nodes to a partially-upgraded
cluster, migrate workloads, and finish upgrading the remaining worker nodes.
Verify that all MCR instances have been upgraded to the
corresponding new version.
SSH into one MKE manager node and run the following command (do not run this
command on a workstation with a client bundle):
The --manual-worker-upgrade flag allows MKE to upgrade only the manager
nodes. It adds an upgrade-hold label to all worker nodes, which prevents
MKE from upgrading each worker node until you remove the label.
Optional. Join additional worker nodes to your cluster:
Replace existing worker nodes using blue-green deployment
This method creates a parallel environment for a new deployment, which reduces
downtime, upgrades worker nodes without disrupting workloads, and allows you
to migrate traffic to the new environment with worker node rollback capability.
Note
You do not have to replace all worker nodes in the cluster at one time, but
can instead replace them in groups.
Verify that all MCR instances have been upgraded to the
corresponding new version.
SSH into one MKE manager node and run the following command (do not run this
command on a workstation with a client bundle):
The --manual-worker-upgrade flag allows MKE to upgrade only the manager
nodes. It adds an upgrade-hold label to all worker nodes, which prevents
MKE from upgrading each worker node until the label is removed.
This topic describes common problems and errors that occur during the upgrade
process and how to identify and resolve them.
To check for multiple conflicting upgrades:
The upgrade command automatically checks for multiple ucp-worker-agents,
the existence of which can indicate that the cluster is still undergoing a
prior manual upgrade. You must resolve the conflicting node labels before
proceeding with the upgrade.
To resolve upgrade failures:
You can resolve upgrade failures on worker nodes by changing the node labels
back to the previous version, but this is not supported on manager nodes.
To check Kubernetes errors:
For more information on anything that might have gone wrong during the upgrade
process, check Kubernetes errors in node state messages after the upgrade is
complete.
This topic describes how to use both the MKE web UI and the CLI to deploy a
multi-service application for voting on whether you prefer cats or dogs.
To deploy a multi-service application using the MKE web UI:
Log in to the MKE web UI.
Navigate to Shared Resources > Stacks and click
Create Stack.
In the Name field, enter voting-app.
Under ORCHESTRATOR MODE, select Swarm Services and
click Next.
In the Add Application File editor, paste the following
application definition written in the docker-compose.yml format:
version: "3"
services:

  # A Redis key-value store to serve as message queue
  redis:
    image: redis:alpine
    ports:
      - "6379"
    networks:
      - frontend

  # A PostgreSQL database for persistent storage
  db:
    image: postgres:9.4
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend

  # Web UI for voting
  vote:
    image: dockersamples/examplevotingapp_vote:before
    ports:
      - 5000:80
    networks:
      - frontend
    depends_on:
      - redis

  # Web UI to count voting results
  result:
    image: dockersamples/examplevotingapp_result:before
    ports:
      - 5001:80
    networks:
      - backend
    depends_on:
      - db

  # Worker service to read from message queue
  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend

networks:
  frontend:
  backend:

volumes:
  db-data:
Click Create to deploy the stack.
In the list on the Shared Resources > Stacks page, verify that
the application is deployed by looking for voting-app. If the
application is in the list, it is deployed.
To view the individual application services, click voting-app
and navigate to the Services tab.
Cast votes by accessing the service on port 5000.
Caution
MKE does not support referencing external files when using the MKE web UI
to deploy applications, and thus does not support the following keywords:
build
dockerfile
env_file
You must use a version control system to store the stack definition used
to deploy the stack, as MKE does not store the stack definition.
To deploy a multi-service application using the MKE CLI:
Create a file named docker-stack.yml with the following
content:
version: "3"
services:

  # A Redis key-value store to serve as message queue
  redis:
    image: redis:alpine
    ports:
      - "6379"
    networks:
      - frontend

  # A PostgreSQL database for persistent storage
  db:
    image: postgres:9.4
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend

  # Web UI for voting
  vote:
    image: dockersamples/examplevotingapp_vote:before
    ports:
      - 5000:80
    networks:
      - frontend
    depends_on:
      - redis

  # Web UI to count voting results
  result:
    image: dockersamples/examplevotingapp_result:before
    ports:
      - 5001:80
    networks:
      - backend
    depends_on:
      - db

  # Worker service to read from message queue
  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend

networks:
  frontend:
  backend:

volumes:
  db-data:
This topic describes how to use both the CLI and a Compose file to deploy
application resources to a particular Swarm collection. Attach the Swarm
collection path to the service access label to assign the service to the
required collection. MKE automatically assigns new services to the default
collection unless you use either of the methods presented here to assign a
different Swarm collection.
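As a minimal sketch of the Compose file approach, the fragment below attaches a collection path to the service access label. The /Shared/wordpress collection and the wordpress service name come from this topic; the image name and file version are illustrative assumptions, and a real stack would normally define more than one service.

```yaml
version: "3.1"

services:
  wordpress:
    image: wordpress                  # illustrative image, not prescribed by this topic
    deploy:
      labels:
        # Service-level access label: assigns the service to the collection.
        # The collection must already exist and you need a grant for it.
        com.docker.ucp.access.label: /Shared/wordpress
```

Labels placed under deploy are applied to the Swarm service itself rather than to its containers, which is where the access label needs to be for collection assignment.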
Navigate to Shared Resources > Stacks and click
Create Stack.
Name the application wordpress.
Under ORCHESTRATOR MODE, select Swarm Services and
click Next.
In the Add Application File editor, paste the Compose file.
Click Create to deploy the application.
Click Done when the deployment completes.
Note
MKE reports an error if the /Shared/wordpress collection does not
exist or if you do not have a grant for accessing it.
To confirm that the service deployed to the correct Swarm collection:
Navigate to Shared Resources > Stacks and select your
application.
Navigate to the Services tab and select the required
service.
On the details page, verify that the service is assigned to the correct
Swarm collection.
Note
MKE creates a default overlay network for your stack that attaches to
each container you deploy. This works well for administrators and those
assigned full control roles. If you have lesser permissions, define a custom
network with the same com.docker.ucp.access.label label as your services
and attach this network to each service. This correctly groups your network
with the other resources in your stack.
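The custom network described in this note could look like the following Compose sketch. The network name wp, the image, and the /Shared/wordpress collection path are illustrative assumptions; the point is only that the network carries the same access label as the service and that the service attaches to it explicitly.

```yaml
version: "3.1"

services:
  wordpress:
    image: wordpress                  # illustrative image
    networks:
      - wp                            # attach the custom network to every service
    deploy:
      labels:
        com.docker.ucp.access.label: /Shared/wordpress

networks:
  wp:                                 # hypothetical network name
    driver: overlay
    labels:
      # Same access label as the services, so the network is grouped
      # with the other resources in the stack.
      com.docker.ucp.access.label: /Shared/wordpress
```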
This topic describes how to create and use secrets with MKE by showing you
how to deploy a WordPress application that uses a secret for storing a
plaintext password. Other sensitive information you might use a secret to store
includes TLS certificates and private keys. MKE allows you to securely store
secrets and configure who can access and manage them using role-based access
control (RBAC).
The application you will create in this topic includes the following two
services:
wordpress
Apache, PHP, and WordPress
wordpress-db
MySQL database
The following example stores a password in a secret, and the secret is stored
in a file inside the container that runs the services you will deploy. The
services have access to the file, but no one else can see the plaintext
password. To make things simple, you will not configure the database to persist
data, and thus when the service stops, the data is lost.
To create a secret:
Log in to the MKE web UI.
Navigate to Swarm > Secrets and click Create.
Note
After you create the secret, you will not be able to edit or see the
secret again.
Name the secret wordpress-password-v1.
In the Content field, assign a value to the secret.
Optional. Define a permission label so that other users can be given
permission to use this secret.
Note
To use services and secrets together, they must either have the same
permission label or no label at all.
To create a network for your services:
Navigate to Swarm > Networks and click Create.
Create a network called wordpress-network with the default settings.
To create the MySQL service:
Navigate to Swarm > Services and click
Create.
Under Service Details, name the service wordpress-db.
Under Task Template, enter mysql:5.7.
In the left-side menu, navigate to Network, click
Attach Network +, and select wordpress-network from
the drop-down.
In the left-side menu, navigate to Environment, click
Use Secret +, and select wordpress-password-v1 from
the drop-down.
Click Confirm to associate the secret with the
service.
Scroll down to Environment variables and click
Add Environment Variable +.
Enter the following string to create an environment variable that contains
the path to the password file in the container:
MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v1
If you specified a permission label on the secret, you must set the
same permission label on this service.
Click Create to deploy the MySQL service.
This creates a MySQL service that is attached to the wordpress-network
network and that uses the wordpress-password-v1 secret. By default, this
creates a file with the same name in /run/secrets/<secret-name> inside the
container running the service.
This procedure also sets the MYSQL_ROOT_PASSWORD_FILE environment variable,
which configures MySQL to use the content of the
/run/secrets/wordpress-password-v1 file as the root password.
To create the WordPress service:
Navigate to Swarm > Services and click
Create.
Under Service Details, name the service wordpress.
Under Task Template, enter wordpress:latest.
In the left-side menu, navigate to Network, click
Attach Network +, and select wordpress-network from
the drop-down.
In the left-side menu, navigate to Environment, click
Use Secret +, and select wordpress-password-v1 from
the drop-down.
Click Confirm to associate the secret with the
service.
Scroll down to Environment variables and click
Add Environment Variable +.
Enter the following string to create an environment variable that contains
the path to the password file in the container:
Add another environment variable and enter the following string:
WORDPRESS_DB_HOST=wordpress-db:3306
If you specified a permission label on the secret, you must set the
same permission label on this service.
Click Create to deploy the WordPress service.
This creates a WordPress service that is attached to the same network as the
MySQL service so that they can communicate, and maps port 80 of the
service to port 8000 of the cluster routing mesh.
Once you deploy this service, you will be able to access it on port 8000 using
the IP address of any node in your MKE cluster.
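For comparison, the two services built in the web UI above can be approximated in a single Compose file. This is a sketch under stated assumptions, not the documented procedure: it assumes the wordpress-password-v1 secret and the wordpress-network network were already created as described, and it assumes the WordPress image reads its database password from the WORDPRESS_DB_PASSWORD_FILE variable, which this topic does not name.

```yaml
version: "3.1"

services:
  wordpress-db:
    image: mysql:5.7
    networks:
      - wordpress-network
    secrets:
      - wordpress-password-v1         # mounted at /run/secrets/wordpress-password-v1
    environment:
      MYSQL_ROOT_PASSWORD_FILE: /run/secrets/wordpress-password-v1

  wordpress:
    image: wordpress:latest
    networks:
      - wordpress-network
    ports:
      - "8000:80"                     # port 8000 on the routing mesh -> port 80 in the service
    secrets:
      - wordpress-password-v1
    environment:
      WORDPRESS_DB_HOST: wordpress-db:3306
      # Assumption: variable name supported by the WordPress image
      WORDPRESS_DB_PASSWORD_FILE: /run/secrets/wordpress-password-v1

networks:
  wordpress-network:
    external: true                    # created beforehand, outside this file

secrets:
  wordpress-password-v1:
    external: true                    # created beforehand, outside this file
```

Marking the secret as external keeps the plaintext value out of the stack definition, which matters because the definition is typically stored in version control.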
To update a secret:
If the secret is compromised, you need to change it, update the services
that use it, and delete the old secret.
Create a new secret named wordpress-password-v2.
From Swarm > Secrets, select the
wordpress-password-v1 secret to view all the services that you
need to update. In this example, it is straightforward, but that will not
always be the case.
Update wordpress-db to use the new secret.
Update the MYSQL_ROOT_PASSWORD_FILE environment variable with either
of the following methods:
Update the environment variable directly with the following:
Mount the secret file in /run/secrets/wordpress-password-v1 by setting
the Target Name field to wordpress-password-v1. This
mounts the file with the wordpress-password-v2 content in
/run/secrets/wordpress-password-v1.
Delete the wordpress-password-v1 secret and click Update.
Repeat the foregoing steps for the WordPress service.
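The target-name method has a Compose equivalent in the long secret syntax, sketched below for the database service only. It assumes wordpress-password-v2 already exists as a secret; because the new secret is mounted under the old file name, the MYSQL_ROOT_PASSWORD_FILE value does not change.

```yaml
version: "3.1"

services:
  wordpress-db:
    image: mysql:5.7
    secrets:
      # Mount the v2 secret under the v1 file name
      - source: wordpress-password-v2
        target: wordpress-password-v1
    environment:
      MYSQL_ROOT_PASSWORD_FILE: /run/secrets/wordpress-password-v1

secrets:
  wordpress-password-v2:
    external: true                    # created beforehand, outside this file
```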
MKE includes a system for application-layer (layer 7) routing that offers both
application routing and load balancing (ingress routing) for Swarm
orchestration. The Interlock architecture leverages Swarm components to provide
scalable layer 7 routing and layer 4 VIP mode functionality.
Swarm mode provides MCR with a routing mesh, which enables users to access
services using the IP address of any node in the swarm. Layer 7 routing enables
you to access services through any node in the swarm by using a domain name,
with Interlock routing the traffic to the node with the relevant container.
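As a rough illustration of domain-based routing, the Compose fragment below publishes a service through Interlock using service labels. Treat it as a sketch rather than a definitive configuration: the service name demo, the image example/demo, the network demo-network, the host name app.example.org, and port 8080 are all hypothetical, and the network is assumed to have been created beforehand so that its name matches the label value exactly.

```yaml
version: "3.2"

services:
  demo:
    image: example/demo               # hypothetical image
    networks:
      - demo-network
    deploy:
      replicas: 2
      labels:
        # Domain name that Interlock routes to this service
        com.docker.lb.hosts: app.example.org
        # Overlay network the proxy uses to reach the service tasks
        com.docker.lb.network: demo-network
        # Port the application listens on inside the container
        com.docker.lb.port: 8080

networks:
  demo-network:
    external: true                    # pre-created overlay network (assumption)
```

The service publishes no ports of its own; requests for app.example.org arrive at the proxy service, which forwards them over the shared overlay network.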
Interlock uses the Docker remote API to automatically configure extensions such
as NGINX and HAProxy for application traffic. Interlock is designed for:
Full integration with MCR, including Swarm services, secrets, and configs
Enhanced configuration, including context roots, TLS, zero downtime
deployment, and rollback
Support through extensions for external load balancers, such as NGINX,
HAProxy, and F5
Least privilege for extensions, such that they have no Docker API access
Note
Interlock and layer 7 routing are used for Swarm deployments. Refer to
Use Istio Ingress for Kubernetes for information on routing traffic to your Kubernetes
applications.
Interlock
The central piece of the layer 7 routing solution. The core service is
responsible for interacting with the Docker remote API and building an
upstream configuration for the extensions. Interlock uses the Docker API to
monitor events and to manage the extension and proxy services, and it serves
the upstream configuration over a gRPC API that the extensions are configured
to access.
Interlock manages extension and proxy service updates for both configuration
changes and application service deployments. No operator intervention is
required.
The Interlock service starts a single replica on a manager node. The Interlock
extension service runs a single replica on any available node, and the
Interlock proxy service starts two replicas on any available node. Interlock
prioritizes replica placement in the following order:
Replicas on the same worker node
Replicas on different worker nodes
Replicas on any available nodes, including managers
Interlock extension
A secondary service that queries the Interlock gRPC API for the
upstream configuration. The extension service configures the proxy service
according to the upstream configuration. For proxy services that use
configuration files, such as NGINX or HAProxy, the extension service generates
the file and sends it to Interlock using the gRPC API. Interlock then updates
the corresponding Docker configuration object for the proxy service.
Interlock proxy
A proxy and load-balancing service that handles requests for the
upstream application services. Interlock configures these using the data
created by the corresponding extension service. By default, this service is a
containerized NGINX deployment.
All layer 7 routing components are failure-tolerant and leverage Docker Swarm
for high availability.
Automatic configuration
Interlock uses the Docker API for automatic configuration, without needing you
to manually update or restart anything to make services available. MKE
monitors your services and automatically reconfigures proxy services.
Scalability
Interlock uses a modular design with a separate proxy service, allowing an
operator to individually customize and scale the proxy layer to handle user
requests and meet service demands, with transparency and no downtime for
users.
TLS
You can leverage Docker secrets to securely manage TLS certificates and keys
for your services. Interlock supports both TLS termination and TCP
passthrough.
Context-based routing
Interlock supports advanced application request routing by context or path.
Host mode networking
Layer 7 routing leverages the Docker Swarm routing mesh by default, but
Interlock also supports running proxy and application services in host mode
networking, allowing you to bypass the routing mesh completely, thus promoting
maximum application performance.
Security
The layer 7 routing components that are exposed to the outside world run on
worker nodes, thus your cluster will not be affected if they are compromised.
SSL
Interlock leverages Docker secrets to securely store and use SSL certificates
for services, supporting both SSL termination and TCP passthrough.
Blue-green and canary service deployment
Interlock supports blue-green service deployment, allowing an operator to
deploy a new application while the current version is serving. Once traffic
to the new application is verified, the operator can scale the older version
to zero. If there is a problem, the operation is easy to reverse.
Service cluster support
Interlock supports multiple extension and proxy service combinations,
allowing operators to partition load balancing resources for use, for
example, in region- or organization-based load balancing.
Least privilege
Interlock can be deployed so that the load balancing proxies do not need
to be colocated with a Swarm manager. This is a more secure approach to
deployment as it ensures that the extension and proxy services do not have
access to the Docker API.
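To make the context-based routing feature listed above more concrete, the following Compose sketch routes only requests under a given path to a service. All names are hypothetical (service demo, image example/demo, network demo-network, host app.example.org, path /app), and the com.docker.lb.context_root labels are assumed to be available in your Interlock version.

```yaml
version: "3.2"

services:
  demo:
    image: example/demo               # hypothetical image
    networks:
      - demo-network
    deploy:
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo-network
        com.docker.lb.port: 8080
        # Route only requests whose path starts with /app to this service
        com.docker.lb.context_root: /app
        # Strip the /app prefix before the request reaches the application
        com.docker.lb.context_root_rewrite: "true"

networks:
  demo-network:
    external: true                    # pre-created overlay network (assumption)
```

The boolean label value is quoted because Compose label values must be strings or numbers.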
This topic describes various ways to optimize your Interlock
deployments. First, it will be helpful to review the stages of an Interlock
deployment. The following process occurs each time you update an application:
The user updates a service with a new version of an application.
The default stop-first policy stops the first replica before
scheduling the second. The Interlock proxies remove ip1.0 from the
back-end pool as the app.1 task is removed.
Interlock reschedules the first application task with the new image after
the first task stops.
Interlock reschedules proxy.1 with the new NGINX configuration
containing the new app.1 task update.
After proxy.1 is complete, proxy.2 redeploys with the updated NGINX
configuration for the app.1 task.
In this scenario, the service is unavailable for less than 30 seconds.
Using --update-order, Swarm allows you to control the order in which tasks
are stopped when you replace them with new tasks:
| Optimization type | Description |
| --- | --- |
| stop-first (default) | Configures the old task to stop before the new task starts. Use this if the old and new tasks cannot serve clients at the same time. |
| start-first | Configures the old task to stop after the new task starts. Use this if you have a single application replica and you cannot have service interruption. This optimizes for high availability. |
To optimize the order in which you update your application,
[need-instructions-from-sme].
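One possible way to express the update order, shown here only as a sketch, is the update_config section of a Compose file. The service name app and image example/app:2.0 are hypothetical, and the order key assumes Compose file format 3.4 or later.

```yaml
version: "3.4"

services:
  app:
    image: example/app:2.0            # hypothetical image
    deploy:
      replicas: 1
      update_config:
        # Start the replacement task before stopping the old one
        order: start-first
```

With a single replica, start-first briefly runs two tasks side by side, so it suits only applications whose old and new versions can serve clients at the same time.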
To set an application update delay:
Using update-delay, Swarm allows you to control how long it takes an
application to update by adding a delay between updating tasks. The delay
occurs between the time when the first task enters a healthy state and when the
next task begins its update. The default is 0 seconds, meaning there is no
delay.
Use update-delay if either of the following applies:
You can tolerate a longer update cycle with the benefit of fewer dropped
connections.
Interlock update convergence takes a long time in your environment, often
due to having a large number of overlay networks.
Do not use update-delay if either of the following applies:
You need service updates to occur rapidly.
The old and new tasks cannot serve clients at the same time.
To set the update delay, [need-instructions-from-sme].
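As a hedged example, the delay can be declared per service in a Compose file. The service name, image, replica count, and the 30-second value below are illustrative assumptions; choose a delay that reflects how long Interlock takes to converge in your environment.

```yaml
version: "3.4"

services:
  app:
    image: example/app:2.0            # hypothetical image
    deploy:
      replicas: 4
      update_config:
        parallelism: 1                # update one task at a time
        delay: 30s                    # wait 30 seconds between task updates
```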
To configure application health checks:
Using health-cmd, Swarm allows you to check application health to ensure
that updates do not cause service interruption. Without using health-cmd,
Swarm considers an application healthy as soon as the container process is
running, even if the application is not yet capable of serving clients, thus
leading to dropped connections. You can configure health-cmd using either
a Dockerfile or a Compose file.
To configure health-cmd, [need-instructions-from-sme].
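In a Compose file, the counterpart of health-cmd is the healthcheck section. The sketch below assumes a hypothetical application that exposes an HTTP health endpoint at /health on port 8080 and whose image contains curl; the endpoint, the timings, and the image name are all assumptions to adapt.

```yaml
version: "3.4"

services:
  app:
    image: example/app:2.0            # hypothetical image that includes curl
    healthcheck:
      # The task is reported healthy only when this command exits with 0
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s                   # time between checks
      timeout: 3s                     # a check running longer than this counts as failed
      retries: 3                      # consecutive failures before the task is unhealthy
      start_period: 15s               # startup grace period before failures count
```

With a health check in place, Swarm waits for the new task to report healthy before it treats the task as ready, which is what keeps traffic away from containers that are running but not yet serving.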
To configure an application stop grace period:
Using stop-grace-period, Swarm allows you to set the maximum wait time
before it force-kills a task. A task can run no longer than the value of this
setting after initiating its shutdown cycle. The default is 10 seconds. Use
longer wait times for applications that require long periods to process
requests, allowing connections to terminate normally.
To configure stop-grace-period, [need-instructions-from-sme].
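A sketch of the same setting in Compose form follows. The 30-second value and the service definition are illustrative assumptions; pick a value slightly longer than your slowest expected request.

```yaml
version: "3.4"

services:
  app:
    image: example/app:2.0            # hypothetical image
    # Allow up to 30 seconds for in-flight requests to finish
    # before the task is force-killed
    stop_grace_period: 30s
```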
To use service clusters for Interlock segmentation:
Interlock can be segmented into multiple logical instances called service
clusters, with independently-managed proxies. Application traffic can be
fully-segmented, as it only uses the proxies for a particular service cluster.
Each service cluster only connects to the networks that use that specific
service cluster, reducing the number of overlay networks that proxies connect
to. The use of separate proxies enables service clusters to reduce the amount
of load balancer configuration churn during service updates.
To configure service clusters, [need-instructions-from-sme].
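On the application side, a service is typically tied to a service cluster through a label. The fragment below is a sketch only: it assumes a service cluster named us-east has already been defined in the Interlock configuration, and the service, image, host, and network names are hypothetical. It does not show the Interlock configuration itself.

```yaml
version: "3.2"

services:
  demo-east:
    image: example/demo               # hypothetical image
    networks:
      - demo-east
    deploy:
      labels:
        com.docker.lb.hosts: demo.example.org
        com.docker.lb.network: demo-east
        com.docker.lb.port: 8080
        # Only the proxies of this service cluster route to the service
        com.docker.lb.service_cluster: us-east

networks:
  demo-east:
    external: true                    # pre-created overlay network (assumption)
```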
To minimize the number of overlay networks:
Every overlay network connected to Interlock adds one to two seconds of
additional update delay, and too many connected networks cause the load
balancer configuration to be out of date for too long, resulting in dropped
traffic.
The following are two different ways you can minimize the number of overlay
networks that Interlock connects to:
Group applications together to share a network if the architecture
permits doing so.
Use Interlock service clusters, as they segment which networks are
connected to Interlock, reducing the number of networks each proxy is
connected to. Also use admin-defined networks to limit the number of
networks per service cluster.
To use Interlock VIP Mode:
Using VIP mode, Interlock allows you to reduce the impact of application
updates on the Interlock proxies. It uses the Swarm L4 load balancing VIPs
instead of individual task IPs to load balance traffic to a more stable
internal endpoint. This prevents the proxy load balancer configurations from
changing for most kinds of app service updates, thus reducing Interlock churn.
These are the features that VIP mode supports:
Host and context routing
Context root rewrites
Interlock TLS termination
TLS passthrough
Service clusters
These are the features that VIP mode does not support:
Sticky sessions
Websockets
Canary deployments
To use Interlock VIP mode, [need-instructions-from-sme].
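VIP mode is usually selected per service with a label. The sketch below assumes the com.docker.lb.backend_mode label is supported by your Interlock version; the service, image, host, and network names are hypothetical.

```yaml
version: "3.2"

services:
  demo:
    image: example/demo               # hypothetical image
    networks:
      - demo-network
    deploy:
      replicas: 4
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo-network
        com.docker.lb.port: 8080
        # Load balance to the Swarm service VIP instead of individual task IPs
        com.docker.lb.backend_mode: vip

networks:
  demo-network:
    external: true                    # pre-created overlay network (assumption)
```

Because the proxy targets the service VIP, scaling the service or replacing tasks should not by itself require a new proxy configuration.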
This section describes how to customize layer 7 routing by updating the
ucp-interlock service with a new Docker configuration, including
configuration options and the procedure for creating a proxy service.
Optional. If you provide an invalid configuration, the ucp-interlock
service is configured to roll back to a previous stable configuration, by
default. Configure the service to pause instead of rolling back:
The following options are available to configure the extensions. Interlock must
contain at least one extension to service traffic.
| Option | Type | Description |
| --- | --- | --- |
| Image | string | Name of the Docker image to use for the extension. |
| Args | []string | Arguments to pass to the extension service. |
| Labels | map[string]string | Labels to add to the extension service. |
| Networks | []string | Allows the administrator to select a list of networks that Interlock can connect to. If this option is not specified, the proxy service can connect to all networks. |
| ContainerLabels | map[string]string | Labels for the extension service tasks. |
| Constraints | []string | One or more constraints to use when scheduling the extension service. |
| PlacementPreferences | []string | One or more placement preferences. |
| ServiceName | string | Name of the extension service. |
| ProxyImage | string | Name of the Docker image to use for the proxy service. |
| ProxyArgs | []string | Arguments to pass to the proxy service. |
| ProxyLabels | map[string]string | Labels to add to the proxy service. |
| ProxyContainerLabels | map[string]string | Labels to add to the proxy service tasks. |
| ProxyServiceName | string | Name of the proxy service. |
| ProxyConfigPath | string | Path in the service for the generated proxy configuration. |
| ProxyReplicas | uint | Number of proxy service replicas. |
| ProxyStopSignal | string | Stop signal for the proxy service. For example, SIGQUIT. |
| ProxyStopGracePeriod | string | Stop grace period for the proxy service in seconds. For example, 5s. |
| ProxyConstraints | []string | One or more constraints to use when scheduling the proxy service. Set the variable to false, as it is currently set to true by default. |
| ProxyPlacementPreferences | []string | One or more placement preferences to use when scheduling the proxy service. |
| ProxyUpdateDelay | string | Delay between rolling proxy container updates. |
| ServiceCluster | string | Name of the cluster that this extension serves. |
| PublishMode | string (ingress or host) | Publish mode that the proxy service uses. |
| PublishedPort | int | Port on which the proxy service serves non-SSL traffic. |
| PublishedSSLPort | int | Port on which the proxy service serves SSL traffic. |
| Template | int | Docker configuration object that is used as the extension template. |
| Config | config | Proxy configuration used by the extensions as described in this section. |
| HitlessServiceUpdate | bool | When set to true, services can be updated without restarting the proxy container. |
| ConfigImage | config | Name for the config service used by hitless service updates. For example, mirantis/ucp-interlock-config:3.2.1. |
| ConfigServiceName | config | Name of the config service. This name is equivalent to ProxyServiceName. For example, ucp-interlock-config. |
Options are available to the extensions, and the extensions use the options
needed for proxy service configuration. This provides overrides to the
extension configuration.
Because Interlock passes the extension configuration directly to the
extension, each extension has different configuration options available.
The default proxy service used by MKE to provide layer 7 routing is
NGINX. If users try to access a route that has not been configured, they
will see the default NGINX 404 page.
You can customize this by labeling a service with
com.docker.lb.default_backend=true. If users try to access a route that is
not configured, they will be redirected to the custom service.
If you want to customize the default NGINX proxy service used by MKE to provide
layer 7 routing, follow the steps below to create an example proxy service
where users will be redirected if they try to access a route that is not
configured.
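A minimal sketch of such a default back-end service is shown below as a Compose file. The httpd image, the service name demo, and the network name are illustrative assumptions; the essential part is the com.docker.lb.default_backend label, which this topic names.

```yaml
version: "3.2"

services:
  demo:
    image: httpd                      # illustrative image that answers on port 80
    networks:
      - demo-network
    deploy:
      replicas: 1
      labels:
        # Requests for routes that are not configured land on this service
        com.docker.lb.default_backend: "true"
        com.docker.lb.port: 80

networks:
  demo-network:
    driver: overlay
```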
Layer 7 routing components communicate with one another by default
using overlay networks, but Interlock also supports host mode networking
in a variety of ways, including proxy only, Interlock only, application only,
and hybrid.
When using host mode networking, you cannot use DNS service discovery,
since that functionality requires overlay networking. For services to
communicate, each service needs to know the IP address of the node where
the other service is running.
Note
Use an alternative to DNS service discovery such as Registrator if you
require this functionality.
The following is a high-level overview of how to use host mode instead of
overlay networking:
Update the ucp-interlock configuration.
Deploy your Swarm services.
Configure proxy services.
If you have not already done so, configure the layer 7 routing solution for
production with the ucp-interlock-proxy service replicas running
on their own dedicated nodes.
This section describes how to deploy an example Swarm service on an eight-node
cluster using host mode networking to route traffic without using overlay
networks. The cluster has three manager nodes and five worker nodes, with two
workers configured as dedicated ingress cluster load balancer nodes that will
receive all application traffic.
This example does not cover the actual infrastructure deployment, and assumes
you have a typical Swarm cluster created using docker swarm init and
docker swarm join from the nodes.
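For orientation, an application service published in host mode might be declared as in the following sketch. It covers only the application side of the setup, not the ucp-interlock configuration, and every name in it is hypothetical (service demo, image example/demo, host app.example.org, port 8080).

```yaml
version: "3.2"

services:
  demo:
    image: example/demo               # hypothetical image
    ports:
      # Long port syntax; mode: host publishes the port directly on the
      # node running the task, bypassing the routing mesh
      - target: 8080
        published: 8080
        protocol: tcp
        mode: host
    deploy:
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.port: 8080
```

Because a host-mode port is bound on the node itself, no more than one task of this service can run on a given node with the same published port.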
By default, NGINX is used as a proxy. The following configuration options are
available for the NGINX extension.
Note
The ServerNamesHashBucketSize option, which allowed the user to manually
set the bucket size for the server names hash table, was removed in MKE
3.3.10 because MKE now adaptively calculates the setting and overrides any
manual input.
| Option | Type | Description | Default |
| --- | --- | --- | --- |
| User | string | User name for the proxy | nginx |
| PidPath | string | Path to the PID file for the proxy service | /var/run/proxy.pid |
| MaxConnections | int | Maximum number of connections for the proxy service | 1024 |
| ConnectTimeout | int | Timeout in seconds for clients to connect | 600 |
| SendTimeout | int | Timeout in seconds for the service to send a request to the proxied upstream | 600 |
| ReadTimeout | int | Timeout in seconds for the service to read a response from the proxied upstream | 600 |
| SSLOpts | int | Options to be passed when configuring SSL | N/A |
| SSLDefaultDHParam | int | Size of DH parameters | 1024 |
| SSLDefaultDHParamPath | string | Path to DH parameters file | N/A |
| SSLVerify | string | SSL client verification | required |
| WorkerProcesses | string | Number of worker processes for the proxy service | 1 |
| RLimitNoFile | int | Maximum number of open files for the proxy service | 65535 |
| SSLCiphers | string | SSL ciphers to use for the proxy service | HIGH:!aNULL:!MD5 |
| SSLProtocols | string | Enable the specified TLS protocols | TLSv1.2 |
| HideInfoHeaders | bool | Hide proxy-related response headers | N/A |
| KeepaliveTimeout | string | Connection keep-alive timeout | 75s |
| ClientMaxBodySize | string | Maximum allowed client request body size | 1m |
| ClientBodyBufferSize | string | Buffer size for reading client request body | 8k |
| ClientHeaderBufferSize | string | Buffer size for reading client request header | 1k |
| LargeClientHeaderBuffers | string | Maximum number and size of buffers used for reading large client request header | 4 8k |
| ClientBodyTimeout | string | Timeout for reading client request body | 60s |
| UnderscoresInHeaders | bool | Enables or disables the use of underscores in client request header fields | false |
| UpstreamZoneSize | int | Size of the shared memory zone (in KB) | 64 |
| GlobalOptions | []string | List of options that are included in the global configuration | N/A |
| HTTPOptions | []string | List of options that are included in the HTTP configuration | N/A |
| TCPOptions | []string | List of options that are included in the stream (TCP) configuration | |
Change the action that Swarm takes when an update fails using
update-failure-action (the default is pause), for example, to
roll back to the previous configuration:
Change the amount of time between proxy updates using update-delay
(the default is to use rolling updates), for example, setting the delay to
thirty seconds:
This topic describes how to update Interlock services by first updating
the Interlock configuration to specify the new extension or proxy image
versions and then updating the Interlock services to use the new configuration
and image.
This topic describes how to route traffic to Swarm services by deploying
a layer 7 routing solution into a Swarm-orchestrated cluster. It has the
following prerequisites:
Enabling layer 7 routing causes the following to occur:
MKE creates the ucp-interlock overlay network.
MKE deploys the ucp-interlock service and attaches it both to the
Docker socket and the overlay network that was created. This allows
the Interlock service to use the Docker API, which is why this service needs
to run on a manager node.
The ucp-interlock service starts the ucp-interlock-extension
service and attaches it to the ucp-interlock network, allowing both
services to communicate.
The ucp-interlock-extension generates a configuration for the proxy
service to use. By default the proxy service is NGINX, so this
service generates a standard NGINX configuration. MKE creates the
com.docker.ucp.interlock.conf-1 configuration file and uses it to
configure all the internal components of this service.
The ucp-interlock service takes the proxy configuration and uses
it to start the ucp-interlock-proxy service.
Note
Layer 7 routing is disabled by default.
To enable layer 7 routing using the MKE web UI:
Log in to the MKE web UI as an administrator.
Navigate to <user-name> > Admin Settings.
Click Ingress.
Toggle the Swarm HTTP ingress slider to the right.
Optional. By default, the routing mesh service listens on port 8080 for HTTP
and 8443 for HTTPS. Change these ports if you already have services using
them.
The three primary Interlock services are the core service, the extensions,
and the proxy. The following is the default MKE configuration, which is created
automatically when you enable Interlock as described in this topic.
The value of LargeClientHeaderBuffers indicates the number of buffers to
use to read a large client request header, as well as the size of those
buffers.
To enable layer 7 routing from the command line:
Interlock uses a TOML file for the core service configuration. The following
example uses Swarm deployment and recovery features by creating a Docker
config object.
The Interlock core service must have access to a Swarm manager
(--constraint node.role==manager); however, it is recommended that the
extension and proxy services run on workers.
Verify that the three services are created, one for the Interlock service,
one for the extension service, and one for the proxy service:
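One way to run this check is sketched below as a small helper. The helper name is hypothetical, and the name filter assumes that all three services share the `ucp-interlock` prefix used elsewhere in this topic.

```shell
# Hypothetical helper: list only the services whose names start with
# "ucp-interlock" and print one service name per line.
list_interlock_services() {
  docker service ls --filter name=ucp-interlock --format '{{.Name}}'
}
```

Calling `list_interlock_services` on a manager node should print the core, extension, and proxy service names once all three have been created.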
This topic describes how to configure Interlock for a production environment
and builds upon the instructions in the previous topic,
Deploy a layer 7 routing solution. It does not describe infrastructure deployment,
and it assumes you are using a typical Swarm cluster, created using
docker swarm init and docker swarm join from the nodes.
The layer 7 solution that ships with MKE is highly available, fault tolerant,
and designed to work independently of how many nodes you manage with MKE.
The following procedures require that you dedicate two worker nodes for running
the ucp-interlock-proxy service. This tuning ensures the following:
The proxy services have dedicated resources to handle user requests.
You can configure these nodes with higher performance network
interfaces.
No application traffic can be routed to a manager node, thus making
your deployment more secure.
If one of the two dedicated nodes fails, layer 7 routing continues
working.
To dedicate two nodes to running the proxy service:
Select two nodes that you will dedicate to running the proxy service.
Log in to one of the Swarm manager nodes.
Add labels to the two dedicated proxy service nodes, configuring them as
load balancer worker nodes, for example, lb-00 and lb-01:
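A minimal sketch of the labeling step, assuming the node names lb-00 and lb-01 from the example above and the nodetype=loadbalancer label that the proxy constraint in this procedure relies on. The helper name is hypothetical.

```shell
# Hypothetical helper: mark each given node as a load balancer worker.
# Run it on a Swarm manager node; pass one or more node names.
label_proxy_nodes() {
  for node in "$@"; do
    docker node update --label-add nodetype=loadbalancer "$node" || return 1
  done
}
```

For the two nodes in this example, the call would be `label_proxy_nodes lb-00 lb-01`.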
Update the ucp-interlock-proxy service so that it has two replicas, is
constrained to the workers with the label nodetype==loadbalancer,
and uses SIGQUIT as the stop signal for its tasks, with a
grace period of five seconds. This ensures that NGINX does not exit before
the client request is finished.
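The update described above can be sketched as a single `docker service update` call. The helper name is hypothetical. The replica count, constraint, stop signal, and grace period are taken from the description above.

```shell
# Hypothetical helper: pin the proxy service to the labeled nodes and
# let NGINX finish in-flight requests before a task is stopped.
pin_proxy_to_loadbalancers() {
  docker service update \
    --replicas=2 \
    --constraint-add node.labels.nodetype==loadbalancer \
    --stop-signal SIGQUIT \
    --stop-grace-period=5s \
    ucp-interlock-proxy
}
```

Run `pin_proxy_to_loadbalancers` on a Swarm manager node after the two nodes have been labeled.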
Inspect the service to verify that the replicas have started on the selected
nodes:
Optional. By default, the config service is global, scheduling one task on
every node in the cluster. To modify constraint scheduling, update the
ProxyConstraints variable in the Interlock configuration file. Refer
to Configure layer 7 routing service for more information.
Verify that the proxy service is running on the dedicated nodes:
docker service ps ucp-interlock-proxy
Update the settings in the upstream load balancer, such as ELB or F5, with
the addresses of the dedicated ingress workers, thus directing all traffic
to these two worker nodes.
To install Interlock on your cluster without an Internet connection, you must
have the required Docker images loaded on your computer. This topic describes
how to export the required images from a local instance of MCR and then load
them to your Swarm-orchestrated cluster.
To export Docker images from a local instance:
Using a local instance of MCR, save the required images:
interlock-extension-nginx.tar - the Interlock extension
for NGINX.
interlock-proxy-nginx.tar - the official NGINX image based
on Alpine.
Note
Replace
mirantis/ucp-interlock-extension:3.3.16
and mirantis/ucp-interlock-proxy:3.3.16
with the corresponding extension and proxy image if you are not using
NGINX.
Copy the three files you just saved to each node in the cluster and load
each image:
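A minimal sketch of the export and load steps, assuming the standard `docker save` and `docker load` commands. The helper names are hypothetical. The image references and archive names are whatever applies to your release, for example the ones named in the note above.

```shell
# Hypothetical helpers for moving Interlock images to an offline cluster.

# Save one image to a tar archive on the machine that has the images.
#   $1 = image reference, $2 = output tar file
save_image() {
  docker save --output "$2" "$1"
}

# Load every tar archive given as an argument.
# Run this on each node in the cluster after copying the files over.
load_images() {
  for archive in "$@"; do
    docker load --input "$archive" || return 1
  done
}
```

For example, `save_image mirantis/ucp-interlock-extension:3.3.16 interlock-extension-nginx.tar` writes the extension archive, and `load_images interlock-extension-nginx.tar interlock-proxy-nginx.tar` loads the listed archives on a node.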
After Interlock is deployed, you can launch and publish services and
applications. This topic describes how to configure services to publish
themselves to the load balancer by using service labels.
Caution
The following procedures assume a DNS entry exists for each of the
applications (or local hosts entry for local testing).
To publish a demo service with four replicas to the host (demo.local):
Create a Docker service using the following two labels:
com.docker.lb.hosts for Interlock to determine where the service is
available.
com.docker.lb.port for the proxy service to determine which port to
use to access the upstreams.
Create an overlay network so that service traffic is isolated and
secure:
com.docker.lb.hosts
Defines the hostname for the service. When the layer 7 routing solution
gets a request containing app.example.org in the host header, that
request is forwarded to the demo service.
com.docker.lb.network
Defines which network the ucp-interlock-proxy should attach to in
order to communicate with the demo service. To use layer 7 routing, you
must attach your services to at least one network. If your service is
attached to a single network, you do not need to add a label to specify
which network to use for routing. When using a common stack file for
multiple deployments leveraging MKE Interlock and layer 7 routing,
prefix com.docker.lb.network with the stack name to ensure traffic
is directed to the correct overlay network. In combination with
com.docker.lb.ssl_passthrough, the label is mandatory even if your
service is only attached to a single network.
com.docker.lb.port
Specifies which port the ucp-interlock-proxy service should use to
communicate with this demo service. Your service does not need to expose
a port in the Swarm routing mesh. All communications are done using the
network that you have specified.
The ucp-interlock service detects that your service is using these
labels and automatically reconfigures the ucp-interlock-proxy service.
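The labels described above can be combined as in the following sketch. The helper name, the choice to name the overlay network after the service, and the arguments are assumptions for illustration. The four replicas match the demo service in this procedure.

```shell
# Hypothetical helper: publish a service through Interlock using labels.
#   $1 = service name, $2 = hostname, $3 = container port, $4 = image
publish_service() {
  name="$1"; host="$2"; port="$3"; image="$4"

  # Dedicated overlay network, named after the service for simplicity.
  docker network create --driver overlay "$name" || return 1

  docker service create \
    --name "$name" \
    --network "$name" \
    --replicas 4 \
    --label com.docker.lb.hosts="$host" \
    --label com.docker.lb.network="$name" \
    --label com.docker.lb.port="$port" \
    "$image"
}
```

For the demo in this topic, a call might look like `publish_service demo demo.local <port> <image>`, where the port and image are the ones your application uses.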
Optional. Increase traffic to the new version by adding more replicas. For
example:
docker service scale demo-v2=4
Example output:
demo-v2
Complete the upgrade by scaling the demo-v1 service to zero replicas:
docker service scale demo-v1=0
Example output:
demo-v1
This routes all application traffic to the new version. If you need to roll
back your service, scale the v1 service back up and the v2 service back
down.
Interlock detects when the service is available and publishes it.
Note
Interlock only supports one path per host for each service cluster. When
a specific com.docker.lb.hosts label is applied, it cannot be
applied again in the same service cluster.
After the tasks are running and the proxy service is updated, the
application is available at http://demo.local:
Task mode
Interlock uses back-end task IPs to route traffic from the
proxy to each container. Traffic to the front-end route is layer 7 load
balanced directly to service tasks. This allows for routing
functionality such as sticky sessions for each container. Task routing
mode applies layer 7 routing and then sends packets directly to a
container.
VIP mode
Interlock uses the Swarm service VIP as the back-end IP instead of
using container IPs. Traffic to the front-end route is layer 7 load
balanced to the Swarm service VIP, which layer 4 load balances to
back-end tasks. VIP mode is useful for reducing the amount of churn in
Interlock proxy service configurations, which can be an advantage in
highly dynamic environments.
VIP mode optimizes for fewer proxy updates with the tradeoff of a
reduced feature set. Most application updates do not require configuring
back ends in VIP mode. In VIP routing mode, Interlock uses the service
VIP, which is a persistent endpoint that exists from service creation to
service deletion, as the proxy back end. VIP routing mode applies layer
7 routing and then sends packets to the Swarm layer 4 load balancer,
which routes traffic to service containers.
Canary deployments
In task mode, a canary service with one task next to an existing service
with four tasks represents one out of five total tasks, so the canary
will receive 20% of incoming requests.
Because VIP mode routes by service IP rather than by task IP, it affects
the behavior of canary deployments. In VIP mode, a canary service with
one task next to an existing service with four tasks will receive 50%
of incoming requests, as it represents one out of two total services.
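The arithmetic behind the two figures can be sketched as follows, assuming that the proxy spreads requests evenly across its upstreams (tasks in task mode, service VIPs in VIP mode). The function names are hypothetical, and the results use integer division.

```shell
# Percentage of requests that a canary receives.

# Task mode: traffic is split per task.
#   $1 = canary tasks, $2 = tasks in the existing service
canary_share_task_mode() {
  echo $(( 100 * $1 / ($1 + $2) ))
}

# VIP mode: traffic is split per service VIP, regardless of task counts.
#   $1 = number of services behind the same host, canary included
canary_share_vip_mode() {
  echo $(( 100 / $1 ))
}

canary_share_task_mode 1 4   # 1 of 5 tasks    -> prints 20
canary_share_vip_mode 2      # 1 of 2 services -> prints 50
```

In VIP mode, adding replicas to the existing service does not change the canary share. Only the number of services behind the host does.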
You can set each service to use either the task or the VIP back-end routing
mode. Task mode is the default and is used if a label is not specified or if it
is set to task.
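A minimal sketch of switching the mode on an existing service. The label name `com.docker.lb.backend_mode` does not appear in this section and is an assumption based on the Interlock service label reference. The helper name is hypothetical.

```shell
# Hypothetical helper: set the back-end routing mode on an existing service.
#   $1 = service name, $2 = mode ("task" or "vip")
set_backend_mode() {
  # Reject anything other than the two modes described in this topic.
  case "$2" in
    task|vip) ;;
    *) echo "mode must be 'task' or 'vip'" >&2; return 1 ;;
  esac
  docker service update --label-add com.docker.lb.backend_mode="$2" "$1"
}
```

For example, `set_backend_mode demo vip` would switch a service named demo to VIP mode, and `set_backend_mode demo task` would switch it back.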
Interlock detects when the service is available and publishes it. After
tasks are running and the proxy service is updated, the application is
available at any URL that is not configured.
In this example, Interlock configures a single upstream for the host using
IP 10.0.2.9. Interlock skips further proxy updates as long as
there is at least one replica for the service, as the only upstream is
the VIP.
Interlock detects when the service is available and publishes it.
After tasks are running and the proxy service is updated, the application is
available through http://new.local with a redirect configured that sends
http://old.local to http://new.local:
Reconfiguring the single proxy service that Interlock manages by default can
take one to two seconds for each overlay network that the proxy manages. You
can scale up to a larger number of Interlock-routed networks and services
by implementing a service cluster. Service clusters use Interlock to manage
multiple proxy services, each responsible for routing to a separate set of
services and their corresponding networks, thereby minimizing proxy
reconfiguration time.
This topic and the next assume that the following prerequisites have been met:
You have an operational MKE cluster with at least two worker nodes
(mke-node-0 and mke-node-1), which you will use as dedicated proxy
servers for two independent Interlock service clusters.
You have enabled Interlock with an HTTP port of 80 and an HTTPS port
of 8443.
From a manager node, apply node labels to the MKE workers that you have
chosen to use as your proxy servers:
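A minimal sketch of the labeling step, assuming that each proxy node gets the nodetype=loadbalancer label plus a second label that identifies its service cluster. The `region` label key, its values, and the helper name are assumptions for illustration. Use whatever key your service cluster constraints reference.

```shell
# Hypothetical helper: label a worker as the proxy node for one service cluster.
#   $1 = node name, $2 = value for the illustrative "region" label
label_cluster_proxy_node() {
  docker node update \
    --label-add nodetype=loadbalancer \
    --label-add region="$2" \
    "$1"
}
```

With the two workers named in the prerequisites, the calls might be `label_cluster_proxy_node mke-node-0 east` and `label_cluster_proxy_node mke-node-1 west`.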