Introduction

This documentation describes how to deploy and operate Mirantis Kubernetes Engine (MKE). It is intended to help operators understand the core concepts of the product and provides the information necessary to deploy and operate the solution.

The information in this documentation set is continually improved and amended based on feedback and requests from MKE users.

Product Overview

Mirantis Kubernetes Engine (MKE, formerly Universal Control Plane or UCP) is the industry-leading container orchestration platform for developing and running modern applications at scale, on private clouds, public clouds, and on bare metal.

MKE delivers immediate value to your business by allowing you to adopt modern application development and delivery models that are cloud-first and cloud-ready. With MKE you get a centralized place with a graphical UI to manage and monitor your Kubernetes and/or Swarm cluster instance.

Your business benefits from using MKE as a container orchestration platform, especially in the following use cases:

More than one container orchestrator

Whether your applications are complex and require medium to large clusters, or simple enough to deploy quickly in development environments, MKE gives you a choice of container orchestrators. Deploy Kubernetes, Swarm, or both types of clusters, and manage them on a single MKE instance, or centrally manage your instance using Mirantis Container Cloud.

Robust and scalable applications deployment

Whereas monolithic applications are the legacy approach, microservices are the modern way to deploy applications at scale. Delivering applications through an automated CI/CD pipeline can dramatically improve time-to-market and service agility, and adopting microservices becomes much easier when you use Kubernetes and/or Swarm clusters to deploy and test microservice-based applications.

Multi-tenant software offerings

Containerizing existing monolithic SaaS applications enables quicker development cycles and automated continuous integration and deployment. Such applications, however, must allow multiple users to share a single instance of the software. MKE can operate multi-tenant environments, isolate teams and organizations, separate cluster resources, and so on.

See also

Kubernetes

Reference Architecture

The MKE Reference Architecture provides a technical overview of Mirantis Kubernetes Engine (MKE). It is your source for the product hardware and software specifications, standards, component information, and configuration detail.

Introduction to MKE

Mirantis Kubernetes Engine (MKE) allows you to adopt modern application development and delivery models that are cloud-first and cloud-ready. With MKE you get a centralized place with a graphical UI to manage and monitor your Kubernetes and/or Swarm cluster instance.

The core MKE components are:

  • ucp-cluster-agent

    Reconciles the cluster-wide state, including Kubernetes add-ons such as Kubecompose and KubeDNS, manages the replication configuration of the etcd and RethinkDB clusters, and syncs the node inventories of SwarmKit and Swarm Classic. This component is a single-replica service that runs on any manager node in the cluster.

  • ucp-manager-agent

    Reconciles the node-local state on manager nodes, including the configuration of the local Docker daemon, local data volumes, certificates, and local container components. Each manager node in the cluster runs a task from this service.

  • ucp-worker-agent

    Performs the same reconciliation operations as ucp-manager-agent but on worker nodes. This component runs a task on each worker node.

The following MKE component names differ based on the node’s operating system:

Component name on Linux          Component name on Windows

ucp-worker-agent                 ucp-worker-agent-win
ucp-containerd-shim-process      ucp-containerd-shim-process-win
ucp-dsinfo                       ucp-dsinfo-win
No equivalent                    ucp-kube-binaries-win
ucp-pause                        ucp-pause-win

MKE hardware requirements

Take careful note of the minimum and recommended hardware requirements for MKE manager and worker nodes prior to deployment.

Note

  • High availability (HA) installations require transferring files between hosts.

  • Manager nodes support only the workloads that MKE itself requires to run.

  • Windows container images are typically larger than Linux container images. As such, provision more local storage for Windows nodes and for any MSR repositories that store Windows container images.

Minimum and recommended hardware requirements

Manager nodes, minimum requirements:

  • 16 GB of RAM

  • 2 vCPUs

  • 79 GB available storage, either unpartitioned for /var or partitioned as follows:

    • 25 GB for a single /var/ partition

    • 25 GB for /var/lib/kubelet/ (for installations and future upgrades)

    • 25 GB for /var/lib/docker/

    • 4 GB for /var/lib/containerd/

Worker nodes, minimum requirements:

  • 4 GB of RAM

  • 15 GB storage for the /var/ partition

Manager nodes, recommended requirements:

  • 24 - 32 GB of RAM

  • 4 vCPUs

  • At least 79 GB available storage, partitioned as follows:

    • 25 GB for a single /var/ partition

    • 25 GB for /var/lib/kubelet/ (for installations and future upgrades)

    • 25 GB for /var/lib/docker/

    • 4 GB for /var/lib/containerd/

Worker nodes, recommended requirements:

  • Recommendations vary depending on the workloads.

MKE software requirements

Prior to MKE deployment, consider the following software requirements:

  • Run the same MCR version (20.10.0 or later) on all nodes.

  • Run Linux kernel 3.10 or higher on all nodes.

    For debugging purposes, the host OS kernel versions should match as closely as possible.

  • Use a static IP address for each node in the cluster.
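
For example, you can confirm the MCR version and the kernel version on each node before installation. A minimal pre-flight sketch (commands only; compare the output against the requirements above):

docker version --format '{{.Server.Version}}'
uname -r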

Manager nodes

Manager nodes manage a swarm and persist the swarm state. Using several containers per node, the ucp-manager-agent automatically deploys all MKE components on manager nodes, including the MKE web UI and the data stores that MKE uses.

Note

Some Kubernetes components are run as Swarm services because the MKE control plane is itself a Docker Swarm cluster.

The following tables detail the MKE services that run on manager nodes:

Swarm services

MKE component

Description

ucp-auth-api

The centralized service for identity and authentication used by MKE and MSR.

ucp-auth-store

A container that stores authentication configurations and data for users, organizations, and teams.

ucp-auth-worker

A container that performs scheduled LDAP synchronizations and cleans authentication and authorization data.

ucp-client-root-ca

A certificate authority to sign client bundles.

ucp-cluster-agent

The agent that monitors the cluster-wide MKE components. Runs on only one manager node.

ucp-cluster-root-ca

A certificate authority used for TLS communication between MKE components.

ucp-controller

The MKE web server.

ucp-hardware-info

A container for collecting disk/hardware information about the host.

ucp-interlock

A container that monitors Swarm workloads configured to use layer 7 routing. Only runs when you enable layer 7 routing.

ucp-interlock-config

A service that manages Interlock configuration.

ucp-interlock-extension

A service that verifies the run status of the Interlock extension.

ucp-interlock-proxy

A service that provides load balancing and proxying for Swarm workloads. Runs only when layer 7 routing is enabled.

ucp-kube-apiserver

A master component that serves the Kubernetes API. It persists its state in etcd directly, and all other components communicate directly with the API server. The Kubernetes API server is configured to encrypt Secrets using AES-CBC with a 256-bit key. The encryption key is never rotated, and the encryption key is stored on manager nodes, in a file on disk.

ucp-kube-controller-manager

A master component that manages the desired state of controllers and other Kubernetes objects. It monitors the API server and performs background tasks when needed.

ucp-kubelet

The Kubernetes node agent running on every node, which is responsible for running Kubernetes pods, reporting the health of the node, and monitoring resource usage.

ucp-kube-proxy

The networking proxy running on every node, which enables pods to contact Kubernetes services and other pods by way of cluster IP addresses.

ucp-kube-scheduler

A master component that manages Pod scheduling, which communicates with the API server only to obtain workloads that need to be scheduled.

ucp-kv

A container used to store the MKE configurations. Do not use it in your applications, as it is for internal use only. Also used by Kubernetes components.

ucp-manager-agent

The agent that monitors the manager node and ensures that the right MKE services are running.

ucp-proxy

A TLS proxy that allows secure access from the local Mirantis Container Runtime to MKE components.

ucp-sf-notifier

A Swarm service that sends notifications to Salesforce when OpsCare alerts are configured and, later, when they are triggered.

ucp-swarm-manager

A container used to provide backward compatibility with Docker Swarm.

Kubernetes components

MKE component

Description

cri-dockerd-mke

An MKE service that accounts for the removal of dockershim from Kubernetes as of version 1.24, thus enabling MKE to continue using Docker as the container runtime.

k8s_calico-kube-controllers

A cluster-scoped Kubernetes controller used to coordinate Calico networking. Runs on one manager node only.

k8s_calico-node

The Calico node agent, which coordinates networking fabric according to the cluster-wide Calico configuration. Part of the calico-node DaemonSet. Runs on all nodes. Configure the container network interface (CNI) plugin using the --cni-installer-url flag. If this flag is not set, MKE uses Calico as the default CNI plugin.

k8s_enable-strictaffinity

An init container for Calico controller that sets the StrictAffinity in Calico networking according to the configured boolean value.

k8s_firewalld-policy_calico-node

An init container for calico-node that verifies whether systems with firewalld are compatible with Calico.

k8s_install-cni_calico-node

A container in which the Calico CNI plugin binaries are installed and configured on each host. Part of the calico-node DaemonSet. Runs on all nodes.

k8s_ucp-coredns_coredns

The CoreDNS plugin, which provides service discovery for Kubernetes services and Pods.

k8s_ucp-gatekeeper_gatekeeper-controller-manager

The Gatekeeper manager controller for Kubernetes that provides policy enforcement. Only runs when OPA Gatekeeper is enabled in MKE.

k8s_ucp-gatekeeper-audit_gatekeeper-audit

The audit controller for Kubernetes that provides audit functionality of OPA Gatekeeper. Only runs when OPA Gatekeeper is enabled in MKE.

k8s_ucp-kube-compose

A custom Kubernetes resource component that translates Compose files into Kubernetes constructs. Part of the Compose deployment. Runs on one manager node only.

k8s_ucp-kube-compose-api

The API server for Kube Compose, which is part of the compose deployment. Runs on one manager node only.

k8s_ucp-kube-ingress-controller

The Ingress controller for Kubernetes, which provides layer 7 routing for Kubernetes services. Runs only when Ingress for Kubernetes is enabled.

k8s_ucp-metrics-inventory

A container that generates the inventory targets for Prometheus server. Part of the Kubernetes Prometheus Metrics plugin.

k8s_ucp-metrics-prometheus

A container used to collect and process metrics for a node. Part of the Kubernetes Prometheus Metrics plugin.

k8s_ucp-metrics-proxy

A container that runs a proxy for the metrics server. Part of the Kubernetes Prometheus Metrics plugin.

k8s_ucp-node-feature-discovery-master

A container that provides node feature discovery labels for Kubernetes nodes.

k8s_ucp-node-feature-discovery-worker

A container that provides node feature discovery labels for Kubernetes nodes.

k8s_ucp-nvidia-device-partitioner

A container that provides support for Multi Instance GPU (MIG) on NVIDIA GPUs.

k8s_ucp-secureoverlay-agent

A container that provides a per-node service that manages the encryption state of the data plane.

k8s_POD_ucp-secureoverlay-mgr

A container that provides the key management process that configures and periodically rotates the encryption keys.

Kubernetes pause containers

MKE component

Description

k8s_POD_calico-node

The pause container for the calico-node pod.

k8s_POD_calico-kube-controllers

The pause container for the calico-kube-controllers pod.

k8s_POD_compose

The pause container for the compose pod.

k8s_POD_compose-api

The pause container for ucp-kube-compose-api.

k8s_POD_coredns

The pause container for the ucp-coredns Pod.

k8s_POD_ingress-nginx-controller

The pause container for ucp-kube-ingress-controller.

k8s_POD_gatekeeper-audit

The pause container for ucp-gatekeeper-audit.

k8s_POD_gatekeeper-controller-manager

The pause container for ucp-gatekeeper.

k8s_POD_ucp-metrics

The pause container for ucp-metrics.

k8s_POD_ucp-node-feature-discovery

The pause container for the node feature discovery labels on Kubernetes nodes.

k8s_POD_ucp-nvidia-device-partitioner

A pause container for ucp-nvidia-device-partitioner.

k8s_ucp-pause_ucp-nvidia-device-partitioner

A pause container for ucp-nvidia-device-partitioner.

Worker nodes

Worker nodes are instances of MCR that participate in a swarm for the purpose of executing containers. Such nodes receive and execute tasks dispatched from manager nodes. Because worker nodes do not participate in the Raft distributed state, perform scheduling, or serve the swarm mode HTTP API, a cluster must contain at least one manager node.

Note

Some Kubernetes components are run as Swarm services because the MKE control plane is itself a Docker Swarm cluster.

The following tables detail the MKE services that run on worker nodes.

Swarm services

MKE component

Description

ucp-hardware-info

A container for collecting host information regarding disks and hardware.

ucp-interlock-config

A service that manages Interlock configuration.

ucp-interlock-extension

A helper service that reconfigures the ucp-interlock-proxy service, based on the Swarm workloads that are running.

ucp-interlock-proxy

A service that provides load balancing and proxying for swarm workloads. Only runs when you enable layer 7 routing.

ucp-kube-proxy

The networking proxy running on every node, which enables Pods to contact Kubernetes services and other Pods through cluster IP addresses. Named ucp-kube-proxy-win in Windows systems.

ucp-kubelet

The Kubernetes node agent running on every node, which is responsible for running Kubernetes Pods, reporting the health of the node, and monitoring resource usage. Named ucp-kubelet-win in Windows systems.

ucp-pod-cleaner-win

A service that removes all the Kubernetes Pods that remain once Kubernetes components are removed from Windows nodes. Runs only on Windows nodes.

ucp-proxy

A TLS proxy that allows secure access from the local Mirantis Container Runtime to MKE components.

ucp-tigera-node-win

The Calico node agent that coordinates networking fabric for Windows nodes according to the cluster-wide Calico configuration. Runs on Windows nodes when Kubernetes is set as the orchestrator.

ucp-tigera-felix-win

A Calico component that runs on every machine that provides endpoints. Runs on Windows nodes when Kubernetes is set as the orchestrator.

ucp-worker-agent-x and ucp-worker-agent-y

A service that monitors the worker node and ensures that the correct MKE services are running. The ucp-worker-agent service ensures that only authorized users and other MKE services can run Docker commands on the node. The ucp-worker-agent-<x/y> deploys a set of containers onto worker nodes, which is a subset of the containers that ucp-manager-agent deploys onto manager nodes. This component is named ucp-worker-agent-win-<x/y> on Windows nodes.

Kubernetes components

MKE component

Description

cri-dockerd-mke

An MKE service that accounts for the removal of dockershim from Kubernetes as of version 1.24, thus enabling MKE to continue using Docker as the container runtime.

k8s_calico-node

The Calico node agent that coordinates networking fabric according to the cluster-wide Calico configuration. Part of the calico-node DaemonSet. Runs on all nodes.

k8s_firewalld-policy_calico-node

An init container for calico-node that verifies whether systems with firewalld are compatible with Calico.

k8s_install-cni_calico-node

A container that installs the Calico CNI plugin binaries and configuration on each host. Part of the calico-node DaemonSet. Runs on all nodes.

k8s_ucp-node-feature-discovery-master

A container that provides node feature discovery labels for Kubernetes nodes.

k8s_ucp-node-feature-discovery-worker

A container that provides node feature discovery labels for Kubernetes nodes.

k8s_ucp-nvidia-device-partitioner

A container that provides support for Multi Instance GPU (MIG) on NVIDIA GPUs.

k8s_ucp-secureoverlay-agent

A container that provides a per-node service that manages the encryption state of the data plane.

Kubernetes pause containers

MKE component

Description

k8s_POD_calico-node

The pause container for the Calico-node Pod. This container is hidden by default, but you can see it by running the following command:

docker ps -a

k8s_POD_ucp-node-feature-discovery

The pause container for the node feature discovery labels on Kubernetes nodes.

k8s_POD_ucp-nvidia-device-partitioner

The pause container for ucp-nvidia-device-partitioner.

k8s_ucp-pause_ucp-nvidia-device-partitioner

The pause container for ucp-nvidia-device-partitioner.

Admission controllers

Admission controllers are plugins that govern and enforce cluster usage. There are two types of admission controllers: default and custom. The tables below list the available admission controllers. For more information, see Kubernetes documentation: Using Admission Controllers.

Note

You cannot enable or disable custom admission controllers.


Default admission controllers

Name

Description

DefaultStorageClass

Adds a default storage class to PersistentVolumeClaim objects that do not request a specific storage class.

DefaultTolerationSeconds

Sets the pod default forgiveness toleration to tolerate the notready:NoExecute and unreachable:NoExecute taints based on the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds Kubernetes API server input parameters if they do not already have toleration for the node.kubernetes.io/not-ready:NoExecute or node.kubernetes.io/unreachable:NoExecute taints. The default value for both input parameters is five minutes.

LimitRanger

Ensures that incoming requests do not violate the constraints in a namespace LimitRange object.

MutatingAdmissionWebhook

Calls any mutating webhooks that match the request.

NamespaceLifecycle

Ensures that users cannot create new objects in namespaces undergoing termination and that MKE rejects requests in nonexistent namespaces. It also prevents users from deleting the reserved default, kube-system, and kube-public namespaces.

NodeRestriction

Limits the Node and Pod objects that a kubelet can modify.

PersistentVolumeLabel (deprecated)

Attaches region or zone labels automatically to PersistentVolumes as defined by the cloud provider.

PodNodeSelector

Limits which node selectors can be used within a namespace by reading a namespace annotation and a global configuration.

ResourceQuota

Observes incoming requests and ensures they do not violate any of the constraints in a namespace ResourceQuota object.

ServiceAccount

Implements automation for ServiceAccount resources.

ValidatingAdmissionWebhook

Calls any validating webhooks that match the request.


Custom admission controllers

Name

Description

UCPAuthorization

  • Annotates Docker Compose-on-Kubernetes Stack resources with the identity of the user performing the request so that the Docker Compose-on-Kubernetes resource controller can manage Stacks with correct user authorization.

  • Detects the deleted ServiceAccount resources to correctly remove them from the scheduling authorization backend of an MKE node.

  • Simplifies creation of the RoleBindings and ClusterRoleBindings resources by automatically converting user, organization, and team Subject names into their corresponding unique identifiers.

  • Prevents users from deleting the built-in cluster-admin ClusterRole or ClusterRoleBinding resources.

  • Prevents under-privileged users from creating or updating PersistentVolume resources with host paths.

  • Works in conjunction with the built-in PodSecurityPolicies admission controller to prevent under-privileged users from creating Pods with privileged options. To grant non-administrators and non-cluster-admins access to privileged attributes, refer to Use admission controllers for access in the MKE Operations Guide.

CheckImageSigning

Enforces the MKE Docker Content Trust policy which, if enabled, requires that all pods use container images that have been digitally signed by trusted and authorized users, who are members of one or more teams in MKE.

UCPNodeSelector

Adds a com.docker.ucp.orchestrator.kubernetes:* toleration to pods in the kube-system namespace and removes the com.docker.ucp.orchestrator.kubernetes tolerations from pods in other namespaces. This ensures that user workloads do not run on swarm-only nodes, which MKE taints with com.docker.ucp.orchestrator.kubernetes:NoExecute. It also adds a node affinity to prevent pods from running on manager nodes depending on MKE settings.

Pause containers

Every Kubernetes Pod includes an empty pause container, which bootstraps the Pod to establish all of the cgroups, reservations, and namespaces before its individual containers are created. The pause container image is always present, so the pod resource allocation happens instantaneously as containers are created.


To display pause containers:

When using the client bundle, pause containers are hidden by default.

  • To display pause containers when using the client bundle:

    docker ps -a | grep -I pause
    
  • To display pause containers when not using the client bundle:

    1. Log in to a manager or worker node.

    2. Display pause containers:

      docker ps | grep -I pause
      

    Example output on a manager node:

    5aeeafb80e8f   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_calico-kube-controllers-86565cb444-rwlrd_kube-system_fdd491cc-94e4-4510-a080-396454f2798c_0
    ea4a1263398d   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-node-feature-discovery-59btp_node-feature-discovery_ef7a6f29-e3d4-4430-9c75-22940208f616_0
    951f6622f8de   d50ea4c05222               "/pause"   2 hours ago   Up 2 hours   k8s_ucp-pause_ucp-nvidia-device-partitioner-77qq5_kube-system_59d95409-721e-48f3-9524-97f1d30e63a4_0
    f99ab238282e   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-nvidia-device-partitioner-77qq5_kube-system_59d95409-721e-48f3-9524-97f1d30e63a4_0
    eec3d297e7a2   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-metrics-6sf2z_kube-system_de4f67d3-99cc-4d00-a4f1-ccad66c31ebc_0
    5a40fdc669b1   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_compose-api-cb58448cc-xfb5g_kube-system_d1c8c8d2-9b81-475f-9cd4-9f486d3ace97_0
    8e5897a13cd6   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_coredns-9d5479b97-gmwct_kube-system_9c89d798-ff47-4194-b5d3-1ba3368698bc_0
    d308274689a4   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_coredns-9d5479b97-sxnb2_kube-system_74a70909-771c-4dce-9518-d49129e3645c_0
    c45bf83d032a   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_compose-69d4dc8c69-f56ql_kube-system_64646ec3-f9e8-4cce-aeb1-37636e1858ce_0
    c32ea1407b28   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_calico-node-j9fmw_kube-system_0939bae3-0659-4608-8547-0b4095d99cc5_0
    

    Example output on a worker node:

    c5e836c38435   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-node-feature-discovery-wztkl_node-feature-discovery_efe87dc1-349e-47f2-a98f-67d4675f6d9b_0
    0f66550f654e   d50ea4c05222               "/pause"   2 hours ago   Up 2 hours   k8s_ucp-pause_ucp-nvidia-device-partitioner-bq5th_kube-system_873f6045-8f61-4a55-9d00-d55b27e8f2c9_0
    753efca985ef   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_calico-node-xx28v_kube-system_6c024ae0-8d27-4d89-a327-8ca635b37f79_0
    7f2eda992ea6   mirantis/ucp-pause:3.6.0   "/pause"   2 hours ago   Up 2 hours   k8s_POD_ucp-nvidia-device-partitioner-bq5th_kube-system_873f6045-8f61-4a55-9d00-d55b27e8f2c9_0
    

See also

Kubernetes Pods

Volumes

MKE uses named volumes to persist data on all nodes on which it runs.

Volumes used by MKE manager nodes

Volume name

Contents

ucp-auth-api-certs

Certificate and keys for the authentication and authorization service.

ucp-auth-store-certs

Certificate and keys for the authentication and authorization store.

ucp-auth-store-data

Data of the authentication and authorization store, replicated across managers.

ucp-auth-worker-certs

Certificate and keys for authentication worker.

ucp-auth-worker-data

Data of the authentication worker.

ucp-client-root-ca

Root key material for the MKE root CA that issues client certificates.

ucp-cluster-root-ca

Root key material for the MKE root CA that issues certificates for swarm members.

ucp-controller-client-certs

Certificate and keys that the MKE web server uses to communicate with other MKE components.

ucp-controller-server-certs

Certificate and keys for the MKE web server running in the node.

ucp-kv

MKE configuration data, replicated across managers.

ucp-kv-certs

Certificates and keys for the key-value store.

ucp-metrics-data

Monitoring data that MKE gathers.

ucp-metrics-inventory

Configuration file that the ucp-metrics service uses.

ucp-node-certs

Certificate and keys for node communication.

ucp-backup

Backup artifacts that are created while processing a backup. The artifacts persist on the volume for the duration of the backup and are cleaned up when the backup completes, though the volume itself remains.

mke-containers

Symlinks to MKE component log files, created by ucp-agent.

ucp-kube-apiserver-audit

Audit logs streamed by kube-apiserver container.

Volumes used by MKE worker nodes

Volume name

Contents

ucp-node-certs

Certificate and keys for node communication.

mke-containers

Symlinks to MKE component log files, created by ucp-agent.

You can customize the volume driver for the volumes by creating the volumes prior to installing MKE. During installation, MKE determines which volumes do not yet exist on the node and creates those volumes using the default volume driver.

By default, MKE stores the data for these volumes at /var/lib/docker/volumes/<volume-name>/_data.
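
To use a specific volume driver, create the volume before installing MKE. A minimal sketch that pre-creates one of the volumes listed above with the default local driver (substitute the driver and any driver options that your environment requires):

docker volume create --driver local ucp-node-certs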

Configuration

The table below presents the configuration files in use by MKE:

Configuration files in use by MKE

Configuration file name

Description

com.docker.interlock.extension

Configuration of the Interlock extension service that monitors and configures the proxy service

com.docker.interlock.proxy

Configuration of the service that handles and routes user requests

com.docker.license

MKE license

com.docker.ucp.interlock.conf

Configuration of the core Interlock service

Web UI and CLI

You can interact with MKE either through the web UI or the CLI.

With the MKE web UI you can manage your swarm, grant and revoke user permissions, deploy, configure, manage, and monitor your applications.

In addition, MKE exposes the standard Docker API, so you can continue using existing tools such as the Docker CLI client. Because MKE secures your cluster with RBAC, you must configure your Docker CLI client and other client tools to authenticate your requests using client certificates, which you can download from your MKE profile page.
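
For example, a typical client bundle workflow on Linux might look similar to the following, assuming you have already downloaded a client bundle from your MKE profile page (the archive name is illustrative):

# Extract the bundle and load the environment variables that point the
# Docker CLI and kubectl at MKE using the bundle certificates.
unzip ucp-bundle-admin.zip -d ucp-bundle
cd ucp-bundle
eval "$(<env.sh)"
# Both clients now authenticate against the cluster.
docker ps
kubectl get nodes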

Role-based access control

MKE allows administrators to authorize users to view, edit, and use cluster resources by granting role-based permissions for specific resource sets.

To authorize access to cluster resources across your organization, MKE administrators typically take the following high-level actions:

  • Add and configure subjects (users, teams, organizations, and service accounts).

  • Define custom roles (or use defaults) by adding permitted operations per resource type.

  • Group cluster resources into resource sets of Swarm collections or Kubernetes namespaces.

  • Create grants by combining subject, role, and resource set.

Note

Only administrators can manage Role-based access control (RBAC).

The following table describes the core elements used in RBAC:

Element

Description

Subjects

Subjects are granted roles that define the permitted operations for one or more resource sets. Subject types include:

User

A person authenticated by the authentication backend. Users can belong to more than one team and more than one organization.

Team

A group of users that share permissions defined at the team level. A team can be in only one organization.

Organization

A group of teams that share a specific set of permissions, defined by the roles of the organization.

Service account

A Kubernetes object that enables a workload to access cluster resources assigned to a namespace.

Roles

Roles define what operations can be done by whom. A role is a set of permitted operations for a type of resource, such as a container or volume. It is assigned to a user or a team with a grant.

For example, the built-in Restricted Control role includes permissions to view and schedule nodes but not to update them, whereas a custom role might include permissions to read, write, and execute (r-w-x) volumes and secrets.

Most organizations use multiple roles to fine-tune the appropriate access for different subjects. Users and teams may have different roles for the different resources they access.

Resource sets

Users can group resources into two types of resource sets to control user access: Docker Swarm collections and Kubernetes namespaces.

Docker Swarm collections

Collections have a directory-like structure that holds Swarm resources. You can create collections in MKE by defining a directory path and moving resources into it. Alternatively, you can use labels in your YAML file to assign application resources to the path. Resource types that users can access in Swarm collections include containers, networks, nodes, services, secrets, and volumes.

Each Swarm resource can be in only one collection at a time, but collections can be nested inside one another to a maximum depth of two layers. Collection permission includes permission for child collections.

For child collections and for users who belong to more than one team, the system combines the permissions from multiple roles into an effective role for the user, which specifies the operations that are allowed for the target.

Kubernetes namespaces

Namespaces are virtual clusters that allow multiple teams to access a given cluster with different permissions. Kubernetes automatically sets up four namespaces, and users can add more as necessary, though unlike Swarm collections they cannot be nested. Resource types that users can access in Kubernetes namespaces include pods, deployments, network policies, nodes, services, secrets, and more.

Grants

Grants consist of a subject, role, and resource set, and define how specific users can access specific resources. All the grants of an organization taken together constitute an access control list (ACL), which is a comprehensive access policy for the organization.

For complete information on how to configure and use role-based access control in MKE, refer to Authorize role-based access.

MKE limitations

See also

Kubernetes

Installation Guide

The MKE Installation Guide provides everything you need to install and configure Mirantis Kubernetes Engine (MKE). The guide offers detailed information, procedures, and examples that are specifically designed to help DevOps engineers and administrators install and configure the MKE container orchestration platform.

Plan the deployment

Default install directories

The following table details the default MKE install directories:

Path

Description

/var/lib/docker

Docker data root directory

/var/lib/kubelet

kubelet data root directory (created with ftype = 1)

/var/lib/containerd

containerd data root directory (created with ftype = 1)

Host name strategy

Before installing MKE, plan a single host name strategy to use consistently throughout the cluster, keeping in mind that MKE and MCR both use host names.

There are two general strategies for creating host names: short host names and fully qualified domain names (FQDN). Consider the following examples:

  • Short host name: engine01

  • Fully qualified domain name: node01.company.example.com

MCR considerations

A number of MCR considerations must be taken into account when deploying any MKE cluster.

default-address-pools

MCR uses three separate IP ranges for the docker0, docker_gwbridge, and ucp-bridge interfaces. By default, MCR assigns the first available subnet in default-address-pools (172.17.0.0/16) to docker0, the second (172.18.0.0/16) to docker_gwbridge, and the third (172.19.0.0/16) to ucp-bridge.

Note

The ucp-bridge bridge network specifically supports MKE component containers.

You can reassign the docker0, docker_gwbridge, and ucp-bridge subnets in default-address-pools. To do so, replace the relevant values in default-address-pools in the /etc/docker/daemon.json file, making sure that the setting includes at least three IP pools. Be aware that you must restart the docker.service to activate your daemon.json file edits.

By default, default-address-pools contains the following values:

{
  "default-address-pools": [
   {"base":"172.17.0.0/16","size":16}, <-- docker0
   {"base":"172.18.0.0/16","size":16}, <-- docker_gwbridge
   {"base":"172.19.0.0/16","size":16}, <-- ucp-bridge
   {"base":"172.20.0.0/16","size":16},
   {"base":"172.21.0.0/16","size":16},
   {"base":"172.22.0.0/16","size":16},
   {"base":"172.23.0.0/16","size":16},
   {"base":"172.24.0.0/16","size":16},
   {"base":"172.25.0.0/16","size":16},
   {"base":"172.26.0.0/16","size":16},
   {"base":"172.27.0.0/16","size":16},
   {"base":"172.28.0.0/16","size":16},
   {"base":"172.29.0.0/16","size":16},
   {"base":"172.30.0.0/16","size":16},
   {"base":"192.168.0.0/16","size":20}
   ]
 }
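
After editing default-address-pools in /etc/docker/daemon.json, restart MCR to apply the change. A minimal sketch for systemd-based hosts (the inspect command simply confirms the subnet that docker0 received):

sudo systemctl restart docker.service
docker network inspect bridge --format '{{(index .IPAM.Config 0).Subnet}}'
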
The default-address-pools parameters

Parameter

Description

default-address-pools

The list of CIDR ranges used to allocate subnets for local bridge networks.

base

The CIDR range allocated for bridge networks in each IP address pool.

size

The CIDR netmask that determines the subnet size to allocate from the base pool. If the size matches the netmask of the base, then the pool contains one subnet. For example, {"base":"172.17.0.0/16","size":16} creates the subnet: 172.17.0.0/16 (172.17.0.1 - 172.17.255.255).

For example, {"base":"192.168.0.0/16","size":20} allocates /20 subnets from 192.168.0.0/16, including the following subnets for bridge networks:

192.168.0.0/20 (192.168.0.1 - 192.168.15.255)

192.168.16.0/20 (192.168.16.1 - 192.168.31.255)

192.168.32.0/20 (192.168.32.1 - 192.168.47.255)

192.168.48.0/20 (192.168.48.1 - 192.168.63.255)

192.168.64.0/20 (192.168.64.1 - 192.168.79.255)

...

192.168.240.0/20 (192.168.240.1 - 192.168.255.255)

docker0

MCR creates and configures the host system with the docker0 virtual network interface, an Ethernet bridge through which all traffic between MCR and containers moves. MCR uses docker0 to handle all container routing. You can specify an alternative network interface when you start a container.

MCR allocates IP addresses from the docker0 configurable IP range to the containers that connect to docker0. The default IP range, or subnet, for docker0 is 172.17.0.0/16.

You can change the docker0 subnet in /etc/docker/daemon.json using the settings in the following table. Be aware that you must restart the docker.service to activate your daemon.json file edits.

Parameter

Description

default-address-pools

Modify the first pool in default-address-pools.

Caution

By default, MCR assigns the second pool to docker_gwbridge. If you modify the first pool such that the size does not match the base netmask, it can affect docker_gwbridge.

{
   "default-address-pools": [
         {"base":"172.17.0.0/16","size":16}, <-- Modify this value
         {"base":"172.18.0.0/16","size":16},
         {"base":"172.19.0.0/16","size":16},
         {"base":"172.20.0.0/16","size":16},
         {"base":"172.21.0.0/16","size":16},
         {"base":"172.22.0.0/16","size":16},
         {"base":"172.23.0.0/16","size":16},
         {"base":"172.24.0.0/16","size":16},
         {"base":"172.25.0.0/16","size":16},
         {"base":"172.26.0.0/16","size":16},
         {"base":"172.27.0.0/16","size":16},
         {"base":"172.28.0.0/16","size":16},
         {"base":"172.29.0.0/16","size":16},
         {"base":"172.30.0.0/16","size":16},
         {"base":"192.168.0.0/16","size":20}
   ]
}

fixed-cidr

Configures a CIDR range.

Customize the subnet for docker0 using standard CIDR notation. The default subnet is 172.17.0.0/16, the network gateway is 172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254 for your containers.

{
  "fixed-cidr": "172.17.0.0/16"
}

bip

Configures a gateway IP address and CIDR netmask of the docker0 network.

Customize the subnet for docker0 using the <gateway IP>/<CIDR netmask> notation. The default subnet is 172.17.0.0/16, the network gateway is 172.17.0.1, and MCR allocates IPs 172.17.0.2 - 172.17.255.254 for your containers.

{
  "bip": "172.17.0.1/16"
}

docker_gwbridge

The docker_gwbridge is a virtual network interface that connects overlay networks (including ingress) to individual MCR container networks. Initializing a Docker swarm or joining a Docker host to a swarm automatically creates docker_gwbridge in the kernel of the Docker host. The default docker_gwbridge subnet (172.18.0.0/16) is the second available subnet in default-address-pools.

To change the docker_gwbridge subnet, open daemon.json and modify the second pool in default-address-pools:

{
    "default-address-pools": [
       {"base":"172.17.0.0/16","size":16},
       {"base":"172.18.0.0/16","size":16}, <-- Modify this value
       {"base":"172.19.0.0/16","size":16},
       {"base":"172.20.0.0/16","size":16},
       {"base":"172.21.0.0/16","size":16},
       {"base":"172.22.0.0/16","size":16},
       {"base":"172.23.0.0/16","size":16},
       {"base":"172.24.0.0/16","size":16},
       {"base":"172.25.0.0/16","size":16},
       {"base":"172.26.0.0/16","size":16},
       {"base":"172.27.0.0/16","size":16},
       {"base":"172.28.0.0/16","size":16},
       {"base":"172.29.0.0/16","size":16},
       {"base":"172.30.0.0/16","size":16},
       {"base":"192.168.0.0/16","size":20}
   ]
}

Caution

  • Modifying the first pool to customize the docker0 subnet can affect the default docker_gwbridge subnet. Refer to docker0 for more information.

  • You can only customize the docker_gwbridge settings before you join the host to the swarm or after temporarily removing it.

Docker swarm

The default address pool that Docker Swarm uses for its overlay network is 10.0.0.0/8. If this pool conflicts with your current network implementation, you must use a custom IP address pool. Prior to installing MKE, specify your custom address pool using the --default-addr-pool option when initializing swarm.
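
For example, to initialize a swarm with a custom overlay address pool before installing MKE (the pool and mask-length values are illustrative):

docker swarm init \
  --default-addr-pool 10.10.0.0/16 \
  --default-addr-pool-mask-length 24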

Note

The Swarm default-addr-pool and MCR default-address-pools settings define two separate IP address ranges used for different purposes.

A node.Status.Addr of 0.0.0.0 can cause unexpected problems. To prevent any such issues, add the --advertise-addr flag to the docker swarm join command.
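
For example, when joining a node, pass its routable IP address explicitly (the addresses match the example below; the token is a placeholder):

docker swarm join \
  --advertise-addr 10.200.200.13 \
  --token <join-token> \
  10.200.200.10:2377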

To resolve the 0.0.0.0 situation, initiate the following workaround:

  1. Stop the docker daemon that has .Status.Addr 0.0.0.0.

  2. In the /var/lib/docker/swarm/docker-state.json file, apply the correct node IP to AdvertiseAddr and LocalAddr.

  3. Start the docker daemon.

For example, change:

{"LocalAddr":"","RemoteAddr":"10.200.200.10:2377","ListenAddr":"0.0.0.0:2377","AdvertiseAddr":"","DataPathAddr":"","DefaultAddressPool":null,"SubnetSize":0,"DataPathPort":0,"JoinInProgress":false,"FIPS":false}

to:

{"LocalAddr":"10.200.200.13","RemoteAddr":"","ListenAddr":"0.0.0.0:2377","AdvertiseAddr":"10.200.200.13:2377","DataPathAddr":"","DefaultAddressPool":null,"SubnetSize":0,"DataPathPort":0,"JoinInProgress":false,"FIPS":false}

Kubernetes

Kubernetes uses two internal IP ranges, either of which can overlap and conflict with the underlying infrastructure. In the event of such a conflict, you must configure custom IP ranges.

The pod network

The Calico or Azure IPAM service assigns each Kubernetes pod an IP address from the default 192.168.0.0/16 range. To customize this range, use the --pod-cidr flag with the ucp install command during MKE installation.

The services network

You can access Kubernetes services with a VIP in the default 10.96.0.0/16 Cluster IP range. To customize this range, during MKE installation, use the --service-cluster-ip-range flag with the ucp install command.
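
For example, an install command that customizes both Kubernetes ranges might look similar to the following sketch (the image tag and CIDR values are illustrative; adjust them to your MKE version and environment):

docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.0 install \
  --pod-cidr 10.80.0.0/16 \
  --service-cluster-ip-range 10.200.0.0/24 \
  --interactive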


docker data-root

Docker data root (data-root in /etc/docker/daemon.json) is the storage path for persisted data such as images, volumes, and cluster state.

MKE clusters require that all nodes have the same docker data-root for the Kubernetes network to function correctly. In addition, if you change the data-root on all nodes, you must recreate the Kubernetes network configuration in MKE by running the following commands:

kubectl -n kube-system delete configmap/calico-config
kubectl -n kube-system delete ds/calico-node deploy/calico-kube-controllers

See also

Kubernetes

no-new-privileges

The no-new-privileges setting prevents the container application processes from gaining new privileges during the execution process.

For most Linux distributions, MKE supports setting no-new-privileges to true in the /etc/docker/daemon.json file. The parameter is not, however, supported on RHEL 7.9, CentOS 7.9, Oracle Linux 7.8, and Oracle Linux 7.9.

This option is not supported on Windows. It is a Linux kernel feature.
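
For example, on a supported Linux distribution you can enable the setting in /etc/docker/daemon.json and then restart the docker.service (a minimal sketch; merge the key into your existing configuration rather than replacing the file):

{
  "no-new-privileges": true
}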

Device Mapper storage driver

MCR hosts that run the devicemapper storage driver use the loop-lvm configuration mode by default. This mode uses sparse files to build the thin pool used by image and container snapshots and is designed to work without any additional configuration.

Note

Mirantis recommends that you use direct-lvm mode in production environments in lieu of loop-lvm mode. direct-lvm mode is more efficient in its use of system resources than loop-lvm mode, and you can scale it as necessary.

For information on how to configure direct-lvm mode, refer to the Docker documentation, Use the Device Mapper storage driver.

Memory metrics reporting

To report accurate memory metrics, MCR requires that you enable specific kernel settings that are often disabled on Ubuntu and Debian systems. For detailed instructions on how to do this, refer to the Docker documentation, Your kernel does not support cgroup swap limit capabilities.

Perform pre-deployment configuration

Configure networking

A well-configured network is essential for the proper functioning of your MKE deployment. Pay particular attention to such key factors as IP address provisioning, port management, and traffic enablement.

IP considerations

Before installing MKE, adopt the following practices when assigning IP addresses:

  • Ensure that your network and nodes support using a static IPv4 address and assign one to every node.

  • Avoid IP range conflicts. The following are the recommended ranges for each component:

    • MCR default-address-pools (CIDR range for interface and bridge networks): 172.17.0.0/16 - 172.30.0.0/16, 192.168.0.0/16

    • Swarm default-addr-pool (CIDR range for Swarm overlay networks): 10.0.0.0/8

    • Kubernetes pod-cidr (CIDR range for Kubernetes pods): 192.168.0.0/16

    • Kubernetes service-cluster-ip-range (CIDR range for Kubernetes services): 10.96.0.0/16 (minimum: 10.96.0.0/24)

See also

Kubernetes

Open ports to incoming traffic

When installing MKE on a host, you need to open specific ports to incoming traffic. Each port listens for incoming traffic from a particular set of hosts, known as the port scope.

MKE uses the following scopes:

Scope

Description

External

Traffic arrives from outside the cluster through end-user interaction.

Internal

Traffic arrives from other hosts in the same cluster.

Self

Traffic arrives at Self ports only from processes on the same host; these ports do not need to be open to external traffic.


Open the following ports for incoming traffic on each host type:

Hosts              Port                     Scope               Purpose

Managers, workers  TCP 179                  Internal            BGP peers, used for Kubernetes networking
Managers           TCP 443 (configurable)   External, internal  MKE web UI and API
Managers           TCP 2376 (configurable)  Internal            Docker swarm manager, used for backwards compatibility
Managers           TCP 2377 (configurable)  Internal            Control communication between swarm nodes
Managers, workers  UDP 4789                 Internal            Overlay networking
Managers           TCP 6443 (configurable)  External, internal  Kubernetes API server endpoint
Managers, workers  TCP 6444                 Self                Kubernetes API reverse proxy
Managers, workers  TCP, UDP 7946            Internal            Gossip-based clustering
Managers           TCP 9055                 Internal            ucp-rethinkdb-exporter metrics
Managers, workers  TCP 9091                 Internal            Felix Prometheus calico-node metrics
Managers           TCP 9094                 Self                Felix Prometheus kube-controller metrics
Managers, workers  TCP 9099                 Self                Calico health check
Managers, workers  TCP 9100                 Internal            ucp-node-exporter metrics
Managers, workers  TCP 10248                Self                Kubelet health check
Managers, workers  TCP 10250                Internal            Kubelet
Managers, workers  TCP 12376                Internal            TLS authentication proxy that provides access to MCR
Managers, workers  TCP 12378                Self                etcd reverse proxy
Managers           TCP 12379                Internal            etcd Control API
Managers           TCP 12380                Internal            etcd Peer API
Managers           TCP 12381                Internal            MKE cluster certificate authority
Managers           TCP 12382                Internal            MKE client certificate authority
Managers           TCP 12383                Internal            Authentication storage backend
Managers           TCP 12384                Internal            Authentication storage backend for replication across managers
Managers           TCP 12385                Internal            Authentication service API
Managers           TCP 12386                Internal            Authentication worker
Managers           TCP 12387                Internal            Prometheus server
Managers           TCP 12388                Internal            Kubernetes API server
Managers, workers  TCP 12389                Self                Hardware Discovery API
Managers           TCP 12391                Internal            ucp-kube-controller-manager metrics
Managers           TCP 12392                Internal            MKE etcd certificate authority


Cluster and service networking options

MKE supports the following cluster and service networking options:

  • Kube-proxy with iptables proxier, and either the managed CNI or an unmanaged alternative

  • Kube-proxy with ipvs proxier, and either the managed CNI or an unmanaged alternative

  • eBPF mode with either the managed CNI or an unmanaged alternative

You can configure cluster and service networking options at install time or in existing clusters. For detail on reconfiguring existing clusters, refer to Configure cluster and service networking in an existing cluster in the MKE Operations Guide.

Caution

Swarm workloads that require the use of encrypted overlay networks must use iptables proxier with either the managed CNI or an unmanaged alternative. Be aware that the other networking options detailed here automatically disable Docker Swarm encrypted overlay networks.

Mirantis partner integrations

  • Calico Open Source
    Develop and maintain: Community
    Test and integrate with MKE: Mirantis
    First line support: Mirantis
    Product support: Tigera for Linux, Mirantis for Windows

  • Calico Enterprise
    Develop and maintain: Tigera
    Test and integrate with MKE: Tigera, for every major MKE release
    First line support: Mirantis
    Product support: Tigera, with customers paying for additional features

  • Cilium Open Source
    Develop and maintain: Community
    Test and integrate with MKE: Planned
    First line support: Mirantis
    Product support: Community or Isovalent

  • Cilium Enterprise
    Develop and maintain: Isovalent
    Test and integrate with MKE: Isovalent
    First line support: Mirantis
    Product support: Isovalent

To enable kube-proxy with iptables proxier while using the managed CNI:

Using the default option, kube-proxy with iptables proxier, is equivalent to specifying --kube-proxy-mode=iptables at install time. To verify that the option is operational, confirm the presence of the following line in the ucp-kube-proxy container logs:

I1027 05:35:27.798469        1 server_others.go:212] Using iptables Proxier.
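
One way to check, assuming the default MKE container naming, is to search the ucp-kube-proxy container logs on a node:

docker logs ucp-kube-proxy 2>&1 | grep -i proxier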

To enable kube-proxy with ipvs proxier while using the managed CNI:

  1. Prior to MKE installation, verify that the following kernel modules are available on all Linux manager and worker nodes (a module check sketch follows this procedure):

    • ipvs

    • ip_vs_rr

    • ip_vs_wrr

    • ip_vs_sh

    • nf_conntrack_ipv4

  2. Specify --kube-proxy-mode=ipvs at install time.

  3. Optional. Once installation is complete, configure the following ipvs-related parameters in the MKE configuration file (otherwise, MKE will use the Kubernetes default parameter settings):

    • ipvs_exclude_cidrs = ""

    • ipvs_min_sync_period = ""

    • ipvs_scheduler = ""

    • ipvs_strict_arp = false

    • ipvs_sync_period = ""

    • ipvs_tcp_timeout = ""

    • ipvs_tcpfin_timeout = ""

    • ipvs_udp_timeout = ""

    For more information on using these parameters, refer to kube-proxy in the Kubernetes documentation.

    Note

    The ipvs-related parameters have no install time counterparts and therefore must only be configured once MKE installation is complete.

  4. Verify that kube-proxy with ipvs proxier is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:

    I1027 05:14:50.868486     1 server_others.go:274] Using ipvs Proxier.
    W1027 05:14:50.868822     1 proxier.go:445] IPVS scheduler not specified, use rr by default
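
The module check referenced in step 1 can be a simple dry-run modprobe loop, as in the following sketch (on some kernels the core module is named ip_vs rather than ipvs, and nf_conntrack_ipv4 may be merged into nf_conntrack):

# Dry-run load of each module; a non-zero exit code means it is unavailable.
for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
  sudo modprobe -n -v "$mod" || echo "module not available: $mod"
done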
    

To enable eBPF mode while using the managed CNI:

  1. Verify that the prerequisites for eBPF use have been met, including kernel compatibility, for all Linux manager and worker nodes. Refer to the Calico documentation Enable the eBPF dataplane for more information.

  2. Specify --calico-ebpf-enabled at install time.

  3. Verify that eBPF mode is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:

    KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED true
    "Sleeping forever...."
    

To enable kube-proxy with iptables proxier while using an unmanaged CNI:

  1. Specify --unmanaged-cni at install time.

  2. Verify that kube-proxy with iptables proxier is operational by confirming the presence of the following line in the ucp-kube-proxy container logs:

    I1027 05:35:27.798469     1 server_others.go:212] Using iptables Proxier.
    

To enable kube-proxy with ipvs proxier while using an unmanaged CNI:

  1. Specify the following parameters at install time:

    • --unmanaged-cni

    • --kube-proxy-mode=ipvs

  2. Verify that kube-proxy with ipvs proxier is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:

    I1027 05:14:50.868486     1 server_others.go:274] Using ipvs Proxier.
    W1027 05:14:50.868822     1 proxier.go:445] IPVS scheduler not specified, use rr by default
    

To enable eBPF mode while using an unmanaged CNI:

  1. Verify that the prerequisites for eBPF use have been met, including kernel compatibility, for all Linux manager and worker nodes. Refer to the Calico documentation Enable the eBPF dataplane for more information.

  2. Specify the following parameters at install time:

    • --unmanaged-cni

    • --kube-proxy-mode=disabled

    • --kube-default-drop-masq-bits

  3. Verify that eBPF mode is operational by confirming the presence of the following lines in ucp-kube-proxy container logs:

    KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED true
    "Sleeping forever...."
    
Calico networking

Calico is the default networking plugin for MKE. The default Calico encapsulation setting for MKE is VXLAN; however, the plugin also supports IP-in-IP encapsulation. Refer to the Calico documentation on Overlay networking for more information.

Important

NetworkManager can impair the Calico agent routing function. To resolve this issue, you must create a file called /etc/NetworkManager/conf.d/calico.conf with the following content:

[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:wireguard.cali
Multus CNI installation and enablement

Available since MKE 3.7.0

You can enable Multus CNI in the MKE cluster when you install MKE, using the --multus-cni flag with the MKE install CLI command.

Multus CNI acts as a meta plugin, enabling the attachment of multiple network interfaces to multi-homed Pods. Refer to Multus CNI on GitHub for more information.
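
For example, an install command that enables Multus might look similar to the following sketch (the image tag is illustrative):

docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.0 install --multus-cni --interactive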

Enable ESP traffic

For overlay networks with encryption to function, you must allow IP protocol 50 Encapsulating Security Payload (ESP) traffic.

If you are running RHEL 8.x, Rocky Linux 8.x, or CentOS 8, install kernel module xt_u32:

sudo dnf install kernel-modules-extra
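
How you allow ESP traffic depends on your firewall tooling. As an illustration only, with plain iptables you could accept the protocol as follows:

sudo iptables -A INPUT -p esp -j ACCEPT
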
Avoid firewall conflicts

Avoid firewall conflicts in the following Linux distributions:

Linux distribution

Procedure

SUSE Linux Enterprise Server 12 SP2

Installations have the FW_LO_NOTRACK flag turned on by default in the openSUSE firewall. It speeds up packet processing on the loopback interface but breaks certain firewall setups that redirect outgoing packets via custom rules on the local machine.

To turn off the FW_LO_NOTRACK option:

  1. In /etc/sysconfig/SuSEfirewall2, set FW_LO_NOTRACK="no".

  2. Either restart the firewall or reboot the system.

SUSE Linux Enterprise Server 12 SP3

No change is required, as installations have the FW_LO_NOTRACK flag turned off by default.

DNS entry in hosts file

MKE adds the proxy.local DNS entry to the following files at install time:

  • Linux: /etc/hosts

  • Windows: c:\Windows\System32\Drivers\etc\hosts


If you configure MCR to connect to the Internet through HTTP_PROXY, you must also include proxy.local in the NO_PROXY value.
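
For example, on systemd-based hosts the MCR proxy settings are commonly supplied through a service drop-in file; a sketch with illustrative values (reload systemd and restart the docker.service after adding it):

# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=proxy.local,localhost,127.0.0.1"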

Preconfigure an SLES installation

Before performing SUSE Linux Enterprise Server (SLES) installations, consider the following prerequisite steps:

  • For SLES 15 installations, disable CLOUD_NETCONFIG_MANAGE prior to installing MKE:

    1. Set CLOUD_NETCONFIG_MANAGE="no" in the /etc/sysconfig/network/ifcfg-eth0 network interface configuration file.

    2. Run the service network restart command.

  • By default, SLES disables connection tracking. To allow Kubernetes controllers in Calico to reach the Kubernetes API server, enable connection tracking on the loopback interface for SLES by running the following commands for each node in the cluster:

    sudo mkdir -p /etc/sysconfig/SuSEfirewall2.d/defaults
    echo FW_LO_NOTRACK=no | sudo tee \
    /etc/sysconfig/SuSEfirewall2.d/defaults/99-docker.cfg
    sudo SuSEfirewall2 start
    


Verify the timeout settings

Confirm that MKE components have the time they require to effectively communicate.

Default timeout settings

Component                                Timeout (ms)   Configurable

Raft consensus between manager nodes     3000           no
Gossip protocol for overlay networking   5000           no
etcd                                     500            yes
RethinkDB                                10000          no
Stand-alone cluster                      90000          no

Network lag of more than two seconds between MKE manager nodes can cause problems in your MKE cluster. For example, such a lag can indicate to MKE components that the other nodes are down, resulting in unnecessary leadership elections that will result in temporary outages and reduced performance. To resolve the issue, decrease the latency of the MKE node communication network.


Configure time synchronization

Configure all containers in an MKE cluster to regularly synchronize with a Network Time Protocol (NTP) server, to ensure consistency between all containers in the cluster and to circumvent unexpected behavior that can lead to poor performance.

  1. Install NTP on every machine in your cluster:

    Ubuntu or Debian:

    sudo apt-get update && sudo apt-get install ntp ntpdate

    RHEL or CentOS:

    sudo yum install ntp ntpdate
    sudo systemctl start ntpd
    sudo systemctl enable ntpd
    sudo systemctl status ntpd
    sudo ntpdate -u -s 0.centos.pool.ntp.org
    sudo systemctl restart ntpd

    SLES:

    sudo zypper ref && sudo zypper install ntp
    

    In addition to installing NTP, the command sequence starts ntpd, a daemon that periodically syncs the machine clock to a central server.

  2. Sync the machine clocks:

    sudo ntpdate pool.ntp.org
    
  3. Verify that the time of each machine is in sync with the NTP servers:

    sudo ntpq -p
    

    Example output, which illustrates how much the machine clock is out of sync with the NTP servers:

         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     45.35.50.61     139.78.97.128    2 u   24   64    1   60.391  4623378   0.004
     time-a.timefreq .ACTS.           1 u   23   64    1   51.849  4623377   0.004
     helium.constant 128.59.0.245     2 u   22   64    1   71.946  4623379   0.004
     tock.usshc.com  .GPS.            1 u   21   64    1   59.576  4623379   0.004
     golem.canonical 17.253.34.253    2 u   20   64    1  145.356  4623378   0.004
    

Configure a load balancer

Though MKE does not include a load balancer, you can configure your own to balance user requests across all manager nodes. Before that, decide whether you will add nodes to the load balancer using their IP address or their fully qualified domain name (FQDN), and then use that strategy consistently throughout the cluster. Take note of all IP addresses or FQDNs before you start the installation.

If you plan to deploy both MKE and MSR, your load balancer must be able to differentiate between the two: either by IP address or port number. Because both MKE and MSR use port 443 by default, your options are as follows:

  • Configure your load balancer to expose either MKE or MSR on a port other than 443.

  • Configure your load balancer to listen on port 443 with separate virtual IP addresses for MKE and MSR.

  • Configure separate load balancers for MKE and MSR, both listening on port 443.

If you want to install MKE in a high-availability configuration with a load balancer in front of your MKE controllers, include the appropriate IP address and FQDN for the load balancer VIP. To do so, use one or more --san flags either with the ucp install command or in interactive mode when MKE requests additional SANs.
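
For illustration, a minimal HAProxy configuration that passes TCP traffic on port 443 through to three manager nodes might look like the following; the node addresses are placeholders and TLS remains terminated by MKE:

frontend mke_frontend
    bind *:443
    mode tcp
    default_backend mke_managers

backend mke_managers
    mode tcp
    balance roundrobin
    server mke-manager-1 172.16.0.11:443 check
    server mke-manager-2 172.16.0.12:443 check
    server mke-manager-3 172.16.0.13:443 check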

Configure IPVS

MKE supports the setting of values for all IPVS-related parameters that are exposed by kube-proxy.

Kube-proxy runs on each cluster node, its role being to load-balance traffic whose destination is services (via cluster IPs and node ports) to the correct backend pods. Of the modes in which kube-proxy can run, IPVS (IP Virtual Server) offers the widest choice of load balancing algorithms and superior scalability.

Refer to the Calico documentation, Comparing kube-proxy modes: iptables or IPVS? for detailed information on IPVS.

Caution

You can only enable IPVS for MKE at installation, and it persists throughout the life of the cluster. Thus, you cannot switch to iptables at a later stage or switch over existing MKE clusters to use IPVS proxier.

For full parameter details, refer to the Kubernetes documentation for kube-proxy.

Use the kube-proxy-mode parameter at install time to enable IPVS proxier. The two valid values are iptables (default) and ipvs.

You can specify the following ipvs parameters for kube-proxy:

  • ipvs_exclude_cidrs

  • ipvs_min_sync_period

  • ipvs_scheduler

  • ipvs_strict_arp (default: false)

  • ipvs_sync_period

  • ipvs_tcp_timeout

  • ipvs_tcpfin_timeout

  • ipvs_udp_timeout

To set these values at the time of bootstrap/installation:

  1. Add the required values under [cluster_config] in a TOML file (for example, config.toml), as illustrated in the example fragment after this procedure.

  2. Create a config named com.docker.ucp.config from this TOML file:

    docker config create com.docker.ucp.config config.toml
    
  3. Use the --existing-config parameter when installing MKE. You can also change these values post-install through the MKE ucp/config-toml API endpoint.
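
For example, a config.toml fragment for this procedure might look like the following; the parameter values shown are illustrative only:

[cluster_config]
  ipvs_scheduler = "rr"
  ipvs_sync_period = "30s"
  ipvs_min_sync_period = "5s"
  ipvs_strict_arp = true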

Caution

If you are using MKE 3.3.x with IPVS proxier and plan to upgrade to MKE 3.4.x, you must upgrade to MKE 3.4.3 or later as earlier versions of MKE 3.4.x do not support IPVS proxier.

Use an External Certificate Authority

You can customize MKE to use certificates signed by an External Certificate Authority (ECA). When using your own certificates, include a certificate bundle with the following:

  • ca.pem file with the root CA public certificate.

  • cert.pem file with the server certificate and any intermediate CA public certificates. This certificate should also have Subject Alternative Names (SANs) for all addresses used to reach the MKE manager.

  • key.pem file with a server private key.

You can either use separate certificates for every manager node or one certificate for all managers. If you use separate certificates, you must use a common SAN throughout. For example, MKE permits the following on a three-node cluster:

  • node1.company.example.org with the SAN mke.company.org

  • node2.company.example.org with the SAN mke.company.org

  • node3.company.example.org with the SAN mke.company.org

If you use a single certificate for all manager nodes, MKE automatically copies the certificate files both to new manager nodes and to those promoted to a manager role.
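
Optionally, you can verify the certificate bundle with openssl before installation, for example to confirm that cert.pem chains to ca.pem and carries the required SANs (the file names match the bundle described above):

openssl verify -CAfile ca.pem cert.pem
openssl x509 -in cert.pem -noout -text | grep -A1 "Subject Alternative Name"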

Customize named volumes

Note

Skip this step if you want to use the default named volumes.

MKE uses named volumes to persist data. If you want to customize the drivers that manage such volumes, create the volumes before installing MKE. During the installation process, the installer will automatically detect the existing volumes and start using them. Otherwise, MKE will create the default named volumes.
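
A minimal sketch of pre-creating a volume with a custom driver follows; the volume name, driver, and option are placeholders that must match the MKE volume names and the storage driver you intend to use:

docker volume create --driver <driver-name> --opt <option>=<value> <mke-volume-name>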

Configure kernel parameters

MKE uses a number of kernel parameters in its deployment.

Note

The values shown in the MKE column are not set by MKE itself, but by either MCR or an upstream component.

kernel.<subtree>

Parameter

Values

Description

panic

  • Default: Distribution dependent

  • MKE: 1

Sets the number of seconds the kernel waits to reboot following a panic.

Note

The kernel.panic parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

panic_on_oops

  • Default: Distribution dependent

  • MKE: 1

Sets whether the kernel should panic on an oops rather than continuing to attempt operations.

Note

The kernel.panic_on_oops parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

keys.root_maxkeys

  • Default: 1000000

  • MKE: 1000000

Sets the maximum number of keys that the root user (UID 0 in the root user namespace) can own.

Note

The kernel.keys.root_maxkeys parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

keys.root_maxbytes

  • Default: 25000000

  • MKE: 25000000

Sets the maximum number of bytes of data that the root user (UID 0 in the root user namespace) can hold in the payloads of the keys owned by root.

Allocate approximately 25 bytes per key, multiplied by the value of kernel.keys.root_maxkeys.

Note

The keys.root_maxbytes parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

pty.nr

  • Default: Dependent on number of logins. Not user-configurable.

  • MKE: 1

Sets the number of open PTYs.

net.bridge.bridge-nf-<subtree>

Parameter

Values

Description

call-arptables

  • Default: No default

  • MKE: 1

Sets whether arptables rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

call-ip6tables

  • Default: No default

  • MKE: 1

Sets whether ip6tables rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

call-iptables

  • Default: No default

  • MKE: 1

Sets whether iptables rules apply to bridged network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

filter-pppoe-tagged

  • Default: No default

  • MKE: 0

Sets whether netfilter rules apply to bridged PPPOE network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

filter-vlan-tagged

  • Default: No default

  • MKE: 0

Sets whether netfilter rules apply to bridged VLAN network traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

pass-vlan-input-dev

  • Default: No default

  • MKE: 0

Sets whether netfilter strips the incoming VLAN interface name from bridged traffic. If the bridge module is not loaded, and thus no bridges are present, this key is not present.

net.fan.<subtree>

Parameter

Values

Description

vxlan

  • Default: No default

  • MKE: 4

Sets the version of the VXLAN module on older kernels; this key is not present on kernel version 5.x or when the VXLAN module is not loaded.

net.ipv4.<subtree>

Note

  • The *.vs.* default values persist, changing only when the ipvs kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter

Values

Description

conf.all.accept_redirects

  • Default: 1

  • MKE: 0

Sets whether ICMP redirects are permitted. This key affects all interfaces.

conf.all.forwarding

  • Default: 0

  • MKE: 1

Sets whether network traffic is forwarded. This key affects all interfaces.

conf.all.route_localnet

  • Default: 0

  • MKE: 1

Sets 127/8 for local routing. This key affects all interfaces.

conf.default.forwarding

  • Default: 0

  • MKE: 1

Sets whether network traffic is forwarded. This key affects new interfaces.

conf.lo.forwarding

  • Default: 0

  • MKE: 1

Sets forwarding for localhost traffic.

ip_forward

  • Default: 0

  • MKE: 1

Sets whether traffic forwards between interfaces. For Kubernetes to run, this parameter must be set to 1.

vs.am_droprate

  • Default: 10

  • MKE: 10

Sets the always mode drop rate used in mode 3 of the drop_packet defense.

vs.amemthresh

  • Default: 1024

  • MKE: 1024

Sets the available memory threshold in pages, which is used in the automatic modes of defense. When there is not enough available memory, this enables the strategy and the variable is set to 2. Otherwise, the strategy is disabled and the variable is set to 1.

vs.backup_only

  • Default: 0

  • MKE: 0

Sets whether the director function is disabled while the server is in back-up mode, to avoid packet loops for DR/TUN methods.

vs.cache_bypass

  • Default: 0

  • MKE: 0

Sets whether packets forward directly to the original destination when no cache server is available and the destination address is not local (iph->daddr is RTN_UNICAST). This mostly applies to transparent web cache clusters.

vs.conn_reuse_mode

  • Default: 1

  • MKE: 1

Sets how IPVS handles connections detected on port reuse. It is a bitmap with the following values:

  • 0 disables any special handling on port reuse. The new connection is delivered to the same real server that was servicing the previous connection, effectively disabling expire_nodest_conn.

  • bit 1 enables rescheduling of new connections when it is safe. That is, whenever expire_nodest_conn and for TCP sockets, when the connection is in TIME_WAIT state (which is only possible if you use NAT mode).

  • bit 2 is bit 1 plus, for TCP connections, when connections are in FIN_WAIT state, as this is the last state seen by load balancer in Direct Routing mode. This bit helps when adding new real servers to a very busy cluster.

vs.conntrack

  • Default: 0

  • MKE: 0

Sets whether connection-tracking entries are maintained for connections handled by IPVS. Enable if connections handled by IPVS are to be subject to stateful firewall rules. That is, iptables rules that make use of connection tracking. Otherwise, disable this setting to optimize performance. Connections handled by the IPVS FTP application module have connection tracking entries regardless of this setting, which is only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.

vs.drop_entry

  • Default: 0

  • MKE: 0

Sets whether entries are randomly dropped in the connection hash table, to collect memory back for new connections. In the current code, the drop_entry procedure can be activated every second, then it randomly scans 1/32 of the whole and drops entries that are in the SYN-RECV/SYNACK state, which should be effective against syn-flooding attack.

The valid values of drop_entry are 0 to 3, where 0 indicates that the strategy is always disabled, 1 and 2 indicate automatic modes (when there is not enough available memory, the strategy is enabled and the variable is automatically set to 2, otherwise the strategy is disabled and the variable is set to 1), and 3 indicates that the strategy is always enabled.

vs.drop_packet

  • Default: 0

  • MKE: 0

Sets whether 1/rate of incoming packets are dropped prior to being forwarded to real servers. A rate of 1 drops all incoming packets.

The value definition is the same as that for drop_entry. In automatic mode, the following formula determines the rate: rate = amemthresh / (amemthresh - available_memory) when available memory is less than the available memory threshold. When mode 3 is set, the always mode drop rate is controlled by the /proc/sys/net/ipv4/vs/am_droprate.

vs.expire_nodest_conn

  • Default: 0

  • MKE: 0

Sets whether the load balancer silently drops packets when its destination server is not available. This can be useful when the user-space monitoring program deletes the destination server (due to server overload or wrong detection) and later adds the server back, and the connections to the server can continue.

If this feature is enabled, the load balancer terminates the connection immediately whenever a packet arrives and its destination server is not available, after which the client program will be notified that the connection is closed. This is equivalent to the feature that is sometimes required to flush connections when the destination is not available.

vs.ignore_tunneled

  • Default: 0

  • MKE: 0

Sets whether IPVS configures the ipvs_property on all packets of unrecognized protocols. This prevents users from routing such tunneled protocols as IPIP, which is useful for preventing the rescheduling of packets that have been tunneled to the IPVS host (that is, to prevent IPVS routing loops when IPVS is also acting as a real server).

vs.nat_icmp_send

  • Default: 0

  • MKE: 0

Sets whether ICMP error messages (ICMP_DEST_UNREACH) are sent for VS/NAT when the load balancer receives packets from real servers but the connection entries do not exist.

vs.pmtu_disc

  • Default: 0

  • MKE: 0

Sets whether all DF packets that exceed the PMTU are rejected with FRAG_NEEDED, irrespective of the forwarding method. For the TUN method, the flag can be disabled to fragment such packets.

vs.schedule_icmp

  • Default: 0

  • MKE: 0

Sets whether scheduling ICMP packets in IPVS is enabled.

vs.secure_tcp

  • Default: 0

  • MKE: 0

Sets the use of a more complicated TCP state transition table. For VS/NAT, the secure_tcp defense delays entering the TCP ESTABLISHED state until the three-way handshake completes. The value definition is the same as that of drop_entry and drop_packet.

vs.sloppy_sctp

  • Default: 0

  • MKE: 0

Sets whether IPVS is permitted to create a connection state on any packet, rather than an SCTP INIT only.

vs.sloppy_tcp

  • Default: 0

  • MKE: 0

Sets whether IPVS is permitted to create a connection state on any packet, rather than a TCP SYN only.

vs.snat_reroute

  • Default: 0

  • MKE: 1

Sets whether the route of SNATed packets is recalculated from real servers as if they originate from the director. If disabled, SNATed packets are routed as if they have been forwarded by the director.

If policy routing is in effect, then it is possible that the route of a packet originating from a director is routed differently to a packet being forwarded by the director.

If policy routing is not in effect, then the recalculated route will always be the same as the original route. It is an optimization to disable snat_reroute and avoid the recalculation.

vs.sync_persist_mode

  • Default: 0

  • MKE: 0

Sets the synchronization of connections when using persistence. The possible values are defined as follows:

  • 0 means all types of connections are synchronized.

  • 1 attempts to reduce the synchronization traffic depending on the connection type. For persistent services, avoid synchronization for normal connections, do it only for persistence templates. In such case, for TCP and SCTP it may need enabling sloppy_tcp and sloppy_sctp flags on back-up servers. For non-persistent services such optimization is not applied, mode 0 is assumed.

vs.sync_ports

  • Default: 1

  • MKE: 1

Sets the number of threads that the master and back-up servers can use for sync traffic. Every thread uses a single UDP port, thread 0 uses the default port 8848, and the last thread uses port 8848+sync_ports-1.

vs.sync_qlen_max

  • Default: Calculated

  • MKE: Calculated

Sets a hard limit for queued sync messages that are not yet sent. It defaults to 1/32 of the memory pages but actually represents the number of messages. This protects against allocating large amounts of memory when the sending rate is lower than the queuing rate.

vs.sync_refresh_period

  • Default: 0

  • MKE: 0

Sets (in seconds) the difference in the reported connection timer that triggers new sync messages. It can be used to avoid sync messages for the specified period (or half of the connection timeout if it is lower) if the connection state has not changed since last sync.

This is useful for normal connections with high traffic, to reduce sync rate. Additionally, retry sync_retries times with period of sync_refresh_period/8.

vs.sync_retries

  • Default: 0

  • MKE: 0

Sets sync retries with period of sync_refresh_period/8. Useful to protect against loss of sync messages. The range of the sync_retries is 0 to 3.

vs.sync_sock_size

  • Default: 0

  • MKE: 0

Sets the configuration of SNDBUF (master) or RCVBUF (slave) socket limit. Default value is 0 (preserve system defaults).

vs.sync_threshold

  • Default: 3 50

  • MKE: 3 50

Sets the synchronization threshold, which is the minimum number of incoming packets that a connection must receive before the connection is synchronized. A connection will be synchronized every time the number of its incoming packets modulus sync_period equals the threshold. The range of the threshold is 0 to sync_period. When sync_period and sync_refresh_period are 0, sync messages are sent only for state changes or only once when the packet count matches sync_threshold.

vs.sync_version

  • Default: 1

  • MKE: 1

Sets the version of the synchronization protocol to use when sending synchronization messages. The possible values are:

  • 0 selects the original synchronization protocol (version 0). This should be used when sending synchronization messages to a legacy system that only understands the original synchronization protocol.

  • 1 selects the current synchronization protocol (version 1). This should be used whenever possible.

Kernels with this sync_version entry are able to receive messages of both version 0 and version 1 of the synchronization protocol.

net.netfilter.nf_conntrack_<subtree>

Note

  • The net.netfilter.nf_conntrack_<subtree> default values persist, changing only when the nf_conntrack kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter

Values

Description

acct

  • Default: 0

  • MKE: 0

Sets whether connection-tracking flow accounting is enabled. Adds 64-bit byte and packet counter per flow.

buckets

  • Default: Calculated

  • MKE: Calculated

Sets the size of the hash table. If not specified during module loading, the default size is calculated by dividing total memory by 16384 to determine the number of buckets. The hash table will never have fewer than 1024 and never more than 262144 buckets. This sysctl is only writeable in the initial net namespace.

checksum

  • Default: 0

  • MKE: 0

Sets whether the checksum of incoming packets is verified. Packets with bad checksums are in an invalid state. If this is enabled, such packets are not considered for connection tracking.

dccp_loose

  • Default: 0

  • MKE: 1

Sets whether picking up already established connections for Datagram Congestion Control Protocol (DCCP) is permitted.

dccp_timeout_closereq

  • Default: Distribution dependent

  • MKE: 64

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_closing

  • Default: Distribution dependent

  • MKE: 64

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_open

  • Default: Distribution dependent

  • MKE: 43200

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_partopen

  • Default: Distribution dependent

  • MKE: 480

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_request

  • Default: Distribution dependent

  • MKE: 240

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_respond

  • Default: Distribution dependent

  • MKE: 480

The parameter description is not yet available in the Linux kernel documentation.

dccp_timeout_timewait

  • Default: Distribution dependent

  • MKE: 240

The parameter description is not yet available in the Linux kernel documentation.

events

  • Default: 0

  • MKE: 1

Sets whether the connection tracking code provides userspace with connection-tracking events through ctnetlink.

expect_max

  • Default: Calculated

  • MKE: 1024

Sets the maximum size of the expectation table. The default value is nf_conntrack_buckets / 256. The minimum is 1.

frag6_high_thresh

  • Default: Calculated

  • MKE: 4194304

Sets the maximum memory used to reassemble IPv6 fragments. When nf_conntrack_frag6_high_thresh bytes of memory is allocated for this purpose, the fragment handler tosses packets until nf_conntrack_frag6_low_thresh is reached. The size of this parameter is calculated based on system memory.

frag6_low_thresh

  • Default: Calculated

  • MKE: 3145728

See nf_conntrack_frag6_high_thresh. The size of this parameter is calculated based on system memory.

frag6_timeout

  • Default: 60

  • MKE: 60

Sets the time to keep an IPv6 fragment in memory.

generic_timeout

  • Default: 600

  • MKE: 600

Sets the default for a generic timeout. This refers to layer 4 unknown and unsupported protocols.

gre_timeout

  • Default: 30

  • MKE: 30

Sets the GRE timeout in the conntrack table.

gre_timeout_stream

  • Default: 180

  • MKE: 180

Sets the GRE timeout for streamed connections. This extended timeout is used when a GRE stream is detected.

helper

  • Default: 0

  • MKE: 0

Sets whether the automatic conntrack helper assignment is enabled. If disabled, you must set up iptables rules to assign helpers to connections. See the CT target description in the iptables-extensions(8) man page for more information.

icmp_timeout

  • Default: 30

  • MKE: 30

Sets the default for ICMP timeout.

icmpv6_timeout

  • Default: 30

  • MKE: 30

Sets the default for ICMP6 timeout.

log_invalid

  • Default: 0

  • MKE: 0

Sets whether invalid packets of a type specified by value are logged.

max

  • Default: Calculated

  • MKE: 131072

Sets the maximum number of allowed connection tracking entries. This value is set to nf_conntrack_buckets by default.

Connection-tracking entries are added to the table twice, once for the original direction and once for the reply direction (that is, with the reversed address). Thus, with default settings a maxed-out table will have an average hash chain length of 2, not 1.

sctp_timeout_closed

  • Default: Distribution dependent

  • MKE: 10

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_cookie_echoed

  • Default: Distribution dependent

  • MKE: 3

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_cookie_wait

  • Default: Distribution dependent

  • MKE: 3

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_established

  • Default: Distribution dependent

  • MKE: 432000

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_heartbeat_acked

  • Default: Distribution dependent

  • MKE: 210

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_heartbeat_sent

  • Default: Distribution dependent

  • MKE: 30

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_shutdown_ack_sent

  • Default: Distribution dependent

  • MKE: 3

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_shutdown_recd

  • Default: Distribution dependent

  • MKE: 0

The parameter description is not yet available in the Linux kernel documentation.

sctp_timeout_shutdown_sent

  • Default: Distribution dependent

  • MKE: 0

The parameter description is not yet available in the Linux kernel documentation.

tcp_be_liberal

  • Default: 0

  • MKE: 0

Sets whether only out of window RST segments are marked as INVALID.

tcp_loose

  • Default: 0

  • MKE: 1

Sets whether already established connections are picked up.

tcp_max_retrans

  • Default: 3

  • MKE: 3

Sets the maximum number of packets that can be retransmitted without receiving an acceptable ACK from the destination. If this number is reached, a shorter timer is started.

tcp_timeout_close

  • Default: Distribution dependent

  • MKE: 10

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_close_wait

  • Default: Distribution dependent

  • MKE: 3600

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_fin_wait

  • Default: Distribution dependent

  • MKE: 120

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_last_ack

  • Default: Distribution dependent

  • MKE: 30

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_max_retrans

  • Default: Distribution dependent

  • MKE: 300

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_syn_recv

  • Default: Distribution dependent

  • MKE: 60

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_syn_sent

  • Default: Distribution dependent

  • MKE: 120

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_time_wait

  • Default: Distribution dependent

  • MKE: 120

The parameter description is not yet available in the Linux kernel documentation.

tcp_timeout_unacknowledged

  • Default: Distribution dependent

  • MKE: 30

The parameter description is not yet available in the Linux kernel documentation.

timestamp

  • Default: 0

  • MKE: 0

Sets whether connection-tracking flow timestamping is enabled.

udp_timeout

  • Default: 30

  • MKE: 30

Sets the UDP timeout.

udp_timeout_stream

  • Default: 120

  • MKE: 120

Sets the extended timeout that is used whenever a UDP stream is detected.

net.nf_conntrack_<subtree>

Note

  • The net.nf_conntrack_<subtree> default values persist, changing only when the nf_conntrack kernel module has not been previously loaded. For more information, refer to the Linux kernel documentation.

Parameter

Values

Description

max

  • Default: Calculated

  • MKE: 131072

Sets the maximum number of connections to track. The size of this parameter is calculated based on system memory.

vm.overcommit_<subtree>

Parameter

Values

Description

memory

  • Default: Distribution dependent

  • MKE: 1

Sets whether the kernel permits memory overcommitment from malloc() calls.

Note

The vm.overcommit_memory parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

vm.panic_<subtree>

Parameter

Values

Description

on_oom

  • Default: 0

  • MKE: 0

Sets whether the kernel should panic on an out-of-memory, rather than continuing to attempt operations.

When set to 0 the kernel invokes the oom_killer, which kills the rogue processes and thus preserves the system.

Note

The vm.panic_on_oom parameter is not modified when the kube_protect_kernel_defaults parameter is enabled.

Set up kernel default protections

To protect kernel parameters from being overridden by kubelet, you can either invoke the --kube-protect-kernel-defaults command option at MKE install time (see the example at the end of this section) or, following installation, adjust the cluster_config | kube_protect_kernel_defaults parameter in the MKE configuration file.

Important

When enabled, kubelet can fail to start if the kernel parameters on the nodes are not properly set. You must set those kernel parameters on the nodes before you install MKE or before adding a new node to an existing cluster.

  1. Create a configuration file called /etc/sysctl.d/90-kubelet.conf and add the following snippet to it:

    vm.panic_on_oom=0
    vm.overcommit_memory=1
    kernel.panic=10
    kernel.panic_on_oops=1
    kernel.keys.root_maxkeys=1000000
    kernel.keys.root_maxbytes=25000000
    
  2. Run sysctl -p /etc/sysctl.d/90-kubelet.conf.
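
For example, to enable the protection at install time, pass the option to the installer. The following command mirrors the installation command shown in the next section, with the host address as a placeholder:

docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.7.16 install \
--host-address <node-ip-address> \
--kube-protect-kernel-defaults \
--interactive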

Install the MKE image

To install MKE:

  1. Log in to the target host using Secure Shell (SSH).

  2. Pull the latest version of MKE:

    docker image pull mirantis/ucp:3.7.16
    
  3. Install MKE:

    docker container run --rm -it --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.7.16 install \
    --host-address <node-ip-address> \
    --interactive
    

    The ucp install command runs in interactive mode, prompting you for the necessary configuration values. For more information about the ucp install command, including how to install MKE on a system with SELinux enabled, refer to the MKE Operations Guide: mirantis/ucp install.

Note

MKE installs Project Calico for Kubernetes container-to-container communication. However, you may install an alternative CNI plugin, such as Cilium, Weave, or Flannel. For more information, refer to the MKE Operations Guide: Installing an unmanaged CNI plugin.

Obtain the license

After you Install the MKE image, proceed with downloading your MKE license as described below. This section also contains steps to apply your new license using the MKE web UI.

Warning

Users are not authorized to run MKE without a valid license. For more information, refer to Mirantis Agreements and Terms.

To download your MKE license:

  1. Open an email from Mirantis Support with the subject Welcome to Mirantis’ CloudCare Portal and follow the instructions for logging in.

    If you did not receive the CloudCare Portal email, it is likely that you have not yet been added as a Designated Contact. To remedy this, contact your Designated Administrator.

  2. In the top navigation bar, click Environments.

  3. Click the Cloud Name associated with the license you want to download.

  4. Scroll down to License Information and click the License File URL. A new tab opens in your browser.

  5. Click View file to download your license file.

To update your license settings in the MKE web UI:

  1. Log in to your MKE instance using an administrator account.

  2. In the left navigation, click Settings.

  3. On the General tab, click Apply new license. A file browser dialog displays.

  4. Navigate to where you saved the license key (.lic) file, select it, and click Open. MKE automatically updates with the new settings.

Note

Though MKE is generally a subscription-only service, Mirantis offers a free trial license by request. Use our contact form to request a free trial license.

Install MKE on AWS

This section describes how to customize your MKE installation on AWS.

Note

You may skip this topic if you plan to install MKE on AWS with no customizations or if you will only deploy Docker Swarm workloads. Refer to Install the MKE image for the appropriate installation instruction.

Prerequisites

Complete the following prerequisites prior to installing MKE on AWS.

  1. Log in to the AWS Management Console.

  2. Assign a host name to your instance. To determine the host name, run the following curl command within the EC2 instance:

    curl http://169.254.169.254/latest/meta-data/hostname
    
  3. Tag your instance, VPC, security groups, and subnets by specifying kubernetes.io/cluster/<unique-cluster-id> in the Key field and <cluster-type> in the Value field. Possible <cluster-type> values are as follows:

    • owned, if the cluster owns and manages the resources that it creates

    • shared, if the cluster shares its resources between multiple clusters

    For example, Key: kubernetes.io/cluster/1729543642a6 and Value: owned. An equivalent AWS CLI tagging command is provided after this list.

  4. To enable introspection and resource provisioning, specify an instance profile with appropriate policies for manager nodes. The following is an example of a very permissive instance profile:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [ "ec2:*" ],
          "Resource": [ "*" ]
        },
        {
          "Effect": "Allow",
          "Action": [ "elasticloadbalancing:*" ],
          "Resource": [ "*" ]
        },
        {
          "Effect": "Allow",
          "Action": [ "route53:*" ],
          "Resource": [ "*" ]
        },
        {
          "Effect": "Allow",
          "Action": "s3:*",
          "Resource": [ "arn:aws:s3:::kubernetes-*" ]
        }
      ]
    }
    
  5. To enable access to dynamically provisioned resources, specify an instance profile with appropriate policies for worker nodes. The following is an example of a very permissive instance profile:

    {
      "Version": "2012-10-17",
      "Statement": [{
          "Effect": "Allow",
          "Action": "s3:*",
          "Resource": ["arn:aws:s3:::kubernetes-*"]
        },
        {
          "Effect": "Allow",
          "Action": "ec2:Describe*",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "ec2:AttachVolume",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "ec2:DetachVolume",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": ["route53:*"],
          "Resource": ["*"]
        }
      ]
    }
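
The following AWS CLI command illustrates the tagging described in step 3; the resource IDs and cluster ID are placeholders:

aws ec2 create-tags \
  --resources <instance-id> <vpc-id> <subnet-id> <security-group-id> \
  --tags Key=kubernetes.io/cluster/<unique-cluster-id>,Value=owned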
    

Install MKE

After you perform the steps described in Prerequisites, run the following command to install MKE on a master node. Substitute <ucp-ip> with the private IP address of the master node.

docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.7.16 install \
--host-address <ucp-ip> \
--cloud-provider aws \
--interactive

Note

The --cloud-provider aws flag is available as of MKE 3.7.12.

Install MKE on Azure

Mirantis Kubernetes Engine (MKE) closely integrates with Microsoft Azure for its Kubernetes Networking and Persistent Storage feature set. MKE deploys the Calico CNI provider. In Azure, the Calico CNI leverages the Azure networking infrastructure for data path networking and the Azure IPAM for IP address management.

Prerequisites

To avoid significant issues during the installation process, you must meet the following infrastructure prerequisites to successfully deploy MKE on Azure.

  • Deploy all MKE nodes (managers and workers) into the same Azure resource group. You can deploy the Azure networking components (virtual network, subnets, security groups) in a second Azure resource group.

  • Size the Azure virtual network and subnet appropriately for your environment, because addresses from this pool will be consumed by Kubernetes Pods.

  • Attach all MKE worker and manager nodes to the same Azure subnet.

  • Set internal IP addresses for all nodes to Static rather than the Dynamic default.

  • Match the Azure virtual machine object name to the Azure virtual machine computer name and to the node operating system hostname, which is the FQDN of the host (including domain names). All characters in the names must be lowercase.

  • Ensure the presence of an Azure Service Principal with Contributor access to the Azure resource group hosting the MKE nodes. Kubernetes uses this Service Principal to communicate with the Azure API. The Service Principal ID and Secret Key are MKE prerequisites.

    If you are using a separate resource group for the networking components, the same Service Principal must have Network Contributor access to this resource group.

  • Ensure that the Azure subnet passed to MKE during installation has an open NSG that allows traffic between all IPs on the subnet. Kubernetes Pods integrate into the underlying Azure networking stack, from an IPAM and routing perspective, with the Azure CNI IPAM module. As such, Azure network security groups (NSG) impact pod-to-pod communication. End users may expose containerized services on a range of underlying ports, which would otherwise result in a manual process of opening an NSG port every time a new containerized service is deployed on the platform. This applies only to workloads that deploy on the Kubernetes orchestrator.

    To limit exposure, restrict the use of the Azure subnet to container host VMs and Kubernetes Pods. Additionally, you can leverage Kubernetes Network Policies to provide micro segmentation for containerized applications and services.

The MKE installation requires the following information:

subscriptionId

Azure Subscription ID in which to deploy the MKE objects

tenantId

Azure Active Directory Tenant ID in which to deploy the MKE objects

aadClientId

Azure Service Principal ID

aadClientSecret

Azure Service Principal Secret Key

Networking

MKE configures the Azure IPAM module for Kubernetes so that it can allocate IP addresses for Kubernetes Pods. Per Azure IPAM module requirements, the configuration of each Azure VM that is part of the Kubernetes cluster must include a pool of IP addresses.

You can use automatic or manual IPs provisioning for the Kubernetes cluster on Azure.

  • Automatic provisioning

    Allows for IP pool configuration and maintenance for standalone Azure virtual machines (VMs). This service runs within the calico-node daemonset and provisions 128 IP addresses for each node by default.

    Note

    If you are using a VXLAN data plane, MKE automatically uses Calico IPAM. It is not necessary to do anything specific for Azure IPAM.

    New MKE installations use Calico VXLAN as the default data plane (the MKE configuration calico_vxlan is set to true). MKE does not use Calico VXLAN if the MKE version is lower than 3.3.0 or if you upgrade MKE from lower than 3.3.0 to 3.3.0 or higher.

  • Manual provisioning

    You can manually provision additional IP addresses for each Azure VM through the Azure Portal, the Azure CLI command az network nic ip-config create, or an ARM template.
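
    For example, a single additional IP configuration can be added to a VM NIC with the following Azure CLI command; the resource group and NIC names are placeholders, and one such ipconfig is required for each additional pod IP address:

    az network nic ip-config create \
      --resource-group <resource-group-name> \
      --nic-name <vm-nic-name> \
      --name ipconfig-pod-01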

Azure configuration file

For MKE to integrate with Microsoft Azure, the azure.json configuration file must be identical across all manager and worker nodes in your cluster. For Linux nodes, place the file in /etc/kubernetes on each host. For Windows nodes, place the file in C:\k on each host. Because root owns the configuration file, set its permissions to 0644 to ensure that the container user has read access.

The following is an example template for azure.json.

{
    "cloud":"AzurePublicCloud",
    "tenantId": "<parameter_value>",
    "subscriptionId": "<parameter_value>",
    "aadClientId": "<parameter_value>",
    "aadClientSecret": "<parameter_value>",
    "resourceGroup": "<parameter_value>",
    "location": "<parameter_value>",
    "subnetName": "<parameter_value>",
    "securityGroupName": "<parameter_value>",
    "vnetName": "<parameter_value>",
    "useInstanceMetadata": true
}

Optional parameters are available for Azure deployments:

primaryAvailabilitySetName

Worker nodes availability set

vnetResourceGroup

Virtual network resource group if your Azure network objects live in a separate resource group

routeTableName

Applicable if you have defined multiple route tables within an Azure subnet

Guidelines for IPAM configuration

Warning

To avoid significant issues during the installation process, follow these guidelines to either use an appropriately sized network in Azure or take the necessary actions to fit within the subnet.

Configure the subnet and the virtual network associated with the primary interface of the Azure VMs with an adequate address prefix/range. The number of required IP addresses depends on the workload and the number of nodes in the cluster.

For example, for a cluster of 256 nodes, make sure that the address space of the subnet and the virtual network can allocate at least 128 * 256 IP addresses, in order to run a maximum of 128 pods concurrently on a node. This is in addition to initial IP allocations to VM network interface cards (NICs) during Azure resource creation.

Accounting for the allocation of IP addresses to NICs that occurs during VM bring-up, set the address space of the subnet and virtual network to 10.0.0.0/16. This ensures that the network can dynamically allocate at least 32768 addresses, plus a buffer for initial allocations of primary IP addresses.

Note

The Azure IPAM module queries the metadata of an Azure VM to obtain a list of the IP addresses that are assigned to the VM NICs. The IPAM module allocates these IP addresses to Kubernetes pods. You configure the IP addresses as ipConfigurations in the NICs associated with a VM or scale set member, so that Azure IPAM can provide the addresses to Kubernetes on request.

Manually provision IP address pools as part of an Azure VM scale set

Configure IP Pools for each member of the VM scale set during provisioning by associating multiple ipConfigurations with the scale set’s networkInterfaceConfigurations.

The following example networkProfile configuration for an ARM template configures pools of 32 IP addresses for each VM in the VM scale set.

"networkProfile": {
  "networkInterfaceConfigurations": [
    {
      "name": "[variables('nicName')]",
      "properties": {
        "ipConfigurations": [
          {
            "name": "[variables('ipConfigName1')]",
            "properties": {
              "primary": "true",
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              },
              "loadBalancerBackendAddressPools": [
                {
                  "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/backendAddressPools/', variables('bePoolName'))]"
                }
              ],
              "loadBalancerInboundNatPools": [
                {
                  "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/inboundNatPools/', variables('natPoolName'))]"
                }
              ]
            }
          },
          {
            "name": "[variables('ipConfigName2')]",
            "properties": {
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              }
            }
          }
          .
          .
          .
          {
            "name": "[variables('ipConfigName32')]",
            "properties": {
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              }
            }
          }
        ],
        "primary": "true"
      }
    }
  ]
}

Adjust the IP count value

During an MKE installation, you can alter the number of Azure IP addresses that MKE automatically provisions for pods.

By default, MKE will provision 128 addresses, from the same Azure subnet as the hosts, for each VM in the cluster. If, however, you have manually attached additional IP addresses to the VMs (by way of an ARM template, the Azure CLI, or the Azure Portal), or you are deploying into a small Azure subnet (smaller than /16), you can use the --azure-ip-count flag at install time.

Note

Do not set the --azure-ip-count variable to a value of less than 6 if you have not manually provisioned additional IP addresses for each VM. The MKE installation needs at least 6 IP addresses to allocate to the core MKE components that run as Kubernetes pods (in addition to the VM’s private IP address).

Below are several example scenarios that require the defining of the --azure-ip-count variable.

Scenario 1: Manually provisioned addresses

If you have manually provisioned additional IP addresses for each VM and want to disable MKE from dynamically provisioning more IP addresses, you must pass --azure-ip-count 0 into the MKE installation command.

Scenario 2: Reducing the number of provisioned addresses

Pass --azure-ip-count <custom_value> into the MKE installation command to reduce the number of IP addresses dynamically allocated from 128 to a custom value due to:

  • Primary use of the Swarm Orchestrator

  • Deployment of MKE on a small Azure subnet (for example, /24)

  • Plans to run a small number of Kubernetes pods on each node

To adjust this value post-installation, refer to the instructions on how to download the MKE configuration file, change the value, and update the configuration via the API.

Note

If you reduce the value post-installation, existing VMs will not reconcile and you will need to manually edit the IP count in Azure.

Run the following command to install MKE on a manager node.

docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 install \
  --host-address <ucp-ip> \
  --pod-cidr <ip-address-range> \
  --cloud-provider Azure \
  --interactive

  • The --pod-cidr option maps to the IP address range that you configured for the Azure subnet.

    Note

    The pod-cidr range must match the Azure virtual network’s subnet attached to the hosts. For example, if the Azure virtual network had the range 172.0.0.0/16 with VMs provisioned on an Azure subnet of 172.0.1.0/24, then the Pod CIDR should also be 172.0.1.0/24.

    This requirement applies only when MKE does not use the VXLAN data plane. If MKE uses the VXLAN data plane, the pod-cidr range must be different than the node IP subnet.

  • The --host-address maps to the private IP address of the master node.

  • The optional --azure-ip-count flag adjusts the number of IP addresses that MKE provisions for each VM.

Azure custom roles

You can create your own Azure custom roles for use with MKE. You can assign these roles to users, groups, and service principals at management group (in preview only), subscription, and resource group scopes.

Deploy an MKE cluster into a single resource group

A resource group is a container that holds resources for an Azure solution. These resources are the virtual machines (VMs), networks, and storage accounts that are associated with the swarm.

To create a custom all-in-one role with permissions to deploy an MKE cluster into a single resource group:

  1. Create the role permissions JSON file.

    For example:

    {
      "Name": "Docker Platform All-in-One",
      "IsCustom": true,
      "Description": "Can install and manage Docker platform.",
      "Actions": [
        "Microsoft.Authorization/*/read",
        "Microsoft.Authorization/roleAssignments/write",
        "Microsoft.Compute/availabilitySets/read",
        "Microsoft.Compute/availabilitySets/write",
        "Microsoft.Compute/disks/read",
        "Microsoft.Compute/disks/write",
        "Microsoft.Compute/virtualMachines/extensions/read",
        "Microsoft.Compute/virtualMachines/extensions/write",
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/write",
        "Microsoft.Network/loadBalancers/read",
        "Microsoft.Network/loadBalancers/write",
        "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/networkSecurityGroups/read",
        "Microsoft.Network/networkSecurityGroups/write",
        "Microsoft.Network/networkSecurityGroups/join/action",
        "Microsoft.Network/networkSecurityGroups/securityRules/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/publicIPAddresses/write",
        "Microsoft.Network/publicIPAddresses/join/action",
        "Microsoft.Network/virtualNetworks/read",
        "Microsoft.Network/virtualNetworks/write",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/virtualNetworks/subnets/write",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Resources/subscriptions/resourcegroups/read",
        "Microsoft.Resources/subscriptions/resourcegroups/write",
        "Microsoft.Security/advancedThreatProtectionSettings/read",
        "Microsoft.Security/advancedThreatProtectionSettings/write",
        "Microsoft.Storage/*/read",
        "Microsoft.Storage/storageAccounts/listKeys/action",
        "Microsoft.Storage/storageAccounts/write"
      ],
      "NotActions": [],
      "AssignableScopes": [
        "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
      ]
    }
    
  2. Create the Azure RBAC role.

    az role definition create --role-definition all-in-one-role.json
    
Deploy MKE compute resources

Compute resources act as servers for running containers.

To create a custom role to deploy MKE compute resources only:

  1. Create the role permissions JSON file.

    For example:

    {
      "Name": "Docker Platform",
      "IsCustom": true,
      "Description": "Can install and run Docker platform.",
      "Actions": [
        "Microsoft.Authorization/*/read",
        "Microsoft.Authorization/roleAssignments/write",
        "Microsoft.Compute/availabilitySets/read",
        "Microsoft.Compute/availabilitySets/write",
        "Microsoft.Compute/disks/read",
        "Microsoft.Compute/disks/write",
        "Microsoft.Compute/virtualMachines/extensions/read",
        "Microsoft.Compute/virtualMachines/extensions/write",
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/write",
        "Microsoft.Network/loadBalancers/read",
        "Microsoft.Network/loadBalancers/write",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/virtualNetworks/read",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Resources/subscriptions/resourcegroups/read",
        "Microsoft.Resources/subscriptions/resourcegroups/write",
        "Microsoft.Security/advancedThreatProtectionSettings/read",
        "Microsoft.Security/advancedThreatProtectionSettings/write",
        "Microsoft.Storage/storageAccounts/read",
        "Microsoft.Storage/storageAccounts/listKeys/action",
        "Microsoft.Storage/storageAccounts/write"
      ],
      "NotActions": [],
      "AssignableScopes": [
        "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
      ]
    }
    
  2. Create the Docker Platform RBAC role.

    az role definition create --role-definition platform-role.json
    
Deploy MKE network resources

Network resources are services inside your cluster. These resources can include virtual networks, security groups, address pools, and gateways.

To create a custom role to deploy MKE network resources only:

  1. Create the role permissions JSON file.

    For example:

    {
      "Name": "Docker Networking",
      "IsCustom": true,
      "Description": "Can install and manage Docker platform networking.",
      "Actions": [
        "Microsoft.Authorization/*/read",
        "Microsoft.Network/loadBalancers/read",
        "Microsoft.Network/loadBalancers/write",
        "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/networkSecurityGroups/read",
        "Microsoft.Network/networkSecurityGroups/write",
        "Microsoft.Network/networkSecurityGroups/join/action",
        "Microsoft.Network/networkSecurityGroups/securityRules/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/publicIPAddresses/write",
        "Microsoft.Network/publicIPAddresses/join/action",
        "Microsoft.Network/virtualNetworks/read",
        "Microsoft.Network/virtualNetworks/write",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/virtualNetworks/subnets/write",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Resources/subscriptions/resourcegroups/read",
        "Microsoft.Resources/subscriptions/resourcegroups/write"
      ],
      "NotActions": [],
      "AssignableScopes": [
        "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
      ]
    }
    
  2. Create the Docker Networking RBAC role.

    az role definition create --role-definition networking-role.json
    

Install MKE on Google Cloud Platform

MKE supports installation and operation on Google Cloud Platform (GCP). This section describes how to prepare your system for MKE installation on GCP, how to perform the installation, and the limitations of MKE support for GCP.

To learn how to deploy MKE on GCP using Launchpad, see Bootstrapping MKE cluster on GCP.

Prerequisites

Verify the following prerequisites before you install MKE on GCP:

  • MTU (maximum transmission unit) is set to at least 1500 on the VPC where you want to create your instances. For more information, refer to Google Cloud official documentation: Change the MTU setting of a VPC network.

  • All MKE instances have the necessary authorization for managing cloud resources.

    GCP defines authorization through the use of service accounts, roles, and access scopes. For information on how to best configure the authorization required for your MKE instances, refer to Google Cloud official documentation: Service accounts.

    An example of a permissible role for a service account is roles/owner, and an example of an access scope that provides access to most Google services is https://www.googleapis.com/auth/cloud-platform. As a best practice, define a broad access scope such as this to an instance and then restrict access using roles.

    Refer to Google Identity official documentation: OAuth 2.0 Scopes for Google APIs for a list of available scopes, and to Google Cloud official documentation: Understanding roles for a list of available roles.

  • All of your MKE instances include the same prefix.

  • Each instance is tagged with the prefix of its associated instance names. For example, if the instance names are testcluster-m1 and testcluster-m2, tag the associated instance with testcluster.
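
For example, assuming the instances use GCP network tags, you can apply the prefix tag with the gcloud CLI (add --zone if your configuration requires it):

gcloud compute instances add-tags testcluster-m1 --tags=testcluster
gcloud compute instances add-tags testcluster-m2 --tags=testcluster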

Install MKE

To install MKE on GCP, run the following command:

docker container run --rm -it \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.7.16 install \
--host-address <ucp-ip> \
--cloud-provider gce \
--interactive

Note

Do not use the --cloud-provider gce flag if you do not require cloud provider integration.

Google Cloud Platform support limitations

Be aware of the following limitations in the MKE support for GCP:

  • wscc-images-incompat-gcp-note

  • mismatched-mtu-values-note


MetalLB load-balancer for Kubernetes

Available since MKE 3.7.0

MetalLB is a load-balancer implementation for bare metal Kubernetes clusters, using standard routing protocols.

Prerequisites

  • An MKE cluster that is running Kubernetes 1.13.0 or later, which does not already have network load-balancing functionality.

  • A cluster network configuration that is compatible with MetalLB.

  • Available IPv4 addresses that MetalLB can allocate.

  • BGP operating mode requires one or more routers capable of communicating with BGP.

  • When using the L2 operating mode, traffic on port 7946 (TCP and UDP; you can configure a different port) must be allowed between nodes, as required by memberlist. For an example of the corresponding firewall rules, see the sketch that follows this list.

  • Verification that kube-proxy is running in iptables mode.

  • Verification of the absence of any cloud provider configuration.
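
For the L2 operating mode requirement above, the following is a minimal sketch of host firewall rules that allow memberlist traffic between nodes. It assumes that you manage the host firewall directly with iptables; adjust accordingly for firewalld, ufw, or cloud security groups:

# Allow MetalLB memberlist traffic (L2 mode) between cluster nodes
sudo iptables -A INPUT -p tcp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -p udp --dport 7946 -j ACCEPT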

Install MetalLB

You use the MKE configuration file to install MetalLB:

  1. Obtain the current MKE configuration file for the cluster.

  2. Set the cluster_config.metallb_config.enabled parameter to true.

  3. Add IP address pools.

  4. Verify the successful deployment of MetalLB in the cluster.

    1. Verify the creation of the metallb-system namespace:

      kubectl get ns metallb-system
      

      Example output:

      NAME             STATUS   AGE
      metallb-system   Active   93s
      
    2. Verify that all MetalLB components are running in the system:

      • Verify the Pods:

        kubectl get pods -n metallb-system
        

        Example output:

        NAME                          READY   STATUS    RESTARTS   AGE
        controller-669d7d89b5-58s2g   1/1     Running   0          119s
        speaker-cchsw                 1/1     Running   0          119s
        speaker-ph96f                 1/1     Running   0          119s
        
      • Verify the Daemonsets:

        kubectl get daemonsets -n metallb-system
        

        Example output:

        NAME      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
        speaker   2         2         2       2            2           kubernetes.io/os=linux   28m
        
      • Verify the Deployments:

        kubectl get deployment -n metallb-system
        

        Example output:

        NAME         READY   UP-TO-DATE   AVAILABLE   AGE
        controller   1/1     1            1           29m
        
      • Verify the Services:

        kubectl get services -n metallb-system
        

        Example output:

        NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
        webhook-service   ClusterIP   10.96.18.104   <none>        443/TCP   29m
        
    3. Verify the creation of the Custom Resource Definitions:

      kubectl get crd -n metallb-system
      

      Example output:

      NAME                           CREATED AT
      addresspools.metallb.io        2023-03-16T17:11:02Z
      bfdprofiles.metallb.io         2023-03-16T17:11:03Z
      bgpadvertisements.metallb.io   2023-03-16T17:11:03Z
      bgppeers.metallb.io            2023-03-16T17:11:03Z
      communities.metallb.io         2023-03-16T17:11:03Z
      ipaddresspools.metallb.io      2023-03-16T17:11:03Z
      l2advertisements.metallb.io    2023-03-16T17:11:03Z
      
    4. Verify the creation of the specified IP pools:

      kubectl get IPAddressPools -n metallb-system
      

      Example output:

      NAME       AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
      example1   true          false             ["192.168.10.0/24","192.168.1.0/24"]
      example2   true          false             ["52.205.10.1/24"]
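
To examine the full specification of a particular pool from the verification output above, such as example1, you can retrieve the underlying MetalLB IPAddressPool resource (the pool name is taken from the example output):

kubectl get IPAddressPools example1 -n metallb-system -o yaml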
      

Uninstall MetalLB

To uninstall MetalLB you need only update the MKE configuration file.

  1. Obtain the current MKE configuration file for the cluster.

  2. Set cluster_config.metallb_config.enabled to false.

  3. Upload the modified MKE configuration file and allow at least 5 minutes for MKE to propagate the configuration changes throughout the cluster.

  4. Verify the successful uninstall of MetalLB in the cluster.

    1. Verify that no MetalLB components are running in the system.

      • Verify the Pods:

        kubectl get pods -n metallb-system
        

        Example output:

        No resources found in metallb-system namespace.
        
      • Verify the Daemonsets:

        kubectl get daemonsets -n metallb-system
        

        Example output:

        No resources found in metallb-system namespace.
        
      • Verify the Deployments:

        kubectl get deployment -n metallb-system
        

        Example output:

        No resources found in metallb-system namespace.
        
      • Verify the Services:

        kubectl get services -n metallb-system
        

        Example output:

        No resources found in metallb-system namespace.
        
    2. Verify the deletion of all IP address pools.

      kubectl get IPAddressPools -n metallb-system
      

      Example output:

      No resources found in metallb-system namespace.
      

Install MKE offline

To install MKE on an offline host, you must first use a separate computer with an Internet connection to download a single package with all the images and then copy that package to the host where you will install MKE. Once the package is on the host and loaded, you can install MKE offline as described in Install the MKE image.

Note

During the offline installation, both manager and worker nodes must be offline.

To install MKE offline:

  1. Download the required MKE package.

  2. Copy the MKE package to the host machine:

    scp ucp.tar.gz <user>@<host>:
    
  3. Use SSH to log in to the host where you transferred the package.

  4. Load the MKE images from the .tar.gz file:

    docker load -i ucp.tar.gz
    
  5. Install the MKE image.
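
Before you run the installer in step 5, you can optionally confirm that the images from step 4 loaded correctly by listing them. The filter pattern below is an assumption that matches the standard mirantis/ucp image naming; the exact image list varies by MKE version:

docker image ls --filter "reference=mirantis/ucp*"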

Uninstall MKE

This topic describes how to uninstall MKE from your cluster. After uninstalling MKE, your instances of MCR will continue running in swarm mode and your applications will run normally. You will not, however, be able to do the following unless you reinstall MKE:

  • Enforce role-based access control (RBAC) to the cluster.

  • Monitor and manage the cluster from a central place.

  • Join new nodes using docker swarm join.

    Note

    You cannot join new nodes to your cluster after uninstalling MKE because your cluster will be in swarm mode, and swarm mode relies on MKE to provide the CA certificates that allow nodes to communicate with each other. After the certificates expire, the nodes will not be able to communicate at all. Either reinstall MKE before the certificates expire, or disable swarm mode by running docker swarm leave --force on every node.

To uninstall MKE:

Note

If SELinux is enabled, you must temporarily disable it prior to running the uninstall-ucp command.
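
One way to do this on each node is to switch SELinux to permissive mode for the duration of the uninstall, assuming that permissive mode satisfies the requirement in your environment:

# Check the current SELinux mode
getenforce

# Switch to permissive mode before running uninstall-ucp
sudo setenforce 0

# Re-enable enforcing mode after the uninstall completes
sudo setenforce 1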

  1. Log in to a manager node using SSH.

  2. Run the uninstall-ucp command in interactive mode, thus prompting you for the necessary configuration values:

    docker container run --rm -it \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /var/log:/var/log \
      --name ucp \
      mirantis/ucp:3.7.16 uninstall-ucp --interactive
    

    Note

    The uninstall-ucp command completely removes MKE from every node in the cluster. You do not need to run the command from multiple nodes.

    If the uninstall-ucp command fails, manually uninstall MKE.

    1. On any manager node, remove the remaining MKE services:

      docker service rm $(docker service ls -f name=ucp- -q)
      
    2. On each manager node, remove the remaining MKE containers:

      docker container rm -f $(docker container ps -a -f name=ucp- -f name=k8s_ -q)
      
    3. On each manager node, remove the remaining MKE volumes:

      docker volume rm $(docker volume ls -f name=ucp -q)
      

    Note

    For more information about the uninstall-ucp failure, refer to the logs in /var/log on any manager node. Be aware that you will not be able to access the logs if the volume /var/log:/var/log is not mounted while running the ucp container.

  3. Optional. Delete the MKE configuration:

    docker container run --rm -it \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /var/log:/var/log \
      --name ucp \
      mirantis/ucp:3.7.16 uninstall-ucp \
      --purge-config --interactive
    

    MKE keeps the configuration by default in case you want to reinstall MKE later with the same configuration. For all available uninstall-ucp options, refer to mirantis/ucp uninstall-ucp.

  4. Optional. Restore the host IP tables to their pre-MKE installation values by restarting the node.

    Note

    The Calico network plugin changed the host IP tables from their original values during MKE installation.

Deploy Swarm-only mode

Swarm-only mode is an MKE configuration that supports only Swarm orchestration. Because it lacks Kubernetes and its operational and health-check dependencies, the resulting installation is highly stable and smaller than a typical mixed-orchestration MKE installation.

You can only enable or disable Swarm-only mode at the time of MKE installation. MKE preserves the Swarm-only setting through upgrades, backups, and system restoration. Installing MKE in Swarm-only mode pulls only the images required to run MKE in this configuration. Refer to Swarm-only images for more information.

Note

Installing MKE in Swarm-only mode removes all Kubernetes options from the MKE web UI.

To install MKE in Swarm-only mode:

  1. Complete the steps and recommendations in Plan the deployment and Perform pre-deployment configuration.

  2. Add the --swarm-only flag to the install command in Install the MKE image:

    docker container run --rm -it --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.7.16 install \
    --host-address <node-ip-address> \
    --interactive \
    --swarm-only
    

Note

In addition, MKE includes the --swarm-only flag with the bootstrapper images command, which you can use to pull or to check the required images on manager nodes.
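
For example, the following invocation lists the Swarm-only image set without pulling it. This is a sketch that assumes the --list flag of the bootstrapper images command is available in your MKE version:

docker container run --rm -it --name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.7.16 images --list --swarm-only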

Caution

To restore Swarm-only clusters, invoke the ucp restore command with the --swarm-only option.

Swarm-only images

Installing MKE in Swarm-only mode pulls the following set of images, which is smaller than that of a typical MKE installation:

  • ucp-agent (ucp-agent-win on Windows)

  • ucp-auth-store

  • ucp-auth

  • ucp-azure-ip-allocator

  • ucp-cfssl

  • ucp-compose

  • ucp-containerd-shim-process (ucp-containerd-shim-process-win on Windows)

  • ucp-controller

  • ucp-dsinfo (ucp-dsinfo-win on Windows)

  • ucp-etcd

  • ucp-interlock-config

  • ucp-interlock-extension

  • ucp-interlock-proxy

  • ucp-interlock

  • ucp-metrics

  • ucp-sf-notifier

  • ucp-swarm

Prometheus

In Swarm-only mode, MKE runs the Prometheus server and the authenticating proxy in a single container on each manager node. Thus, unlike in conventional MKE installations, you cannot configure Prometheus server placement. Prometheus does not collect Kubernetes metrics in Swarm-only mode, and it requires an additional reserved port on manager nodes: 12387.

Operations Guide

The MKE Operations Guide provides the comprehensive information you need to run the MKE container orchestration platform. The guide is intended for anyone who needs to effectively develop and securely administer applications at scale, on private clouds, public clouds, and on bare metal.

Access an MKE cluster

You can access an MKE cluster in a variety of ways including through the MKE web UI, Docker CLI, and kubectl (the Kubernetes CLI). To use the Docker CLI and kubectl with MKE, first download a client certificate bundle. This topic describes the MKE web UI, how to download and configure the client bundle, and how to configure kubectl with MKE.

Access the MKE web UI

MKE allows you to control your cluster visually using the web UI. Role-based access control (RBAC) gives administrators and non-administrators access to the following web UI features:

  • Administrators:

    • Manage cluster configurations.

    • View and edit all cluster images, networks, volumes, and containers.

    • Manage the permissions of users, teams, and organizations.

    • Grant node-specific task scheduling permissions to users.

  • Non-administrators:

    • View and edit all cluster images, networks, volumes, and containers, provided that an administrator grants access.

To access the MKE web UI:

  1. Open a browser and navigate to https://<ip-address> (substituting <ip-address> with the IP address of the machine that ran docker run).

  2. Enter the user name and password that you set up when installing the MKE image.

Note

To set up two-factor authentication for logging in to the MKE web UI, see Use two-factor authentication.

Download and configure the client bundle

Download and configure the MKE client certificate bundle to use MKE with Docker CLI and kubectl. The bundle includes:

  • A private and public key pair for authorizing your requests using MKE

  • Utility scripts for configuring Docker CLI and kubectl with your MKE deployment

Note

MKE issues different certificates for each user type:

User certificate bundles

Allow running docker commands only through MKE manager nodes.

Administrator certificate bundles

Allow running docker commands through all node types. The etcd_cert.pem and etcd_key.pem files, used together with ca.pem, allow a direct connection to etcd.

Download the client bundle

This section explains how to download the client certificate bundle using either the MKE web UI or the MKE API.

To download the client certificate bundle using the MKE web UI:

  1. Navigate to My Profile.

  2. Click Client Bundles > New Client Bundle.

To download the client certificate bundle using the MKE API on Linux:

  1. Create an environment variable with the user security token:

    AUTHTOKEN=$(curl -sk -d \
    '{"username":"<username>","password":"<password>"}' \
    https://<mke-ip>/auth/login | jq -r .auth_token)
    
  2. Download the client certificate bundle:

    curl -k -H "Authorization: Bearer $AUTHTOKEN" \
    https://<mke-ip>/api/clientbundle -o bundle.zip
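
    You can then extract bundle.zip into a working directory, which is where the configuration steps below pick up, for example with unzip:

    unzip bundle.zip -d client-bundle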
    

To download the client certificate bundle using the MKE API on Windows Server 2016:

  1. Open an elevated PowerShell prompt.

  2. Create an environment variable with the user security token:

    $AUTHTOKEN=((Invoke-WebRequest -Body '{"username":"<username>","password":"<password>"}' `
    -Uri https://`<mke-ip`>/auth/login -Method POST).Content) |
    ConvertFrom-Json | Select-Object -ExpandProperty auth_token
    
  3. Download the client certificate bundle:

    [io.file]::WriteAllBytes("ucp-bundle.zip", `
    ((Invoke-WebRequest -Uri https://`<mke-ip`>/api/clientbundle `
    -Headers @{"Authorization"="Bearer $AUTHTOKEN"}).Content))
    
Configure the client bundle

This section explains how to configure the client certificate bundle to authenticate your requests with MKE using the Docker CLI and kubectl.

To configure the client certificate bundle:

  1. Extract the client bundle .zip file into a directory, and use the appropriate utility script for your system:

    • For Linux:

      cd client-bundle && eval "$(<env.sh)"
      
    • For Windows (from an elevated PowerShell prompt):

      cd client-bundle; .\env.cmd
      

    The utility scripts do the following:

    • Update DOCKER_HOST to make the client tools communicate with your MKE deployment.

    • Update DOCKER_CERT_PATH to use the certificates included in the client bundle.

    • Configure kubectl with the kubectl config command.

      Note

      The kubeconfig file is named kube.yaml and is located in the unzipped client bundle directory.

  2. Verify that your client tools communicate with MKE:

    docker version --format '{{.Server.Version}}'
    kubectl config current-context
    

    The expected Docker CLI server version starts with ucp/, and the expected kubectl context name starts with ucp_.

  3. Optional. Change your context directly using the client certificate bundle .zip files. In the directory where you downloaded the user bundle, add the new context:

    cd client-bundle && docker context \
    import myucp ucp-docker-bundle.zip
    

Note

If you use the client certificate bundle with buildkit, make sure that builds are not accidentally scheduled on manager nodes. For more information, refer to Manage services node deployment.

Configure kubectl with MKE

MKE installations include Kubernetes. Users can deploy, manage, and monitor Kubernetes using either the MKE web UI or kubectl.

To install and use kubectl:

  1. Identify which version of Kubernetes you are running by using the MKE web UI, the MKE API version endpoint, or the Docker CLI docker version command with the client bundle.

    Caution

    Kubernetes requires that kubectl and Kubernetes be within one minor version of each other.

  2. Refer to Kubernetes: Install Tools to download and install the appropriate kubectl binary.

  3. Download the client bundle.

  4. Refer to Configure the client bundle to configure kubectl with MKE using the certificates and keys contained in the client bundle.

  5. Optional. Install Helm, the Kubernetes package manager, and Tiller, the Helm server.

    Caution

    Helm requires MKE 3.1.x or higher.

    To use Helm and Tiller with MKE, grant the default service account within the kube-system namespace the necessary roles:

    kubectl create rolebinding default-view --clusterrole=view \
    --serviceaccount=kube-system:default --namespace=kube-system
    
    kubectl create clusterrolebinding add-on-cluster-admin \
    --clusterrole=cluster-admin --serviceaccount=kube-system:default
    

    Note

    Helm recommends that you specify a Role and RoleBinding to limit the scope of Tiller to a particular namespace. Refer to the official Helm documentation for more information.

See also

Kubernetes

Administer an MKE cluster

Add labels to cluster nodes

With MKE, you can add labels to your nodes. Labels are metadata that describe the node, such as:

  • node role (development, QA, production)

  • node region (US, EU, APAC)

  • disk type (HDD, SSD)

Once you apply a label to a node, you can specify constraints when deploying a service to ensure that the service only runs on nodes that meet particular criteria.

Hint

Use resource sets (MKE collections or Kubernetes namespaces) to organize access to your cluster, rather than creating labels for authorization and permissions to resources.

Apply labels to a node

The following example procedure applies the ssd label to a node.

  1. Log in to the MKE web UI with administrator credentials.

  2. Click Shared Resources in the navigation menu to expand the selections.

  3. Click Nodes. The details pane will display the full list of nodes.

  4. Click the node on the list that you want to attach labels to. The details pane will transition, presenting the Overview information for the selected node.

  5. Click the settings icon in the upper-right corner to open the Edit Node page.

  6. Navigate to the Labels section and click Add Label.

  7. Add a label, entering disk into the Key field and ssd into the Value field.

  8. Click Save to dismiss the Edit Node page and return to the node Overview.

Hint

You can use the CLI to apply a label to a node:

docker node update --label-add <key>=<value> <node-id>
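
Similarly, after labeling a node from the CLI, you can apply a matching constraint when creating a service directly. The following is a minimal sketch in which web and nginx:latest are placeholder names and the constraint matches the disk=ssd label applied above:

docker service create --name web \
  --constraint node.labels.disk==ssd \
  nginx:latest
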
Deploy a service with constraints

The following example procedure deploys a service with the constraint node.labels.disk == ssd, which ensures that the service only runs on nodes with SSD storage.


To deploy an application stack with service constraints:

  1. Log in to the MKE web UI with administrator credentials.

  2. Verify that the target node orchestrator is set to Swarm.

  3. Click Shared Resources in the left-side navigation panel to expand the selections.

  4. Click Stacks. The details pane will display the full list of stacks.

  5. Click the Create Stack button to open the Create Application page.

  6. Under 1. Configure Application, enter wordpress into the Name field.

  7. Under ORCHESTRATOR NODE, select Swarm Services.

  8. Under 2. Add Application File, paste the following stack file in the docker-compose.yml editor:

    version: "3.1"
    
    services:
      db:
        image: mysql:5.7
        deploy:
          placement:
            constraints:
              - node.labels.disk == ssd
          restart_policy:
            condition: on-failure
        networks:
          - wordpress-net
        environment:
          MYSQL_ROOT_PASSWORD: wordpress
          MYSQL_DATABASE: wordpress
          MYSQL_USER: wordpress
          MYSQL_PASSWORD: wordpress
      wordpress:
        depends_on:
          - db
        image: wordpress:latest
        deploy:
          replicas: 1
          placement:
            constraints:
              - node.labels.disk == ssd
          restart_policy:
            condition: on-failure
            max_attempts: 3
        networks:
          - wordpress-net
        ports:
          - "8000:80"
        environment:
          WORDPRESS_DB_HOST: db:3306
          WORDPRESS_DB_PASSWORD: wordpress
    
    networks:
      wordpress-net:
    
  9. Click Create to deploy the stack.

  10. Click Done once the stack deployment completes to return to the stacks list, which now features your newly created stack.


To verify that the service tasks deployed to the labeled node:

  1. In the left-side navigation panel, navigate to Shared Resources > Nodes. The details pane will display the full list of nodes.

  2. Click the node with the disk label.

  3. In the details pane, click the Metrics tab to verify that WordPress containers are scheduled on the node.

  4. In the left-side navigation panel, navigate to Shared Resources > Nodes.

  5. Click any node that does not have the disk label.

  6. In the details pane, click the Metrics tab to verify that there are no WordPress containers scheduled on the node.
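
You can also confirm the placement from the CLI using the client bundle. The stack name matches the one entered above, and the NODE column of the output shows where each task is running:

docker stack ps wordpress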

Add Swarm placement constraints

If a node is set to use Kubernetes as its orchestrator while simultaneously running Swarm services, you must deploy placement constraints to prevent those services from being scheduled on the node.

The necessary service constraints will be automatically adopted by any new MKE-created Swarm services, as well as by older Swarm services that you have updated. MKE does not automatically add placement constraints, however, to Swarm services that were created using older versions of MKE, as to do so would restart the service tasks.


To add placement constraints to older Swarm services:

  1. Download and configure the client bundle.

  2. Identify the Swarm services that do not have placement constraints:

    services=$(docker service ls -q)
    for service in $services; do
        if docker service inspect $service --format '{{.Spec.TaskTemplate.Placement.Constraints}}' | grep -q -v 'node.labels.com.docker.ucp.orchestrator.swarm==true'; then
            name=$(docker service inspect $service --format '{{.Spec.Name}}')
            if [ "$name" = "ucp-agent" ] || [ "$name" = "ucp-agent-win" ] || [ "$name" = "ucp-agent-s390x" ]; then
                continue
            fi
            echo "Service $name (ID: $service) is missing the node.labels.com.docker.ucp.orchestrator.swarm=true placement constraint"
        fi
    done
    
  3. Add placement constraints to the Swarm services you identified:

    Note

    All service tasks will restart, thus causing some amount of service downtime.

    services=$(docker service ls -q)
    for service in $services; do
        if docker service inspect $service --format '{{.Spec.TaskTemplate.Placement.Constraints}}' | grep -q -v 'node.labels.com.docker.ucp.orchestrator.swarm==true'; then
            name=$(docker service inspect $service --format '{{.Spec.Name}}')
            if [ "$name" = "ucp-agent" ] || [ "$name" = "ucp-agent-win" ] || [ "$name" = "ucp-agent-s390x" ]; then
                continue
            fi
            echo "Updating service $name (ID: $service)"
            docker service update --detach=true --constraint-add node.labels.com.docker.ucp.orchestrator.swarm==true $service
        fi
    done
    
Add or remove a service constraint using the MKE web UI

You can declare the deployment constraints in your docker-compose.yml file or when you create a stack. Also, you can apply constraints when you create a service.

To add or remove a service constraint:

  1. Verify whether a service has deployment constraints:

    1. Navigate to the Services page and select that service.

    2. In the details pane, click Constraints to list the constraint labels.

  2. Edit the constraints on the service:

    1. Click Configure and select Details to open the Update Service page.

    2. Click Scheduling to view the constraints.

    3. Add or remove deployment constraints.

Add SANs to cluster certificates

A SAN (Subject Alternative Name) is a structured means for associating various values (such as domain names, IP addresses, email addresses, URIs, and so on) with a security certificate.

MKE always runs with HTTPS enabled. As such, whenever you connect to MKE, you must ensure that the MKE certificates recognize the host name in use. For example, if MKE is behind a load balancer that forwards traffic to your MKE instance, your requests will not be for the MKE host name or IP address but for the host name of the load balancer. Thus, MKE will reject the requests, unless you include the address of the load balancer as a SAN in the MKE certificates.

Note

  • To use your own TLS certificates, confirm first that these certificates have the correct SAN values.

  • To use the self-signed certificate that MKE offers out-of-the-box, you can use the --san argument to set up the SANs during MKE deployment.

To add new SANs using the MKE web UI:

  1. Log in to the MKE web UI using administrator credentials.

  2. Navigate to the Nodes page.

  3. Click on a manager node to display the details pane for that node.

  4. Click Configure and select Details.

  5. In the SANs section, click Add SAN and enter one or more SANs for the cluster.

  6. Click Save.

  7. Repeat for every existing manager node in the cluster.

    Note

    Thereafter, the SANs are automatically applied to any new manager nodes that join the cluster.

To add new SANs using the MKE CLI:

  1. Get the current set of SANs for the given manager node:

    docker node inspect --format '{{ index .Spec.Labels "com.docker.ucp.SANs" }}' <node-id>
    

    Example of system response:

    default-cs,127.0.0.1,172.17.0.1
    
  2. Append the desired SAN to the list (for example, default-cs,127.0.0.1,172.17.0.1,example.com) and run:

    docker node update --label-add com.docker.ucp.SANs=<SANs-list> <node-id>
    

    Note

    <SANs-list> is the comma-separated list of SANs with your new SAN appended at the end.

  3. Repeat the command sequence for each manager node.
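
To confirm that a new SAN is present in the certificate that MKE serves, you can inspect the certificate directly. The following is a sketch that assumes MKE is reachable on port 443:

openssl s_client -connect <mke-ip>:443 </dev/null 2>/dev/null |
openssl x509 -noout -text | grep -A1 'Subject Alternative Name'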

Collect MKE cluster metrics with Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit for which you can configure MKE as a monitoring target.

Prometheus runs as a Kubernetes DaemonSet that, by default, is scheduled on every manager node. A key benefit of this arrangement is that you can set the DaemonSet to not schedule on any nodes, which effectively disables Prometheus if you do not use the MKE web interface.

Along with events and logs, metrics are data sources that provide a view into your cluster, presenting numerical data values that have a time-series component. There are several sources from which you can derive metrics, each providing different meanings for a business and its applications.

Because the metrics data is stored locally on disk for each Prometheus server, it is not replicated to new managers or to nodes where Prometheus is rescheduled. Metrics are retained for no longer than 24 hours.

MKE metrics types

MKE provides a base set of metrics that gets you into production without having to rely on external or third-party tools. Mirantis strongly encourages, though, the use of additional monitoring to provide more comprehensive visibility into your specific MKE environment.

Metrics types

Metric type

Description

Business

High-level aggregate metrics that typically combine technical, financial, and organizational data to create IT infrastructure information for business leaders. Examples of business metrics include:

  • Company or division-level application downtime

  • Aggregation resource utilization

  • Application resource demand growth

Application

Metrics from the domain of APM tools (such as AppDynamics and DynaTrace) that supply information on the state or performance of the application itself.

  • Service state

  • Container platform

  • Host infrastructure

Service

Metrics on the state of services that are running on the container platform. Such metrics have very low cardinality, meaning the values are typically from a small fixed set of possibilities (commonly binary).

  • Application health

  • Convergence of Kubernetes deployments and Swarm services

  • Cluster load by number of services or containers or pods

Note

Web UI disk usage (including free space) reflects only the MKE managed portion of the file system: /var/lib/docker. To monitor the total space available on each filesystem of an MKE worker or manager, deploy a third-party monitoring solution to oversee the operating system.

See also

Kubernetes

Metrics labels

The metrics that MKE exposes in Prometheus have standardized labels, depending on the target resource.

Container labels

Label name

Value

collection

The collection ID of the collection the container is in, if any.

container

The ID of the container.

image

The name of the container image.

manager

Set to true if the container node is an MKE manager.

name

The container name.

podName

The pod name, if the container is part of a Kubernetes Pod.

podNamespace

The pod namespace, if the container is part of a Kubernetes Pod namespace.

podContainerName

The container name in the pod spec, if the container is part of a Kubernetes pod.

service

The service ID, if the container is part of a Swarm service.

stack

The stack name, if the container is part of a Docker Compose stack.

Container networking labels

Label name

Value

collection

The collection ID of the collection the container is in, if any.

container

The ID of the container.

image

The name of the container image.

manager

Set to true if the container node is an MKE manager.

name

The container name.

network

The ID of the network.

podName

The pod name, if the container is part of a Kubernetes pod.

podNamespace

The pod namespace, if the container is part of a Kubernetes pod namespace.

podContainerName

The container name in the pod spec, if the container is part of a Kubernetes pod.

service

The service ID, if the container is part of a Swarm service.

stack

The stack name, if the container is part of a Docker Compose stack.

Note

The container networking labels are the same as the Container labels, with the addition of network.

Node labels

Label name

Value

manager

Set to true if the node is an MKE manager.

See also

Kubernetes

Core MKE metrics

MKE exports metrics on every node and also exports additional metrics from every controller.

Node-sourced MKE metrics

The metrics that MKE exports from nodes are specific to those nodes (for example, the total memory on that node).

The tables below offer detail on the node-sourced metrics that MKE exposes in Prometheus with the ucp_ prefix.

ucp_engine_container_cpu_percent

Units

Percentage

Description

Percentage of CPU time in use by the container

Labels

Container

ucp_engine_container_cpu_total_time_nanoseconds

Units

Nanoseconds

Description

Total CPU time used by the container

Labels

Container

ucp_engine_container_disk_size_rootfs

Units

Bytes

Description

Total container disk size

Labels

Container

ucp_engine_container_health

Units

0.0 or 1.0

Description

The container health, according to its healthcheck.

The 0 value indicates that the container is not reporting as healthy, which is likely because it either does not have a healthcheck defined or because healthcheck results have not yet been returned

Labels

Container

ucp_engine_container_memory_max_usage_bytes

Units

Bytes

Description

Maximum memory in use by the container in bytes

Labels

Container

ucp_engine_container_memory_usage_bytes

Units

Bytes

Description

Current memory in use by the container in bytes

Labels

Container

ucp_engine_container_memory_usage_percent

Units

Percentage

Description

Percentage of total node memory currently in use by the container

Labels

Container

ucp_engine_container_network_rx_bytes_total

Units

Bytes

Description

Number of bytes received by the container over the network in the last sample

Labels

Container networking

ucp_engine_container_network_rx_dropped_packets_total

Units

Number of packets

Description

Number of packets bound for the container over the network that were dropped in the last sample

Labels

Container networking

ucp_engine_container_network_rx_errors_total

Units

Number of errors

Description

Number of received network errors for the container over the network in the last sample

Labels

Container networking

ucp_engine_container_network_rx_packets_total

Units

Number of packets

Description

Number of packets received by the container over the network in the last sample

Labels

Container networking

ucp_engine_container_network_tx_bytes_total

Units

Bytes

Description

Number of bytes sent by the container over the network in the last sample

Labels

Container networking

ucp_engine_container_network_tx_dropped_packets_total

Units

Number of packets

Description

Number of packets sent from the container over the network that were dropped in the last sample

Labels

Container networking

ucp_engine_container_network_tx_errors_total

Units

Number of errors

Description

Number of sent network errors for the container on the network in the last sample

Labels

Container networking

ucp_engine_container_network_tx_packets_total

Units

Number of packets

Description

Number of sent packets for the container over the network in the last sample

Labels

Container networking

ucp_engine_container_unhealth

Units

0.0 or 1.0

Description

Indicates whether the container is healthy, according to its healthcheck.

The 0 value indicates that the container is not reporting as healthy, which is likely because it either does not have a healthcheck defined or because healthcheck results have not yet been returned

Labels

Container

ucp_engine_containers

Units

Number of containers

Description

Total number of containers on the node

Labels

Node

ucp_engine_cpu_total_time_nanoseconds

Units

Nanoseconds

Description

System CPU time used by the container

Labels

Container

ucp_engine_disk_free_bytes

Units

Bytes

Description

Free disk space on the Docker root directory on the node, in bytes. This metric is not available for Windows nodes

Labels

Node

ucp_engine_disk_total_bytes

Units

Bytes

Description

Total disk space on the Docker root directory on this node in bytes. Note that the ucp_engine_disk_free_bytes metric is not available for Windows nodes

Labels

Node

ucp_engine_images

Units

Number of images

Description

Total number of images on the node

Labels

Node

ucp_engine_memory_total_bytes

Units

Bytes

Description

Total amount of memory on the node

Labels

Node

ucp_engine_networks

Units

Number of networks

Description

Total number of networks on the node

Labels

Node

ucp_engine_num_cpu_cores

Units

Number of cores

Description

Number of CPU cores on the node

Labels

Node

ucp_engine_volumes

Units

Number of volumes

Description

Total number of volumes on the node

Labels

Node

Controller-sourced MKE metrics

The metrics that MKE exports from controllers are cluster-scoped (for example, the total number of Swarm services).

The tables below offer detail on the controller-sourced metrics that MKE exposes in Prometheus with the ucp_ prefix.

ucp_controller_services

Units

Number of services

Description

Total number of Swarm services

Labels

Not applicable

ucp_engine_node_health

Units

0.0 or 1.0

Description

Health status of the node, as determined by MKE

Labels

nodeName: node name, nodeAddr: node IP address

ucp_engine_pod_container_ready

Units

0.0 or 1.0

Description

Readiness of the container in a Kubernetes pod, as determined by its readiness probe

Labels

Pod

ucp_engine_pod_ready

Units

0.0 or 1.0

Description

Readiness of the Kubernetes pod, as determined by the readiness probes of its containers

Labels

Pod

See also

Kubernetes Pods

MKE component metrics

Available since MKE 3.7.0

In addition to the core metrics that MKE exposes, you can use Prometheus to scrape a variety of metrics associated with MKE middleware components.

Herein, Mirantis outlines the components that expose Prometheus metrics and offers detail on various key metrics. Note, however, that this information is not exhaustive; rather, it is a guideline to metrics that you may find especially useful in determining the overall health of your MKE deployment.

For specific key metrics, refer to the Usage information, which offers valuable insights on interpreting the data and using it to troubleshoot your MKE deployment.

Kube State Metrics

MKE deploys Kube State Metrics to expose metrics on the state of Kubernetes objects, such as Deployments, nodes, and Pods. These metrics are exposed in MKE on the ucp-kube-state-metrics service and can be scraped at ucp-kube-state-metrics.kube-system.svc.cluster.local:8080.

Note

Consult the documentation for Kube State Metrics for an extensive list of all the metrics exposed by Kube State Metrics.
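
To confirm that the endpoint responds from inside the cluster, you can run a one-off Pod that queries the service. This is a sketch: ksm-check is a hypothetical Pod name, the public curlimages/curl image is used for convenience, and the standard /metrics path is assumed:

kubectl run ksm-check -n kube-system --rm -i --restart=Never \
  --image=curlimages/curl -- \
  curl -s http://ucp-kube-state-metrics.kube-system.svc.cluster.local:8080/metrics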

Workqueue metrics for Kubernetes components

You can use workqueue metrics to learn how long it takes for various components to fulfill different actions and to check the level of work queue activity.

The metrics offered below are based on kube-controller-manager; however, the same metrics are available for other Kubernetes components.

Usage

Abnormal workqueue metrics can be symptomatic of issues in the specific component. For example, an increase in workqueue_depth for the Kubernetes Controller Manager can indicate that the component is being oversaturated. In such cases, review the logs of the affected component.

workqueue_queue_duration_seconds_bucket

Description

Time that kube-controller-manager requires to fulfill the actions necessary to maintain the desired cluster status.

Example query

The following query checks the 99th percentile of the time that kube-controller-manager needs to process items in the workqueue:

histogram_quantile(0.99,sum(rate(workqueue_queue_duration_seconds_bucket{job="kube_controller_manager_nodes"}[5m]))
by (instance, name, le))
workqueue_adds_total

Description

Measures additions to the workqueue. A high value can indicate issues with the component.

Example query

The following query checks the rate at which items are added to the workqueue:

sum(rate(workqueue_adds_total{job="kube_controller_manager_nodes"}[5m]))
by (instance, name)
workqueue_depth

Description

Relates to the size of the workqueue. The larger the workqueue, the more material there is to process. A growing trend in the size of the workqueue can be indicative of issues in the cluster.

Example query

sum(rate(workqueue_depth{job="kube_controller_manager_nodes"}[5m]))
by(instance, name)
Kubelet metrics

The kubelet agent runs on every node in an MKE cluster. Once you have set up the MKE client bundle, you can view the available kubelet metrics for each node using the commands detailed below:

  • Obtain the name of the first available node in your MKE cluster:

    NODE_NAME=$(kubectl get node | sed -n '2 p' | awk '{print $1}')
    
  • Ping the kubelet metrics endpoint on the chosen node:

    kubectl get --raw /api/v1/nodes/${NODE_NAME}/proxy/metrics
    

The following are a number of key kubelet metrics:

kube_node_status_condition

Description

Reflects the total number of kubelet instances, which should correlate with the number of nodes in the cluster.

Example query

sum(kube_node_status_condition{condition="Ready", status= "true"})

Usage

If the number of kubelet instances decreases unexpectedly, review the nodes for connectivity issues.

kubelet_running_pods

Description

Indicates the total number of running Pods, which you can use to verify whether the number of Pods is in the expected range for your cluster.

Usage

If the number of Pods is unexpected on a node, review your Node Affinity or Node Selector rules to verify the scheduling of Pods for the appropriate nodes.

kubelet_running_containers

Description

Indicates the number of containers per node. You can query for a specific container state (running, created, exited). A high number of exited containers on a node can indicate issues on that node.

Example query

kubelet_running_containers{container_state="created"}

Usage

If the number of containers is unexpected on a node, check your Node Affinity or Node Selector rules to verify the scheduling of Pods for the appropriate nodes.

kubelet_runtime_operations_total

Description

Provides the total count of runtime operations, organized by type.

Example query

kubelet_runtime_operations_total{operation_type="create_container"}

Usage

An increase in runtime operations duration and/or runtime operations errors can indicate problems with the container runtime on the node.

kubelet_runtime_operations_errors_total

Description

Displays the number of errors in runtime operations. Monitor this metric to learn of issues on a node.

Usage

An increase in runtime operations duration and/or runtime operations errors can indicate problems with the container runtime on the node.

kubelet_runtime_operations_duration_seconds_bucket

Description

Reflects the time required for each runtime operation.

Example query

The following query checks the 99th percentile for time taken for various runtime operations.

histogram_quantile(0.99,
sum(rate(kubelet_runtime_operations_duration_seconds_bucket{instance=~".*"}[5m]))
by (instance, operation_type, le))

Usage

An increase in runtime operations duration and/or runtime operations errors can indicate problems with the container runtime on the node.

Kube Proxy

Kube Proxy runs on each node in an MKE cluster. Once you have set up the MKE client bundle, you can view the available Kube Proxy metrics for each node in an MKE cluster using the commands detailed below:

Note

The Kube Proxy metrics are only available when Kube Proxy is enabled in the MKE configuration and is running in either ipvs or iptables mode.

  • Obtain the name of the first available node in your MKE cluster:

    NODE_NAME=$(kubectl get node | sed -n '2 p' | awk '{print $1}')
    
  • Ping the Kube Proxy metrics endpoint on the chosen node:

    kubectl get --raw /api/v1/nodes/${NODE_NAME}:10249/proxy/metrics
    

    Note

    Specify port 10249, as this is the port on which Kube Proxy metrics are exposed.

The following are a number of key Kube Proxy metrics:

kube_proxy_nodes

Description

Reflects the total number of Kube Proxy nodes, which should correlate with the number of nodes in the cluster.

Example query

sum(up{job="kube-proxy-nodes"})

Usage

If the number of kube-proxy instances decreases unexpectedly, check the nodes for connectivity issues.

rest_client_request_duration_seconds_bucket

Description

Reflects the latency of client requests, in seconds. Such information can be useful in determining whether your cluster is experiencing performance degradation.

Example query

The following query illustrates the latency for all POST requests.

rest_client_request_duration_seconds_bucket{verb="POST"}

Usage

Review Kube Proxy logs on affected nodes to uncover any potential errors or timeouts.

kubeproxy_sync_proxy_rules_duration_seconds_bucket

Description

Displays the latency, in seconds, of synchronizing the Kube Proxy network rules, which are regularly synchronized between nodes. A consistently increasing measurement can indicate that Kube Proxy is falling out of sync across the nodes.

rest_client_requests_total

Description

Monitors the HTTP response codes for all requests that Kube Proxy makes to the API server. An increase in 5xx response codes can indicate issues with Kube Proxy.

Example query

The following query presents the number of 5xx response codes from Kube Proxy.

rest_client_requests_total{job="kube-proxy-nodes",code=~"5.."}

Usage

Review Kube Proxy logs on affected nodes to obtain details of the error responses.

Kube Controller Manager

Kube Controller Manager is a collection of different Kubernetes controllers whose primary task is to monitor changes in the state of various Kubernetes objects. It runs on all manager nodes in an MKE cluster.

Key Kube Controller Manager metrics are detailed as follows:

rest_client_request_duration_seconds_bucket

Description

Reflects the latency of calls to the API server, in seconds. Such information can be useful in determining whether your cluster is experiencing slower cluster performance.

Example query

The following query displays the 99th percentile latencies on requests to the API server.

histogram_quantile(0.99,
sum(rate(rest_client_request_duration_seconds_bucket{job="kube_controller_manager_nodes"}[5m]))
by (url, le))

Usage

Review the Kube Controller Manager logs on affected nodes to determine whether the metrics are abnormal.

rest_client_requests_total

Description

Presents the total number of HTTP requests that Kube Controller Manager makes to the API server, segmented by HTTP response code. A sudden increase in requests or an increase in requests with error response codes can indicate issues with the cluster.

Example query

The following query displays the rate of successful HTTP requests (those offering 2xx response codes).

sum(rate(rest_client_requests_total{job="kube_controller_manager_nodes"
,code=~"2.."}[5m]))

Usage

Review the Kube Controller Manager logs on affected nodes to determine whether the metrics are abnormal.

process_cpu_seconds_total

Description

Measures the total CPU time spent by a Kube Controller Manager instance.

Example query

rate(process_cpu_seconds_total{job="kube_controller_manager_nodes"}[5m])
process_resident_memory_bytes

Description

Measures the amount of resident memory used by a Kube Controller Manager instance.

Example query

rate(process_resident_memory_bytes{job="kube_controller_manager_nodes"}[5m])
Kube Apiserver

The Kube API server is the core of the Kubernetes control plane. It provides a means for obtaining information on Kubernetes objects and is also used to modify the state of API objects. MKE runs an instance of the Kube API server on each manager node.

The following are a number of key Kube Apiserver metrics:

apiserver_request_duration_seconds_bucket

Description

Measures latency for each request to the Kube API server.

Example query

The following query shows how latency is distributed across different HTTP verbs.

histogram_quantile(0.99,
sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers"}[5m]))
by (verb, le))
apiserver_request_total

Description

Measures the total traffic to the API server, the resource being accessed, and whether the request is successful.

Example query

The following query measures the rate of requests that return 2xx HTTP response codes. You can modify the query to measure the rate of error requests.

sum(rate(apiserver_request_total{job="kubernetes-apiservers",code=~"2.."}[5m]))
Calico

Calico is the default networking plugin for MKE. Specifically, MKE gathers metrics from both the Felix and Kube-Controllers Calico components.

Refer to the official Calico documentation on Prometheus statistics for detailed information on Felix and kube controllers metrics.

RethinkDB

MKE deploys RethinkDB Exporter on all manager nodes, to allow metrics scraping from RethinkDB. The RethinkDB Exporter exports most of the statistics from the RethinkDB stats table.

You can monitor the read and write throughput for each RethinkDB replica by reviewing the following metrics:

table_docs_per_second

Description

Current number of document reads and writes per second from the table.

cluster_docs_per_second

Description

Current number of document reads and writes per second from the cluster.

server_docs_per_second

Description

Current number of document reads and writes per second from the server.

These metrics are organized into read/write categories and by replica. For example, to view all the table read metrics on a specific node you can run the following query:

table_docs_per_second{operation="read", instance="instance_name"}
NodeLocalDNS

MKE deploys NodeLocalDNS on every node, with the Prometheus plugin enabled. You can scrape NodeLocalDNS metrics on port 9253, which provides regular CoreDNS metrics that include the standard RED (Rate, Errors, Duration) metrics:

  • queries

  • durations

  • error counts

The metrics path is fixed to /metrics.

Metric

Description

coredns_build_info

CoreDNS build information.

coredns_cache_entries

Number of entries in the cache.

coredns_cache_size

Cache size.

coredns_cache_hits_total

Counter of cache hits by cache type.

coredns_cache_misses_total

Counter of cache misses.

coredns_cache_requests_total

Total number of DNS resolution requests in different dimensions.

coredns_dns_request_duration_seconds_bucket

Histogram of DNS request duration (bucket).

coredns_dns_request_duration_seconds_count

Histogram of DNS request duration (count).

coredns_dns_request_duration_seconds_sum

Histogram of DNS request duration (sum).

coredns_dns_request_size_bytes_bucket

Histogram of the size of DNS request (bucket).

coredns_dns_request_size_bytes_count

Histogram of the size of DNS request (count).

coredns_dns_request_size_bytes_sum

Histogram of the size of DNS request (sum).

coredns_dns_requests_total

Number of DNS requests.

coredns_dns_response_size_bytes_bucket

Histogram of the size of DNS response (bucket).

coredns_dns_response_size_bytes_count

Histogram of the size of DNS response (count).

coredns_dns_response_size_bytes_sum

Histogram of the size of DNS response (sum).

coredns_dns_responses_total

Number of DNS responses, by response code.

coredns_forward_conn_cache_hits_total

Number of cache hits for each protocol and data flow.

coredns_forward_conn_cache_misses_total

Number of cache misses for each protocol and data flow.

coredns_forward_healthcheck_broken_total

Unhealthy upstream count.

coredns_forward_healthcheck_failures_total

Count of failed health checks per upstream.

coredns_forward_max_concurrent_rejects_total

Number of requests rejected due to excessive concurrent requests.

coredns_forward_request_duration_seconds_bucket

Histogram of forward request duration (bucket).

coredns_forward_request_duration_seconds_count

Histogram of forward request duration (count).

coredns_forward_request_duration_seconds_sum

Histogram of forward request duration (sum).

coredns_forward_requests_total

Number of requests for each data flow.

coredns_forward_responses_total

Number of responses to each data flow.

coredns_health_request_duration_seconds_bucket

Histogram of health request duration (bucket).

coredns_health_request_duration_seconds_count

Histogram of health request duration (count).

coredns_health_request_duration_seconds_sum

Histogram of health request duration (sum).

coredns_health_request_failures_total

Number of health request failures.

coredns_hosts_reload_timestamp_seconds

Timestamp of the last reload of the host file.

coredns_kubernetes_dns_programming_duration_seconds_bucket

Histogram of DNS programming duration (bucket).

coredns_kubernetes_dns_programming_duration_seconds_count

Histogram of DNS programming duration (count).

coredns_kubernetes_dns_programming_duration_seconds_sum

Histogram of DNS programming duration (sum).

coredns_local_localhost_requests_total

Number of localhost requests.

coredns_nodecache_setup_errors_total

Number of nodecache setup errors.

coredns_dns_response_rcode_count_total

Number of responses for each Zone and Rcode.

coredns_dns_request_count_total

Number of DNS requests.

coredns_dns_request_do_count_total

Number of requests with the DNSSEC OK (DO) bit set.

coredns_dns_do_requests_total

Number of requests with the DO bit set.

coredns_dns_request_type_count_total

Number of requests for each Zone and Type.

coredns_panics_total

Total number of panics.

coredns_plugin_enabled

Whether a plugin is enabled.

coredns_reload_failed_total

Number of last reload failures.

MKE cAdvisor metrics

Once you have enabled cAdvisor and generated an auth token, you can issue the following command to access the cAdvisor metrics:

curl -sk -g -H "Authorization: Bearer $AUTHTOKEN" \
"https://<mke-url>/metricsservice/query?query=<mke-specific-metric>[<time-duration>]"

The Prometheus container metrics exposed by cAdvisor are presented below:

cadvisor_version_info

Units

N/A

Description

A metric with a constant 1 value that is labeled by kernel version, OS version, Docker version, cAdvisor version and cAdvisor revision.

Labels

cadvisorRevision, cadvisorVersion, instance, job, kernelVersion, osVersion

container_blkio_device_usage_total

Units

bytes

Description

The Block I/O (blkio) device bytes usage.

Labels

container, device, id, image, instance, job, major, minor, name, namespace, operation, pod

container_cpu_system_seconds_total

Value

seconds

Description

Cumulative system CPU time consumed.

Labels

container, id, image, instance, job, name, namespace, pod

container_cpu_usage_seconds_total

Value

seconds

Description

Cumulative CPU time consumed.

Labels

container, id, image, instance, job, name, namespace, pod

container_cpu_user_seconds_total

Value

seconds

Description

Cumulative user CPU time consumed.

Labels

container, id, image, instance, job, name, namespace, pod

container_fs_reads_bytes_total

Units

bytes

Description

Cumulative count of bytes read.

Labels

container, device, id, image, instance, job, name, namespace, pod

container_fs_reads_total

Value

integer

Description

Cumulative count of reads completed.

Labels

container, device, id, image, instance, job, name, namespace, pod

container_fs_writes_bytes_total

Units

bytes

Description

Cumulative count of bytes written.

Labels

container, device, id, image, instance, job, name, namespace, pod

container_fs_writes_total

Value

integer

Description

Cumulative count of writes completed.

Labels

container, device, id, image, instance, job, name, namespace, pod

container_last_seen

Units

timestamp

Description

Last time a container was seen by the exporter.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_cache

Units

bytes

Description

Total page cache memory.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_failcnt

Value

integer

Description

Number of times memory usage hit the limit.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_failures_total

Value

integer

Description

Cumulative count of memory allocation failures.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_mapped_file

Units

bytes

Description

Size of memory mapped files.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_max_usage_bytes

Units

bytes

Description

Maximum memory usage recorded.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_rss

Units

bytes

Description

Size of RSS.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_swap

Units

bytes

Description

Container swap usage.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_usage_bytes

Units

bytes

Description

Current memory usage, including all memory regardless of when it was accessed.

Labels

container, id, image, instance, job, name, namespace, pod

container_memory_working_set_bytes

Units

bytes

Description

Current working set.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_receive_bytes_total

Units

bytes

Description

Cumulative count of bytes received.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_receive_errors_total

Value

integer

Description

Cumulative count of errors encountered while receiving.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_receive_packets_dropped_total

Value

integer

Description

Cumulative count of packets dropped while receiving.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_receive_packets_total

Value

integer

Description

Cumulative count of packets received.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_transmit_bytes_total

Units

bytes

Description

Cumulative count of bytes transmitted.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_transmit_errors_total

Value

integer

Description

Cumulative count of errors encountered while transmitting.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_transmit_packets_dropped_total

Value

integer

Description

Cumulative count of packets dropped while transmitting.

Labels

container, id, image, instance, job, name, namespace, pod

container_network_transmit_packets_total

Value

integer

Description

Cumulative count of packets transmitted.

Labels

container, id, image, instance, job, name, namespace, pod

container_scrape_error

Units

N/A

Description

1 if an error occurred while container metrics were being obtained, otherwise 0.

Labels

instance, job

container_spec_cpu_period

Units

N/A

Description

CPU period of the container.

Labels

container, id, image, instance, job, name, namespace, pod

container_spec_cpu_shares

Units

N/A

Description

CPU share of the container.

Labels

container, id, image, instance, job, name, namespace, pod

container_spec_memory_limit_bytes

Units

bytes

Description

Memory limit for the container.

Labels

container, id, image, instance, job, name, namespace, pod

container_spec_memory_reservation_limit_bytes

Units

bytes

Description

Memory reservation limit for the container.

Labels

container, id, image, instance, job, name, namespace, pod

container_spec_memory_swap_limit_bytes

Units

bytes

Description

Memory swap limit for the container.

Labels

container, id, image, instance, job, name, namespace, pod

container_start_time_seconds

Value

seconds

Description

Start time of the container, in seconds since the Unix epoch.

Labels

container, id, image, instance, job, name, namespace, pod

machine_cpu_cores

Value

integer

Description

Number of logical CPU cores.

Labels

boot_id, instance, job, machine_id, system_uuid

machine_cpu_physical_cores

Value

integer

Description

Number of physical CPU cores.

Labels

boot_id, instance, job, machine_id, system_uuid

machine_cpu_sockets

Value

integer

Description

Number of CPU sockets.

Labels

boot_id, instance, job, machine_id, system_uuid

machine_memory_bytes

Units

bytes

Description

Amount of memory installed on the machine.

Labels

boot_id, instance, job, machine_id, system_uuid

machine_nvm_avg_power_budget_watts

Units

watts

Description

NVM power budget.

Labels

boot_id, instance, job, machine_id, system_uuid

machine_nvm_capacity

Units

bytes

Description

NVM capacity value, labeled by NVM mode (memory mode or app direct mode).

Labels

boot_id, instance, job, machine_id, system_uuid

machine_scrape_error

Value

integer

Description

1 if an error occurred while machine metrics were being obtained, otherwise 0.

Labels

instance, job

Deploy Prometheus on worker nodes

MKE deploys Prometheus by default on the manager nodes to provide a built-in metrics backend. For cluster sizes over 100 nodes, or if you need to scrape metrics from Prometheus instances, Mirantis recommends that you deploy Prometheus on dedicated worker nodes in the cluster.

To deploy Prometheus on worker nodes:

  1. Source an admin bundle.

  2. Verify that ucp-metrics pods are running on all managers:

    $ kubectl -n kube-system get pods -l k8s-app=ucp-metrics -o wide
    
    NAME               READY  STATUS   RESTARTS  AGE  IP            NODE
    ucp-metrics-hvkr7  3/3    Running  0         4h   192.168.80.66 3a724a-0
    
  3. Add a Kubernetes node label to one or more workers. For example, add a label with the key ucp-metrics and the value "" to a node named 3a724a-1:

    $ kubectl label node 3a724a-1 ucp-metrics=
    
    node "test-3a724a-1" labeled
    

    SELinux Prometheus Deployment

    If you use SELinux, label your ucp-node-certs directories properly on the worker nodes before you move the ucp-metrics workload to them. To run ucp-metrics on a worker node, update the ucp-node-certs label by running:

    sudo chcon -R system_u:object_r:container_file_t:s0 /var/lib/docker/volumes/ucp-node-certs/_data

  4. Patch the ucp-metrics DaemonSet’s nodeSelector with the same key and value in use for the node label. This example shows the key ucp-metrics and the value "".

    $ kubectl -n kube-system patch daemonset ucp-metrics --type json -p \
      '[{"op": "replace", "path": "/spec/template/spec/nodeSelector", "value": {"ucp-metrics": ""}}]'

    daemonset "ucp-metrics" patched
    
  5. Confirm that ucp-metrics pods are running only on the labeled workers.

    $ kubectl -n kube-system get pods -l k8s-app=ucp-metrics -o wide
    
    NAME               READY  STATUS       RESTARTS  AGE IP           NODE
    ucp-metrics-88lzx  3/3    Running      0         12s 192.168.83.1 3a724a-1
    ucp-metrics-hvkr7  3/3    Terminating  0         4h 192.168.80.66 3a724a-0
    

See also

Kubernetes

Configure external Prometheus to scrape metrics from MKE

To configure your external Prometheus server to scrape metrics from Prometheus in MKE:

  1. Source an admin bundle.

  2. Create a Kubernetes secret that contains your bundle TLS material.

    (cd $DOCKER_CERT_PATH && kubectl create secret generic prometheus --from-file=ca.pem --from-file=cert.pem --from-file=key.pem)
    
  3. Create a Prometheus deployment and ClusterIP service using YAML.

    Note

    On bare metal clusters, enable MetalLB so that you can create a service of the load balancer type, and then perform the following steps:

    1. Replace ClusterIP with LoadBalancer in the service YAML.

    2. Access the service through the load balancer.

    3. If you run Prometheus external to MKE, change the domain for the inventory container in the Prometheus deployment from ucp-controller.kube-system.svc.cluster.local to an external domain, to access MKE from the Prometheus node.

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: prometheus
    data:
      prometheus.yaml: |
        global:
          scrape_interval: 10s
        scrape_configs:
        - job_name: 'ucp'
          tls_config:
            ca_file: /bundle/ca.pem
            cert_file: /bundle/cert.pem
            key_file: /bundle/key.pem
            server_name: proxy.local
          scheme: https
          file_sd_configs:
          - files:
            - /inventory/inventory.json
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: prometheus
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: prometheus
      template:
        metadata:
          labels:
            app: prometheus
        spec:
          nodeSelector:
            kubernetes.io/os: linux
          containers:
          - name: inventory
            image: alpine
            command: ["sh", "-c"]
            args:
            - apk add --no-cache curl &&
              while :; do
                curl -Ss --cacert /bundle/ca.pem --cert /bundle/cert.pem --key /bundle/key.pem --output /inventory/inventory.json https://ucp-controller.kube-system.svc.cluster.local/metricsdiscovery;
                sleep 15;
              done
            volumeMounts:
            - name: bundle
              mountPath: /bundle
            - name: inventory
              mountPath: /inventory
          - name: prometheus
            image: prom/prometheus
            command: ["/bin/prometheus"]
            args:
            - --config.file=/config/prometheus.yaml
            - --storage.tsdb.path=/prometheus
            - --web.console.libraries=/etc/prometheus/console_libraries
            - --web.console.templates=/etc/prometheus/consoles
            volumeMounts:
            - name: bundle
              mountPath: /bundle
            - name: config
              mountPath: /config
            - name: inventory
              mountPath: /inventory
          volumes:
          - name: bundle
            secret:
              secretName: prometheus
          - name: config
            configMap:
              name: prometheus
          - name: inventory
            emptyDir:
              medium: Memory
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: prometheus
    spec:
      ports:
      - port: 9090
        targetPort: 9090
      selector:
        app: prometheus
      sessionAffinity: ClientIP
    EOF
    
  4. Determine the service ClusterIP:

    $ kubectl get service prometheus
    
    NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
    prometheus   ClusterIP   10.96.254.107   <none>        9090/TCP   1h
    
  5. Forward port 9090 on the local host to the ClusterIP. The tunnel you create does not need to be kept alive as its only purpose is to expose the Prometheus UI.

    ssh -L 9090:10.96.254.107:9090 ANY_NODE
    
  6. Visit http://127.0.0.1:9090 to explore the MKE metrics that Prometheus is collecting.

See also

Kubernetes

Set up Grafana with MKE Prometheus

Important

The information offered herein on how to set up a Grafana instance connected to MKE Prometheus is derived from the official Deploy Grafana on Kubernetes documentation and modified to work with MKE. As it deploys Grafana with default credentials, Mirantis strongly recommends that you adjust the configuration detail to meet your specific needs prior to deploying Grafana with MKE in a production environment.

  1. Source an MKE admin bundle.

  2. Create the monitoring namespace on which you will deploy Grafana:

    kubectl create namespace monitoring
    
  3. Obtain the UCP cluster ID:

    CLUSTER_ID=$(docker info --format '{{json .Swarm.Cluster.ID}}')
    
  4. Apply the following YAML file to deploy Grafana in the monitoring namespace and to automatically configure MKE Prometheus as a data source:

    kubectl apply -f - <<EOF
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: grafana
      name: grafana
      namespace: monitoring
    spec:
      selector:
        matchLabels:
          app: grafana
      template:
        metadata:
          labels:
            app: grafana
        spec:
          securityContext:
            runAsUser: 0
          containers:
            - name: grafana
              image: grafana/grafana:9.1.0-ubuntu
              imagePullPolicy: IfNotPresent
              ports:
                - containerPort: 3000
                  name: http-grafana
                  protocol: TCP
              readinessProbe:
                failureThreshold: 3
                httpGet:
                  path: /robots.txt
                  port: 3000
                  scheme: HTTP
                initialDelaySeconds: 10
                periodSeconds: 30
                successThreshold: 1
                timeoutSeconds: 2
              livenessProbe:
                failureThreshold: 3
                initialDelaySeconds: 30
                periodSeconds: 10
                successThreshold: 1
                tcpSocket:
                  port: 3000
                timeoutSeconds: 1
              resources:
                requests:
                  cpu: 250m
                  memory: 750Mi
              volumeMounts:
                - mountPath: /etc/grafana/
                  name: grafana-config-volume
                - mountPath: /etc/ssl
                  name: ucp-node-certs
          volumes:
            - name: grafana-config-volume
              configMap:
                name: grafana-config
                items:
                  - key: grafana.ini
                    path: grafana.ini
                  - key: dashboard.json
                    path: dashboard.json
                  - key: datasource.yml
                    path: provisioning/datasources/datasource.yml
            - name: ucp-node-certs
              hostPath:
                path: /var/lib/docker/volumes/ucp-node-certs/_data
          nodeSelector:
            node-role.kubernetes.io/master: ""
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: grafana
      namespace: monitoring
    spec:
      ports:
        - port: 3000
          protocol: TCP
          targetPort: http-grafana
      selector:
        app: grafana
      sessionAffinity: None
      type: ClusterIP
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: grafana-config
      namespace: monitoring
      labels:
        grafana_datasource: '1'
    data:
      grafana.ini: |
      dashboard.json: |
      datasource.yml: |-
        apiVersion: 1
        datasources:
        - name: mke-prometheus
          type: prometheus
          access: proxy
          orgId: 1
          url: https://ucp-metrics.kube-system.svc.cluster.local:443
          jsonData:
            tlsAuth: true
            tlsAuthWithCACert: false
            serverName: $CLUSTER_ID
          secureJsonData:
            tlsClientCert: "\$__file{/etc/ssl/cert.pem}"
            tlsClientKey: "\$__file{/etc/ssl/key.pem}"
    ---
    EOF
    
  5. Use port forwarding to access the Grafana UI. Be aware that this may require that you install socat on your manager nodes.

    kubectl port-forward service/grafana 3000:3000 -n monitoring
    

You can now navigate to the Grafana UI at http://localhost:3000/, which has the MKE Prometheus data source installed. Initially, log in with admin as both the user name and the password, and change your credentials immediately after you successfully log in.

See also

Kubernetes

Configure native Kubernetes role-based access control

MKE uses native Kubernetes RBAC, which is active by default for Kubernetes clusters. The YAML files of many ecosystem applications and integrations use Kubernetes RBAC to access service accounts. Also, organizations looking to run MKE both on-premises and in hosted cloud services want to run Kubernetes applications in both environments without having to manually change RBAC in their YAML file.

Note

Kubernetes and Swarm roles have separate views. Using the MKE web UI, you can view all the roles for a particular cluster:

  1. Click Access Control in the navigation menu at the left.

  2. Click Roles.

  3. Select the Kubernetes tab or the Swarm tab to view the specific roles for each.

Create a Kubernetes role

You can create Kubernetes roles either through the CLI, using the Kubernetes kubectl tool, or through the MKE web UI.

To create a Kubernetes role using the MKE web UI:

  1. Log in to the MKE web UI.

  2. In the navigation menu at the left, click Access Control to display the available options.

  3. Click Roles.

  4. At the top of the details pane, click the Kubernetes tab.

  5. Click Create to open the Create Kubernetes Object page.

  6. Click Namespace to select a namespace for the role from one of the available options.

  7. Provide the YAML file for the role. To do this, either enter it in the Object YAML editor, or upload an existing .yml file using the Click to upload a .yml file selection link at the right.

  8. Click Create to complete role creation.
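The role YAML that you provide in step 7 might look like the following minimal sketch. The role name, namespace, and rules are illustrative placeholders only, not MKE defaults.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader        # example name
  namespace: default      # use the namespace you selected in step 6
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]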


Create a Kubernetes role grant

Kubernetes provides two types of role grants:

  • ClusterRoleBinding (applies to all namespaces)

  • RoleBinding (applies to a specific namespace)

To create a grant for a Kubernetes role in the MKE web UI:

  1. Log in to the MKE web UI.

  2. In the navigation menu at the left, click Access Control to display the available options.

  3. Click the Grants option.

  4. At the top of the details pane, click the Kubernetes tab. All existing grants to Kubernetes roles are present in the details pane.

  5. Click Create Role Binding to open the Create Role Binding page.

  6. Select the subject type at the top of the 1. Subject section (Users, Organizations, or Service Account).

  7. Create a role binding for the selected subject type:

    • Users: Select a type from the User drop-down list.

    • Organizations: Select a type from the Organization drop-down list. Optionally, you can also select a team using the Team (optional) drop-down list, if any have been established.

    • Service Account: Select a NAMESPACE from the Namespace drop-down list, then a type from the Service Account drop-down list.

  8. Click Next to activate the 2. Resource Set section.

  9. Select a resource set for the subject.

    By default, the default namespace is indicated. To use a different namespace, select the Select Namespace button associated with the desired namespace.

    For ClusterRoleBinding, slide the Apply Role Binding to all namespace (Cluster Role Binding) selector to the right.

  10. Click Next to activate the 3. Role section.

  11. Select the role type.

    • Role

    • Cluster Role

    Note

    Cluster Role type is the only role type available if you enabled Apply Role Binding to all namespace (Cluster Role Binding) in the 2. Resource Set section.

  12. Select the role from the drop-down list.

  13. Click Create to complete grant creation.
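As an alternative to the web UI workflow above, you can create the equivalent binding with kubectl from an admin client bundle. The following is a minimal sketch; the binding name, role, user, and namespace are placeholders.

kubectl create rolebinding jane-pod-reader \
  --role=pod-reader \
  --user=jane \
  --namespace=default

To bind a cluster role across all namespaces, use kubectl create clusterrolebinding with --clusterrole instead.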

See also

Kubernetes

MKE audit logging

Audit logs are a chronological record of security-relevant activities by individual users, administrators, or software components that have had an effect on an MKE system. They focus on external user/agent actions and security, rather than attempting to understand state or events of the system itself.

Audit logs capture all HTTP actions (GET, PUT, POST, PATCH, DELETE) that are invoked against MKE API, Swarm API, and Kubernetes API endpoints, with the exception of those on the ignored list, and are sent to Mirantis Container Runtime via stdout.

The benefits that audit logs provide include:

Historical troubleshooting

You can use audit logs to determine a sequence of past events that can help explain why an issue occurred.

Security analysis and auditing

A full record of all user interactions with the container infrastructure can provide your security team with the visibility necessary to root out questionable or unauthorized access attempts.

Chargeback

Use audit log data about resource use to generate chargeback information.

Alerting

By watching the event stream, or the notifications that events create, you can build alerting features on top of event tools that generate alerts for ops teams (PagerDuty, OpsGenie, Slack, or custom solutions).

Logging levels

MKE provides three levels of audit logging to administrators:

None

Audit logging is disabled.

Metadata

Includes:
  • Method and API endpoint for the request

  • MKE user who made the request

  • Response status (success or failure)

  • Timestamp of the call

  • Object ID of any created or updated resource (for create or update API calls). Names of created or updated resources are not included.

  • License key

  • Remote address

Request

Includes all fields from the Metadata level, as well as the request payload.

Once you enable MKE audit logging, the audit logs are collected in the container logs of the ucp-controller container on each MKE manager node.

Note

Be sure to configure a logging driver with log rotation set, as audit logging can generate a large amount of data.

Enable MKE audit logging

Note

Enabling auditing in MKE does not automatically enable auditing of Kubernetes objects. To do so, you must set the kube_api_server_auditing parameter in the MKE configuration file to true.

Once you have set the kube_api_server_auditing parameter to true, the following default auditing values are configured on the Kubernetes API server:

  • --audit-log-maxage: 30

  • --audit-log-maxbackup: 10

  • --audit-log-maxsize: 10

For information on how to enable and configure the Kubernetes API server audit values, refer to the cluster_config table detail in the MKE configuration file.
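For reference, a minimal sketch of this setting, assuming it is placed in the cluster_config section of the MKE configuration file referenced above:

[cluster_config]
  # Assumption: enables auditing on the Kubernetes API server, after which the
  # default --audit-log-* values listed above apply.
  kube_api_server_auditing = true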

You can enable MKE audit logging using the MKE web user interface, the MKE API, and the MKE configuration file.

Enable MKE audit logging using the web UI
  1. Log in to the MKE web user interface.

  2. Click admin to open the navigation menu at the left.

  3. Click Admin Settings.

  4. Click Logs & Audit Logs to open the Logs & Audit Logs details pane.

  5. In the Configure Audit Log Level section, select the relevant logging level.

  6. Click Save.

Enable MKE audit logging using the API
  1. Download the MKE client bundle from the command line, as described in Download the client bundle.

  2. Retrieve the JSON file for current audit log configuration:

    export DOCKER_CERT_PATH=~/ucp-bundle-dir/
    curl --cert ${DOCKER_CERT_PATH}/cert.pem --key ${DOCKER_CERT_PATH}/key.pem --cacert ${DOCKER_CERT_PATH}/ca.pem -k -X GET https://ucp-domain/api/ucp/config/logging > auditlog.json
    
  3. In auditlog.json, set the auditLevel field to metadata or request:

    {
        "logLevel": "INFO",
        "auditLevel": "metadata",
        "supportDumpIncludeAuditLogs": false
    }
    
  4. Send the JSON request for the audit logging configuration with the same API path, but using the PUT method:

    curl --cert ${DOCKER_CERT_PATH}/cert.pem --key ${DOCKER_CERT_PATH}/key.pem \
      --cacert ${DOCKER_CERT_PATH}/ca.pem -k -H "Content-Type: application/json" \
      -X PUT --data "$(cat auditlog.json)" https://ucp-domain/api/ucp/config/logging
    
Enable MKE audit logging using the configuration file

You can enable MKE audit logging using the MKE configuration file before or after MKE installation.

The section of the MKE configuration file that controls MKE audit logging is [audit_log_configuration]:

[audit_log_configuration]
  level = "metadata"
  support_dump_include_audit_logs = false

The level setting supports the following values:

  • ""

  • "metadata"

  • "request"

Caution

The support_dump_include_audit_logs flag specifies whether user identification information from the ucp-controller container logs is included in the support bundle. To prevent this information from being sent with the support bundle, verify that support_dump_include_audit_logs is set to false. When disabled, the support bundle collection tool filters out any lines from the ucp-controller container logs that contain the substring auditID.

Access audit logs using the docker CLI

The audit logs are exposed through the ucp-controller logs. You can access these logs locally through the Docker CLI.

Note

You can also access MKE audit logs using an external container logging solution, such as ELK.

To access audit logs using the Docker CLI:

  1. Source an MKE client bundle.

  2. Run docker logs to obtain audit logs.

    The following example uses the --tail option to show only the most recent log entry.

    $ docker logs ucp-controller --tail 1
    
    {"audit":{"auditID":"f8ce4684-cb55-4c88-652c-d2ebd2e9365e","kind":"docker-swarm","level":"metadata","metadata":{"creationTimestamp":null},"requestReceivedTimestamp":"2019-01-30T17:21:45.316157Z","requestURI":"/metricsservice/query?query=(%20(sum%20by%20(instance)%20(ucp_engine_container_memory_usage_bytes%7Bmanager%3D%22true%22%7D))%20%2F%20(sum%20by%20(instance)%20(ucp_engine_memory_total_bytes%7Bmanager%3D%22true%22%7D))%20)%20*%20100\u0026time=2019-01-30T17%3A21%3A45.286Z","sourceIPs":["172.31.45.250:48516"],"stage":"RequestReceived","stageTimestamp":null,"timestamp":null,"user":{"extra":{"licenseKey":["FHy6u1SSg_U_Fbo24yYUmtbH-ixRlwrpEQpdO_ntmkoz"],"username":["admin"]},"uid":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a","username":"4ec3c2fc-312b-4e66-bb4f-b64b8f0ee42a"},"verb":"GET"},"level":"info","msg":"audit","time":"2019-01-30T17:21:45Z"}
    

    Sample audit log for a Kubernetes cluster:

    {"audit"; {
          "metadata": {...},
          "level": "Metadata",
          "timestamp": "2018-08-07T22:10:35Z",
          "auditID": "7559d301-fa6b-4ad6-901c-b587fab75277",
          "stage": "RequestReceived",
          "requestURI": "/api/v1/namespaces/default/pods",
          "verb": "list",
          "user": {"username": "alice",...},
          "sourceIPs": ["127.0.0.1"],
          ...,
          "requestReceivedTimestamp": "2018-08-07T22:10:35.428850Z"}}
    

    Sample audit log for a Swarm cluster:

    {"audit"; {
          "metadata": {...},
          "level": "Metadata",
          "timestamp": "2018-08-07T22:10:35Z",
          "auditID": "7559d301-94e7-4ad6-901c-b587fab31512",
          "stage": "RequestReceived",
          "requestURI": "/v1.30/configs/create",
          "verb": "post",
          "user": {"username": "alice",...},
          "sourceIPs": ["127.0.0.1"],
          ...,
          "requestReceivedTimestamp": "2018-08-07T22:10:35.428850Z"}}
    
API endpoints logging constraints

For system security reasons, a number of MKE API endpoints are either ignored by audit logging or have their information redacted.

API endpoints ignored

The following API endpoints are ignored, as they are not considered security events and can create a large number of log entries:

  • /_ping

  • /ca

  • /auth

  • /trustedregistryca

  • /kubeauth

  • /metrics

  • /info

  • /version*

  • /debug

  • /openid_keys

  • /apidocs

  • /kubernetesdocs

  • /manage

API endpoints information redacted

For security purposes, information for the following API endpoints is redacted from the audit logs:

  • /secrets/create (POST)

  • /secrets/{id}/update (POST)

  • /swarm/join (POST)

  • /swarm/update (POST)

  • /auth/login (POST)

  • Kubernetes secrets create/update endpoints

See also

Kubernetes


Enable MKE telemetry

You can set MKE to automatically record and transmit data to Mirantis through an encrypted channel for monitoring and analysis purposes. The data collected provides the Mirantis Customer Success Organization with information that helps us to better understand the operational use of MKE by our customers. It also provides key feedback in the form of product usage statistics, which enable our product teams to enhance Mirantis products and services.

Specifically, with MKE you can send hourly usage reports, as well as information on API and UI usage.

Caution

To send the telemetry, verify that dockerd and the MKE application container can resolve api.segment.io and create a TCP (HTTPS) connection on port 443.
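As an illustrative check only (not an MKE-specific command), you can confirm from a manager node that the telemetry endpoint resolves and that port 443 is reachable:

# Resolve api.segment.io and attempt an HTTPS connection on port 443;
# verbose output showing a TLS handshake confirms basic reachability.
curl -sv --max-time 10 https://api.segment.io -o /dev/null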

To enable telemetry in MKE:

  1. Log in to the MKE web UI as an administrator.

  2. At the top of the navigation menu at the left, click the user name drop-down to display the available options.

  3. Click Admin Settings to display the available options.

  4. Click Usage to open the Usage Reporting screen.

  5. Toggle the Enable API and UI tracking slider to the right.

  6. (Optional) Enter a unique label to identify the cluster in the usage reporting.

  7. Click Save.

Enable and integrate SAML authentication

Security Assertion Markup Language (SAML) is an open standard for the exchange of authentication and authorization data between parties. It is commonly supported by enterprise authentication systems. SAML-based single sign-on (SSO) gives you access to MKE through a SAML 2.0-compliant identity provider.

MKE supports the Okta and ADFS identity providers.

To integrate SAML authentication into MKE:

  1. Configure the Identity Provider (IdP).

  2. In the left-side navigation panel, navigate to user name > Admin Settings > Authentication & Authorization.

  3. Create (Edit) Teams to link with the Group memberships. This updates team membership information when a user signs in with SAML.

Note

If LDAP integration is enabled, refer to Use LDAP in conjunction with SAML for information on using SAML in parallel with LDAP.

Configure SAML integration on identity provider

Identity providers require certain values to successfully integrate with MKE. As these values vary depending on the identity provider, consult your identity provider documentation for instructions on how to best provide the needed information.

Okta integration values

Okta integration requires the following values:

Value

Description

URL for single sign-on (SSO)

URL for MKE, qualified with /enzi/v0/saml/acs. For example, https://111.111.111.111/enzi/v0/saml/acs.

Service provider audience URI

URL for MKE, qualified with /enzi/v0/saml/metadata. For example, https://111.111.111.111/enzi/v0/saml/metadata.

NameID format

Select Unspecified.

Application user name

Email. For example, a custom ${f:substringBefore(user.email, "@")} specifies the user name portion of the email address.

Attribute Statements

  • Name: fullname
    Value: user.displayName

Group Attribute Statement

  • Name: member-of
    Filter: (user defined) for associating group membership.
    The group name is returned with the assertion.
  • Name: is-admin
    Filter: (user defined) for identifying whether the user is an admin.

Okta configuration

When two or more group names are expected to be returned with the assertion, use the regex filter. For example, use the value apple|orange to return the groups apple and orange.

ADFS integration values

To enable ADFS integration:

  1. Add a relying party trust.

  2. Obtain the service provider metadata URI.

    The service provider metadata URI value is the URL for MKE, qualified with /enzi/v0/saml/metadata. For example, https://111.111.111.111/enzi/v0/saml/metadata.

  3. Add claim rules.

    1. Convert values from AD to SAML

      • Display-name : Common Name

      • E-Mail-Addresses : E-Mail Address

      • SAM-Account-Name : Name ID

    2. Create a full name for MKE (custom rule):

      c:[Type == "http://schemas.xmlsoap.org/claims/CommonName"]
        => issue(Type = "fullname", Issuer = c.Issuer, OriginalIssuer = c.OriginalIssuer,
           Value = c.Value, ValueType = c.ValueType);
      
    3. Transform account name to Name ID:

      • Incoming type: Name ID

      • Incoming format: Unspecified

      • Outgoing claim type: Name ID

      • Outgoing format: Transient ID

    4. Pass admin value to allow admin access based on AD group. Send group membership as claim:

      • Users group: your admin group

      • Outgoing claim type: is-admin

      • Outgoing claim value: 1

    5. Configure group membership for more complex organizations, with multiple groups able to manage access.

      • Send LDAP attributes as claims

      • Attribute store: Active Directory

        • Add two rows with the following information:

          • LDAP attribute = email address; outgoing claim type: email address

          • LDAP attribute = Display-Name; outgoing claim type: common name

      • Mapping:

        • Token-Groups - Unqualified Names : member-of

Note

Once you enable SAML, Service Provider metadata is available at https://<SP Host>/enzi/v0/saml/metadata. The metadata link is also labeled as entityID.

Only POST binding is supported for the Assertion Consumer Service, which is located at https://<SP Host>/enzi/v0/saml/acs.

Configure SAML integration on MKE

SAML configuration requires that you know the metadata URL for your chosen identity provider, as well as the URL for the MKE host that contains the IP address or domain of your MKE installation.

To configure SAML integration on MKE:

  1. Log in to the MKE web UI.

  2. In the navigation menu at the left, click the user name drop-down to display the available options.

  3. Click Admin Settings to display the available options.

  4. Click Authentication & Authorization.

  5. In the Identity Provider section in the details pane, move the slider next to SAML to enable the SAML settings.

  6. In the SAML idP Server subsection, enter the URL for the identity provider metadata in the IdP Metadata URL field.

    Note

    If the metadata URL is publicly certified, you can continue with the default settings:

    • Skip TLS Verification unchecked

    • Root Certificates Bundle blank

    Mirantis recommends TLS verification in production environments. If the metadata URL cannot be certified by the default certificate authority store, you must provide the certificates from the identity provider in the Root Certificates Bundle field.

  7. In the SAML Service Provider subsection, in the MKE Host field, enter the URL that includes the IP address or domain of your MKE installation.

    The port number is optional. The current IP address or domain displays by default.

  8. (Optional) Customize the text of the sign-in button by entering the text for the button in the Customize Sign In Button Text field. By default, the button text is Sign in with SAML.

  9. Copy the SERVICE PROVIDER METADATA URL, the ASSERTION CONSUMER SERVICE (ACS) URL, and the SINGLE LOGOUT (SLO) URL to paste into the identity provider workflow.

  10. Click Save.

Note

  • To configure a service provider, enter the Identity Provider’s metadata URL to obtain its metadata. To access the URL, you may need to provide the CA certificate that can verify the remote server.

  • To link group membership with users, use the Edit or Create team dialog to associate a SAML group assertion with an MKE team and thereby synchronize user team membership when users log in.

SAML security considerations

From the MKE web UI you can download a client bundle with which you can access MKE using the CLI and the API.

A client bundle is a group of certificates that enable command-line access and API access to the software. It lets you authorize a remote Docker engine to access specific user accounts that are managed in MKE, absorbing all associated RBAC controls in the process. Once you obtain the client bundle, you can execute Docker Swarm commands from your remote machine to take effect on the remote cluster.

Previously-authorized client bundle users can still access MKE, regardless of the newly configured SAML access controls.

Mirantis recommends that you take the following steps to keep client bundle access in sync with the identity provider, and thus prevent previously authorized users from accessing MKE through their existing client bundles:

  1. Remove the user account from MKE that grants the client bundle access.

  2. If group membership in the identity provider changes, replicate the change in MKE.

  3. Continue using LDAP to sync group membership.

To download the client bundle:

  1. Log in to the MKE web UI.

  2. In the navigation menu at the left, click the user name drop-down to display the available options.

  3. Click your account name to display the available options.

  4. Click My Profile.

  5. Click the New Client Bundle drop-down in the details pane and select Generate Client Bundle.

  6. (Optional) Enter a name for the bundle into the Label field.

  7. Click Confirm to initiate the bundle download.

Set up SAML proxy

Available since MKE 3.7.0

You can enhance the security and flexibility of MKE by implementing a SAML proxy. With such a proxy, you can lock down your MKE deployment and still benefit from the use of SAML authentication. The proxy, which sits between MKE and Identity Providers (IdPs), forwards metadata requests between these two entities, using designated ports during the configuration process.

To set up a SAML proxy in MKE:

  1. Use the MKE web UI to add a proxy service.

    1. Log in to the MKE web UI as an administrator.

    2. In the left-side navigation panel, navigate to Kubernetes > Pods and click the Create button to call the Create Kubernetes Object pane.

    3. In the Namespace dropdown, select default.

    4. In the Object YAML editor, paste the following Deployment object YAML:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: saml-proxy-deployment
      spec:
        selector:
          matchLabels:
            app: saml-proxy
        replicas: 1
        template:
          metadata:
            labels:
              app: saml-proxy
          spec:
            containers:
            - name: saml-proxy
              image: <proxy image>:<version>
              ports:
              - containerPort: <port-used-within-container>
      
    5. Click Create to add the container.

    6. In the left-side navigation panel, navigate to Kubernetes > Services and click the Create button to call the Create Kubernetes Object pane.

    7. In the Namespace dropdown, select default.

    8. In the Object YAML editor, paste the following Service object YAML:

      apiVersion: v1
      kind: Service
      metadata:
        name: saml-proxy
        labels:
          app: saml-proxy
      spec:
        type: NodePort
        ports:
          - port: <port-used-within-container>
            nodePort: <port-to-externally-access-proxy>
        selector:
          app: saml-proxy
      
    9. Click Create to add the service.

    Alternatively, to add the proxy as a Swarm service instead:

    1. Log in to the MKE web UI as an administrator.

    2. In the left-side navigation panel, navigate to Swarm > Services and click the Create button to call the Create Service pane.

    3. Configure the new service with your desired target proxy image. Note that any http/https proxy will suffice.

    4. In the left-side navigation panel, navigate to Network.

    5. Indicate the Target Port and Published Port and click Confirm.

      The Target Port is the port the proxy uses within the container, and the Published Port is the port that is externally accessible.

      Note

      The proxy you deploy determines the target and published ports.

    6. Click Create to add the service.

    7. In the left-side navigation panel, navigate to Shared Resources > Containers.

    8. Click the kebab menu for the <proxy-container-name>, at the far right, and select View logs.

    9. Test proxy use by making a request to the IdP and then checking the log for verification. For example, run the following command:

      curl <your IdP metadata URL> -x https://<MKE deployment IP>:<published-port>
      

      Note

      Be aware that the log entry can take up to five minutes to register.

  2. Configure the SAML proxy.

    1. Log in to the MKE web UI as an administrator.

    2. In the left-side navigation panel, navigate to <user-name> > Admin Settings > Authentication & Authorization to display the Authentication & Authorization pane.

    3. Toggle the SAML control to enable SAML and expand the SAML settings.

    4. Enable the SAML Proxy setting to reveal the Proxy URL, Proxy Username, and Proxy Password fields.

    5. Insert the pertinent field information and click Save.

    Note

    If upgrading from a previous version of MKE, you will need to add the [auth.samlProxy] section to the MKE configuration file.

    Edit the [auth.samlProxy] section of the MKE configuration file as follows:

    [auth.samlProxy]
       proxyURL = "http://<MKE deployment IP>:<published-port>"
       enabled = true
       [auth.samlProxy.credentials]
          [auth.samlProxy.credentials.basic]
          user = "<user-name>"
          password = "<password>"
    

    Note

    • If you provide empty strings for username or password, these will be considered valid credentials and will be used for the proxy.

    • To configure the proxy for use without authentication, remove the username and password fields.

    • For security purposes, a GET operation will not return the user and password credential values.

    Refer to Use an MKE configuration file for information on how to update the MKE configuration file.

  3. Use a private browser window to Configure SAML.

    Note

    Be aware that the log entry can take up to five minutes to register.

Enable Helm with MKE

To use Helm with MKE, you must grant the necessary roles to the default service account in the kube-system namespace.

Note

For comprehensive information on the use of Helm, refer to the Helm user documentation.

To enable Helm with MKE, enter the following kubectl commands in sequence:

kubectl create rolebinding default-view \
  --clusterrole=view \
  --serviceaccount=kube-system:default \
  --namespace=kube-system

kubectl create clusterrolebinding add-on-cluster-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=kube-system:default
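With these role bindings in place, standard Helm commands should work against the cluster through your admin client bundle. The following is a minimal sketch that assumes a Helm 3 client; the repository and chart names are examples only, not Mirantis defaults.

# Add an example chart repository and install an example chart
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-nginx bitnami/nginx --namespace default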

Integrate SCIM

System for Cross-domain Identity Management (SCIM) is a standard for automating the exchange of user identity information between identity domains or IT systems. It offers an LDAP alternative for provisioning and managing users and groups in MKE, as well as for syncing users and groups with an upstream identity provider. Using the SCIM schema and API, you can use single sign-on (SSO) services across various tools.

Mirantis certifies the use of Okta 3.2.0; however, MKE offers the discovery endpoints necessary to provide any system or application with the product SCIM configuration.

Configure SCIM for MKE

The Mirantis SCIM implementation uses SCIM version 2.0.

MKE SCIM integration typically involves the following steps:

  1. Enable SCIM.

  2. Configure SCIM for authentication and access.

  3. Specify user attributes.

Enable SCIM
  1. Log in to the MKE web UI.

  2. Click Admin Settings > Authentication & Authorization.

  3. In the Identity Provider Integration section in the details pane, move the slider next to SCIM to enable the SCIM settings.

Configure SCIM authentication and access

In the SCIM configuration subsection, either enter the API token in the API Token field or click Generate to have MKE generate a UUID.

The base URL for all SCIM API calls is https://<Host IP>/enzi/v0/scim/v2/. All SCIM methods are accessible API endpoints of this base URL.

Bearer Auth is the API authentication method. When configured, you access SCIM API endpoints through the Bearer <token> HTTP Authorization request header.

Note

  • SCIM API endpoints are not accessible by any other user (or their token), including the MKE administrator and MKE admin Bearer token.

  • The only SCIM method MKE supports is an HTTP authentication request header that contains a Bearer token.
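As a minimal sketch of the Bearer authentication described above (the host and token values are placeholders):

# List SCIM users using the configured API token as a Bearer token
curl -k -H "Authorization: Bearer <API token>" \
  "https://<Host IP>/enzi/v0/scim/v2/Users"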

Specify user attributes

The following table maps the user attribute fields in use by Mirantis to SCIM and SAML attributes.

MKE

SAML

SCIM

Account name

nameID in response

userName

Account full name

Attribute value in fullname assertion

User’s name.formatted

Team group link name

Attribute value in member-of assertion

Group’s displayName

Team name

N/A

When creating a team, use the group’s displayName + _SCIM

Supported SCIM API endpoints

MKE supports SCIM API endpoints across three operational areas: User, Group, and Service Provider Configuration.

User operations

The SCIM API endpoints that serve in user operations provide the means to:

  • Retrieve user information

  • Create a new user

  • Update user information

For user GET and POST operations:

  • Filtering is only supported using the userName attribute and eq operator. For example, filter=userName Eq "john".

  • Attribute name and attribute operator are case insensitive. For example, the following two expressions have the same logical value:

    • filter=userName Eq "john"

    • filter=Username eq "john"

  • Pagination is fully supported.

  • Sorting is not supported.

GET /Users

Returns a list of SCIM users (by default, 200 users per page).

Use the startIndex and count query parameters to paginate long lists of users. For example, to retrieve the first 20 users, set startIndex to 1 and count to 20, as in the following request:

GET {Host IP}/enzi/v0/scim/v2/Users?startIndex=1&count=20
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8

The response to the previous query returns paging metadata that is similar to the following example:

{
  "totalResults":100,
  "itemsPerPage":20,
  "startIndex":1,
  "schemas":["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
  "Resources":[{
     ...
  }]
}
GET /Users/{id}

Retrieves a single user resource.

The value of the {id} should be the user’s ID. You can also use the userName attribute to filter the results.

GET {Host IP}/enzi/v0/scim/v2/Users/{user ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
POST /Users

Creates a user.

The operation must include the userName attribute and at least one email address.

POST {Host IP}/enzi/v0/scim/v2/Users
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
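A hypothetical request body that satisfies these requirements (all values are examples, not defaults):

{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName": "jane.doe",
  "name": {"formatted": "Jane Doe"},
  "emails": [{"value": "jane.doe@example.com", "primary": true}]
}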
PATCH /Users/{id}

Updates a user’s active status.

Reactivate inactive users by specifying "active": true. To deactivate active users, specify "active": false. The value of the {id} should be the user’s ID.

PATCH {Host IP}/enzi/v0/scim/v2/Users/{user ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
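A hypothetical request body for deactivating a user, assuming the standard SCIM 2.0 PatchOp message format:

{
  "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
  "Operations": [
    {"op": "replace", "value": {"active": false}}
  ]
}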
PUT /Users/{id}

Updates existing user information.

All attribute values are overwritten, including attributes for which empty values or no values have been provided. If a previously set attribute value is left blank during a PUT operation, the value is updated with a blank value in accordance with the attribute data type and storage provider. The value of the {id} should be the user’s ID.
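The reference above does not include a request example for this endpoint; following the pattern of the other user endpoints, a request would presumably take the following form:

PUT {Host IP}/enzi/v0/scim/v2/Users/{user ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8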

Group operations

The SCIM API endpoints that serve in group operations provide the means to:

  • Create a new user group

  • Retrieve group information

  • Update user group membership (add/replace/remove users)

For group GET and POST operations:

  • Pagination is fully supported.

  • Sorting is not supported.

GET /Groups/{id}

Retrieves information for a single group.

GET {Host IP}/enzi/v0/scim/v2/Groups/{Group ID}
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
GET /Groups

Returns a paginated list of groups (by default, ten groups per page).

Use the startIndex and count query parameters to paginate long lists of groups.

GET {Host IP}/enzi/v0/scim/v2/Groups?startIndex=4&count=500 HTTP/1.1
Host: example.com
Accept: application/scim+json
Authorization: Bearer h480djs93hd8
POST /Groups

Creates a new group.

Add users to the group during group creation by supplying user ID values in the members array.
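A hypothetical request body that creates a group and adds one member at creation time (the display name and user ID are placeholders):

{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
  "displayName": "engineering",
  "members": [{"value": "<user ID>"}]
}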

PATCH /Groups/{id}

Updates an existing group resource, allowing the addition or removal of individual (or groups of) users from the group with a single operation. Add is the default operation.

To remove members from a group, set the operation attribute of a member object to delete.
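Following the description above, a hypothetical request body that removes a single member sets the member's operation attribute to delete (the exact message envelope may vary by SCIM client):

{
  "members": [
    {"value": "<user ID>", "operation": "delete"}
  ]
}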

PUT /Groups/{id}

Updates an existing group resource, overwriting all values for a group even if an attribute is empty or is not provided.

PUT replaces all members of a group with members that are provided by way of the members attribute. If a previously set attribute is left blank during a PUT operation, the new value is set to blank in accordance with the data type of the attribute and the storage provider.

Service Provider configuration operations

The SCIM API endpoints that serve in Service provider configuration operations provide the means to:

  • Retrieve service provider resource type metadata

  • Retrieve schema for service provider and SCIM resources

  • Retrieve schema for service provider configuration

SCIM defines three endpoints to facilitate discovery of the SCIM service provider features and schema that you can retrieve using HTTP GET:

GET /ResourceTypes

Discovers the resource types available on a SCIM service provider (for example, Users and Groups).

Each resource type defines the endpoints, the core schema URI that defines the resource, and any supported schema extensions.

GET /Schemas

Retrieves information about all resource schemas supported by a SCIM service provider.

GET /ServiceProviderConfig

Returns a JSON structure that describes the SCIM specification features that are available on a service provider using a schemas attribute of urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig.

Integrate with an LDAP directory

MKE integrates with LDAP directory services, thus allowing you to manage users and groups from your organization directory and to automatically propagate the information to MKE and MSR.

Once you enable LDAP, MKE uses a remote directory server to create users automatically, and all logins are forwarded thereafter to the directory server.

When you switch from built-in authentication to LDAP authentication, all manually created users whose usernames fail to match any LDAP search results remain available.

When you enable LDAP authentication, you configure MKE to create user accounts only when users log in for the first time.

Note

If SAML integration is enabled, refer to Use LDAP in conjunction with SAML for information on using SAML in parallel with LDAP.

MKE integration with LDAP

To control the integration of MKE with LDAP, you create user searches. For these user searches, you use the MKE web UI to specify multiple search configurations and specify multiple LDAP servers with which to integrate. Searches start with the Base DN, the Distinguished Name of the node in the LDAP directory tree in which the search looks for users.

MKE to LDAP synchronization workflow

The following occurs when MKE synchronizes with LDAP:

  1. MKE creates a set of search results by iterating over each of the user search configurations, in an order that you specify.

  2. MKE chooses an LDAP server from the list of domain servers by considering the Base DN from the user search configuration and selecting the domain server with the longest domain suffix match.

    Note

    If no domain server has a domain suffix that matches the Base DN from the search configuration, MKE uses the default domain server.

  3. MKE creates a list of users from the search and creates MKE accounts for each one.

    Note

    If you select the Just-In-Time User Provisioning option, user accounts are created only when users first log in.

Example workflow:

Consider an example with three LDAP domain servers and three user search configurations.

The example LDAP domain servers:

LDAP domain server name

URL

default

ldaps://ldap.example.com

dc=subsidiary1,dc=com

ldaps://ldap.subsidiary1.com

dc=subsidiary2,dc=subsidiary1,dc=com

ldaps://ldap.subsidiary2.com

The example user search configurations:

User search configurations

Description

baseDN=ou=people,dc=subsidiary1,dc=com

For this search configuration, dc=subsidiary1,dc=com is the only server with a domain that is a suffix, so MKE uses the server ldaps://ldap.subsidiary1.com for the search request.

baseDN=ou=product,dc=subsidiary2,dc=subsidiary1,dc=com

For this search configuration, two of the domain servers have a domain that is a suffix of this Base DN. As dc=subsidiary2,dc=subsidiary1,dc=com is the longer of the two, however, MKE uses the server ldaps://ldap.subsidiary2.com for the search request.

baseDN=ou=eng,dc=example,dc=com

For this search configuration, no server with a domain specified is a suffix of this Base DN, so MKE uses the default server, ldaps://ldap.example.com, for the search request.

Whenever user search results contain username collisions between the domains, MKE uses only the first search result, and thus the ordering of the user search configurations can be important. For example, if both the first and third user search configurations return a record with the username jane.doe, the record from the first takes precedence and the one from the third is ignored. As such, it is important to implement a username attribute that is unique for your users across all domains. As a best practice, choose something that is specific to the subsidiary, such as the email address for each user.

Configure the LDAP integration

Note

MKE saves a minimum amount of user data required to operate, including any user name and full name attributes that you specify in the configuration, as well as the Distinguished Name (DN) of each synced user. MKE does not store any other data from the directory server.

Use the MKE web UI to configure MKE to create and authenticate users using an LDAP directory.

Access the LDAP controls

To configure LDAP integration you must first gain access to the controls for the service protocol.

  1. Log in to the MKE web UI.

  2. In the left-side navigation menu, click the user name drop-down to display the available options.

  3. Navigate to Admin Settings > Authentication & Authorization.

  4. In the Identity Provider section in the details pane, move the slider next to LDAP to enable the LDAP settings.

Set up an LDAP server

To configure an LDAP server, perform the following steps:

  1. To set up a new LDAP server, configure the settings in the LDAP Server subsection:

    Control

    Description

    LDAP Server URL

    The URL for the LDAP server.

    Reader DN

    The DN of the LDAP account that is used to search entries in the LDAP server. As a best practice, this should be an LDAP read-only user.

    Reader Password

    The password of the account used to search entries in the LDAP server.

    Skip TLS verification

    Sets whether to verify the LDAP server certificate when TLS is in use. The connection is still encrypted; however, it is vulnerable to man-in-the-middle attacks.

    Use Start TLS

    Defines whether to authenticate or encrypt the connection after connection is made to the LDAP server over TCP. To ignore the setting, set the LDAP Server URL field to ldaps://.

    No Simple Pagination (RFC 2696)

    Indicates that your LDAP server does not support pagination.

    Just-In-Time User Provisioning

    Sets whether to create user accounts only when users log in for the first time. Mirantis recommends using the default true value.

    Note

    Available as of MKE 3.6.4, the disableReferralChasing setting, which is currently only available by way of the MKE API, allows you to disable the default referral chasing behavior that occurs when a referral URL is received as a result of an LDAP search request. Refer to LDAP Configuration through API for more information.

  2. Click Save to add your LDAP server.

Add additional LDAP domains

To integrate MKE with additional LDAP domains:

  1. In the LDAP Additional Domains subsection, click Add LDAP Domain +. A set of input tools for configuring the additional domain displays.

  2. Configure the settings for the new LDAP domain:

    Control

    Description

    LDAP Domain

    Text field in which to enter the root domain component of this server. A longest-suffix match of the Base DN for LDAP searches is used to select which LDAP server to use for search requests. If no matching domain is found, the default LDAP server configuration is put to use.

    LDAP Server URL

    Text field in which to enter the URL for the LDAP server.

    Reader DN

    Text field in which to enter the DN of the LDAP account that is used to search entries in the LDAP server. As a best practice, this should be an LDAP read-only user.

    Reader Password

    The password of the account used to search entries in the LDAP server.

    Skip TLS verification

    Sets whether to verify the LDAP server certificate when TLS is in use. The connection is still encrypted; however, it is vulnerable to man-in-the-middle attacks.

    Use Start TLS

    Sets whether to authenticate or encrypt the connection after connection is made to the LDAP server over TCP. To ignore the setting, set the LDAP Server URL field to ldaps://.

    No Simple Pagination (RFC 2696)

    Select if your LDAP server does not support pagination.

    Note

    Available as of MKE 3.6.4, the disableReferralChasing setting, which is currently only available by way of the MKE API, allows you to disable the default referral chasing behavior that occurs when a referral URL is received as a result of an LDAP search request. Refer to LDAP Configuration through API for more information.

  3. Click Confirm to add the new LDAP domain.

  4. Repeat the procedure to add any additional LDAP domains.

Add LDAP user search configurations

To add LDAP user search configurations to your LDAP integration:

  1. In the LDAP User Search Configurations subsection, click Add LDAP User Search Configuration +. A set of input tools for configuring the LDAP user search configurations displays. Configure the following settings:

    Field

    Description

    Base DN

    Text field in which to enter the DN of the node in the directory tree, where the search should begin seeking out users.

    Username Attribute

    Text field in which to enter the LDAP attribute that serves as username on MKE. Only user entries with a valid username will be created.

    A valid username must not be longer than 100 characters and must not contain any unprintable characters, whitespace characters, or any of the following characters: / \ [ ] : ; | = , + * ? < > ' ".

    Full Name Attribute

    Text field in which to enter the LDAP attribute that serves as the user’s full name, for display purposes. If the field is left empty, MKE does not create new users with a full name value.

    Filter

    Text field in which to enter an LDAP search filter to use to find users. If the field is left empty, all directory entries in the search scope with valid username attributes are created as users.

    Search subtree instead of just one level

    Whether to perform the LDAP search on a single level of the LDAP tree, or search through the full LDAP tree starting at the Base DN.

    Match Group Members

    Sets whether to filter users further, by selecting those who are also members of a specific group on the directory server. The feature is helpful when the LDAP server does not support memberOf search filters.

    Iterate through group members

    Sets whether, when the Match Group Members option is enabled to sync users, the sync is done by iterating over the target group’s membership and making a separate LDAP query for each member, rather than through the use of a broad user search filter. This option can increase efficiency in situations where the number of members of the target group is significantly smaller than the number of users that would match the above search filter, or if your directory server does not support simple pagination of search results.

    Group DN

    Text field in which to enter the DN of the LDAP group from which to select users, when the Match Group Members option is enabled.

    Group Member Attribute

    Text field in which to enter the name of the LDAP group entry attribute that corresponds to the DN of each of the group members.

  2. Click Confirm to add the new LDAP user search configurations.

  3. Repeat the procedure to add any additional user search configurations. More than one such configuration can be useful in cases where users may be found in multiple distinct subtrees of your organization directory. Any user entry that matches at least one of the search configurations will be synced as a user.

Test LDAP login

Prior to saving your configuration changes, you can use the dedicated LDAP Test login tool to test the integration using the login credentials of an LDAP user.

  1. Input the credentials for the test user into the provided Username and Password fields:

    Field

    Description

    Username

    An LDAP user name for testing authentication to MKE. The value corresponds to the Username Attribute that is specified in the Add LDAP user search configurations section.

    Password

    The password used to authenticate (BIND) to the directory server.

  2. Click Test. A search is made against the directory using the provided search Base DN, scope, and filter. Once the user entry is found in the directory, a BIND request is made using the input user DN and the given password value.

Set LDAP synchronization

Following LDAP integration, MKE synchronizes users at the top of the hour, based on an interval that is defined in hours.

To set LDAP synchronization, configure the following settings in the LDAP Sync Configuration section:

Field

Description

Sync interval

The interval, in hours, to synchronize users between MKE and the LDAP server. When the synchronization job runs, new users found in the LDAP server are created in MKE with the default permission level. MKE users that do not exist in the LDAP server become inactive.

Enable sync of admin users

This option specifies that system admins should be synced directly with members of a group in your organization’s LDAP directory. The admins will be synced to match the membership of the group. The configured recovery admin user will also remain a system admin.

Manually synchronize LDAP

In addition to configuring MKE LDAP synchronization, you can also perform a hot synchronization by clicking the Sync Now button in the LDAP Sync Jobs subsection. Here you can also view the logs for each sync job by clicking the View Logs link associated with a particular job.

Revoke user access

Whenever a user is removed from LDAP, the effect on their MKE account is determined by the Just-In-Time User Provisioning setting:

  • false: Users deleted from LDAP become inactive in MKE following the next LDAP synchronization run.

  • true: A user deleted from LDAP cannot authenticate. Their MKE accounts remain active, however, and thus they can use their client bundles to run commands. To prevent this, deactivate the user’s MKE user account.

Synchronize teams with LDAP

MKE enables the syncing of teams within Organizations with LDAP, using either a search query or by matching a group that is established in your LDAP directory.

  1. Log in to the MKE web UI as an administrator.

  2. Navigate to Access Control > Orgs & Teams to display the Organizations that exist within your MKE instance.

  3. Locate the name of the Organization that contains the MKE team that you want to sync to LDAP and click it to display all of the MKE teams for that Organization.

  4. Hover your cursor over the MKE team that you want to sync with LDAP to reveal its vertical ellipsis, at the far right.

  5. Click the vertical ellipsis and select Edit to call the Details screen for the team.

  6. Toggle ENABLE SYNC TEAM MEMBERS to Yes to reveal the LDAP sync controls.

  7. Toggle LDAP MATCH METHOD to set the LDAP match method you want to use for the sync: Match Search Results (default) or Match Group Members.

    • For Match Search Results:

      1. Enter a Base DN into the Search Base DN field, as it is established in LDAP.

      2. Enter a search filter based on one or more attributes into the Search filter field.

      3. Optional. Check Search subtree instead of just one level to enable search down through any sub-groups that exist within the group you entered into the Search Base DN field.

    • For Match Group Members:

      1. Enter the group Distinguished Name (DN) into the Group DN field.

      2. Enter a member attribute into the Group Member field.

  8. Toggle IMMEDIATELY SYNC TEAM MEMBERS as appropriate.

  9. Toggle ALLOW NON-LDAP MEMBERS as appropriate.

  10. Click Save.

LDAP Configuration through API

LDAP-specific GET and PUT API endpoints are available in the configuration resource. Swarm mode must be enabled to use the following endpoints:

  • GET /api/ucp/config/auth/ldap - Returns information on your current system LDAP configuration.

  • PUT /api/ucp/config/auth/ldap - Updates your LDAP configuration.
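
For example, assuming that the MKE_HOST and AUTHTOKEN environment variables are already set, as described in Modify an existing MKE configuration later in this section, a minimal sketch of calling these endpoints might look like the following. The use of a local JSON file and the application/json content type for the PUT request are assumptions; adjust them to match your environment.

# Retrieve the current LDAP configuration
curl --silent --insecure -X GET \
  -H "Authorization: Bearer $AUTHTOKEN" \
  https://$MKE_HOST/api/ucp/config/auth/ldap

# Update the LDAP configuration from a local file (ldap-config.json is a placeholder)
curl --silent --insecure -X PUT \
  -H "Authorization: Bearer $AUTHTOKEN" \
  -H "Content-Type: application/json" \
  --data @ldap-config.json \
  https://$MKE_HOST/api/ucp/config/auth/ldap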

Configure an OpenID Connect identity provider

OpenID Connect (OIDC) allows you to authenticate MKE users with a trusted external identity provider.

Note

Kubernetes users who want client bundles to use OIDC must Download and configure the client bundle and replace the authorization section therein with the parameters presented in the Kubernetes OIDC Authenticator documentation.

For identity providers that require a client redirect URI, use https://<MKE_HOST>/login. For identity providers that do not permit the use of an IP address for the host, use https://<mke-cluster-domain>/login.

The requested scopes for all identity providers are "openid email". Claims are read solely from the ID token that your identity provider returns. MKE does not use the UserInfo URL to obtain user information. The default username claim is sub. To use a different username claim, you must specify that value with the usernameClaim setting in the MKE configuration file.

The following example details the MKE configuration file settings for using an external identity provider.

  • For the signInCriteria array, term is set to hosted domain ("hd") and value is set to the domain from which the user is permitted to sign in.

  • For the adminRoleCriteria array, matchType is set to "contains", in case any administrators are assigned to multiple roles that include admin.

[auth.external_identity_provider]
  wellKnownConfigUrl = "https://example.com/.well-known/openid-configuration"
  clientId = "4dcdace6-4eb4-461d-892f-01aed344ac80"
  clientSecret = "ed89aeddcdb4461ace640"
  usernameClaim = "email"
  caBundle = "----BEGIN CERTIFICATE----\nMIIF...UfTd\n----END CERTIFICATE----\n"

  [[auth.external_identity_provider.signInCriteria]]
    term = "hd"
    value = "myorg.com"
    matchType = "must"

  [[auth.external_identity_provider.adminRoleCriteria]]
    term = "roles"
    value = "admin"
    matchType = "contains"

Note

Using an external identity provider to sign in to the MKE web UI creates a new user session, and thus users who sign in this way will not be signed out when their ID token expires. Instead, the session lifetime is set using the auth.sessions parameters in the MKE configuration file.

Refer to the MKE configuration file auth.external_identity_provider (optional) for the complete reference documentation.

Use LDAP in conjunction with SAML

In MKE, you can configure LDAP to work together with SAML, though you may need to overcome certain issues to do so.


To enable LDAP and SAML to be used in tandem:

  1. Enable and integrate SAML authentication.

  2. Log in to the MKE web UI.

  3. In the left-side navigation panel, navigate to user name > Admin Settings > Authentication & Authorization.

  4. Scroll down to the Identity Provider Integration section and verify that SAML is toggled to Enabled.

  5. Select the Also allow LDAP users checkbox.

  6. Integrate with an LDAP directory.


To sync teams with both LDAP and SAML users:

  1. Log in to the MKE web UI.

  2. Verify that LDAP and SAML teams are both enabled for syncing.

  3. In the left-side navigation panel, navigate to Access Control > Orgs & Teams

  4. Select the required organization and then select the required team.

  5. Click the gear icon in the upper right corner.

  6. On the Details tab, select ENABLE SYNC TEAM MEMBERS.

  7. Select ALLOW NON-LDAP MEMBERS.


To determine a user’s authentication protocol:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to Access Control > Users and select the target user.

    If an LDAP DN attribute is present next to Full Name and Admin, the user is managed by LDAP. If, however, the LDAP DN attribute is not present, the user is not managed by LDAP.

Overlapping user names

Unexpected behavior can result from having the same user name in both SAML and LDAP.

If just-in-time (JIT) provisioning is enabled in LDAP, MKE only allows login attempts from the identity provider that first attempts to log in. MKE then blocks all login attempts from the second identity provider.

If JIT provisioning is disabled in LDAP, the LDAP synchronization, which occurs at regular intervals, always overrides the ability of the SAML user account to log in.


To allow overlapping user names:

At times a user may have the same name in both LDAP and SAML, and you may want that user to be able to sign in using either protocol.

  1. Define a custom SAML attribute with a name of dn and a value that is equivalent to the user account distinguished name (DN) with the LDAP provider. Refer to Define a custom SAML attribute in the Okta documentation for more information.

    Note

    MKE considers such users to be LDAP users. As such, should their LDAP DN change, the custom SAML attribute must be updated to match.

  2. Log in to the MKE web UI.

  3. From the left-side navigation panel, navigate to <user name> > Admin Settings > Authentication & Authorization and scroll down to the LDAP section.

  4. Under SAML integration, select Allow LDAP users to sign in using SAML.

Manage services node deployment

You can configure MKE to allow users to deploy and run services in worker nodes only, to ensure that all cluster management functionality remains performant and to enhance cluster security.

Important

If a user deploys a malicious service that compromises the node on which it is running, that service cannot affect any other nodes in the cluster or impact cluster management functionality.

Restrict services deployment to Swarm worker nodes

To keep manager nodes performant, it is necessary at times to restrict service deployment to Swarm worker nodes.

To restrict services deployment to Swarm worker nodes:

  1. Log in to the MKE web UI with administrator credentials.

  2. Click the user name at the top of the navigation menu.

  3. Navigate to Admin Settings > Orchestration.

  4. Under Container Scheduling, toggle all of the sliders to the left to restrict the deployment only to worker nodes.

Note

Creating a grant with the Scheduler role against the / collection takes precedence over any other grants with Node Schedule on subcollections.

Restrict services deployment to Kubernetes worker nodes

By default, MKE clusters use Kubernetes taints and tolerations to prevent user workloads from deploying to MKE manager or MSR nodes.

Note

Workloads deployed by an administrator in the kube-system namespace do not follow scheduling constraints. If an administrator deploys a workload in the kube-system namespace, a toleration is applied to bypass the taint, and the workload is scheduled on all node types.

To view the taints, run the following command:

$ kubectl get nodes <mkemanager> -o json | jq -r '.spec.taints | .[]'

Example of system response:

{
  "effect": "NoSchedule",
  "key": "com.docker.ucp.manager"
}
Allow services deployment on Kubernetes MKE manager or MSR nodes

You can circumvent the protections put in place by Kubernetes taints and tolerations. For details, refer to Restrict services deployment to Kubernetes worker nodes.

Schedule services deployment on manager and MSR nodes
  1. Log in to the MKE web UI with administrator credentials.

  2. Click the user name at the top of the navigation menu.

  3. Navigate to Admin Settings > Orchestration.

  4. Select from the following options:

    • Under Container Scheduling, toggle to the right the slider for Allow administrators to deploy containers on MKE managers or nodes running MSR.

    • Under Container Scheduling, toggle to the right the slider for Allow all authenticated users, including service accounts, to schedule on all nodes, including MKE managers and MSR nodes.

Following any scheduling action, MKE applies a toleration to new workloads, to allow the Pods to be scheduled on all node types. For existing workloads, however, it is necessary to manually add the toleration to the Pod specification.

Add a toleration to the Pod specification for existing workloads
  1. Add the following toleration to the Pod specification, either through the MKE web UI or using the kubectl edit <object> <workload> command:

    tolerations:
    - key: "com.docker.ucp.manager"
      operator: "Exists"
    
  2. Run the following command to confirm the successful application of the toleration:

    kubectl get <object> <workload> -o json | jq -r '.spec.template.spec.tolerations | .[]'
    

Example of system response:

{
"key": "com.docker.ucp.manager",
"operator": "Exists"
}

Caution

A NoSchedule taint is present on MKE manager and MSR nodes, and if you disable scheduling on managers and/or workers, a toleration for that taint is not applied to the deployments. As such, you should not schedule on these nodes, except when the Kubernetes workload is deployed in the kube-system namespace.

Run only the images you trust

With MKE you can force applications to use only Docker images that are signed by MKE users you trust. Every time a user attempts to deploy an application to the cluster, MKE verifies that the application is using a trusted Docker image. If a trusted Docker image is not in use, MKE halts the deployment.

By signing and verifying the Docker images, you ensure that the images in use in your cluster are trusted and have not been altered, either in the image registry or on their way from the image registry to your MKE cluster.

Example workflow

  1. A developer makes changes to a service and pushes their changes to a version control system.

  2. A CI system creates a build, runs tests, and pushes an image to the Mirantis Secure Registry (MSR) with the new changes.

  3. The quality engineering team pulls the image, runs more tests, and signs and pushes the image if the image is verified.

  4. IT operations deploys the service, but only if the image in use is signed by the QA team. Otherwise, MKE will not deploy.

To configure MKE to only allow running services that use Docker trusted images:

  1. Log in to the MKE web UI.

  2. In the left-side navigation menu, click the user name drop-down to display the available options.

  3. Click Admin Settings > Docker Content Trust to reveal the Content Trust Settings page.

  4. Enable Run only signed images.

    Important

    At this point, MKE allows the deployment of any signed image, regardless of signee.

  5. (Optional) Make it necessary for the image to be signed by a particular team or group of teams:

    1. Click Add Team+ to reveal the two-part tool.

    2. From the drop-down at the left, select an organization.

    3. From the drop-down at the right, select a team belonging to the organization you selected.

    4. Repeat the procedure to configure additional teams.

      Note

      If you specify multiple teams, the image must be signed by a member of each team, or someone who is a member of all of the teams.

  6. Click Save.

    MKE immediately begins enforcing the image trust policy. Existing services continue to run and you can restart them as necessary. From this point, however, MKE only allows the deployment of new services that use a trusted image.

Set user session properties

MKE enables the setting of various user session properties, such as session timeout and the permitted number of concurrent sessions.

To configure MKE login session properties:

  1. Log in to the MKE web UI.

  2. In the left-side navigation menu, click the user name drop-down to display the available options.

  3. Click Admin Settings > Authentication & Authorization to reveal the MKE login session controls.

The following table offers information on the MKE login session controls:

Field

Description

Lifetime Minutes

The set duration of a login session in minutes, starting from the moment MKE generates the session. MKE invalidates the active session once this period expires and the user must re-authenticate to establish a new session.

  • Default: 60

  • Minimum: 10

Renewal Threshold Minutes

The window of time, in minutes, prior to session expiration during which, if the session is used, MKE extends the session by the amount specified in Lifetime Minutes. The threshold value cannot be greater than that set in Lifetime Minutes.

To specify that sessions not be extended, set the threshold value to 0. Be aware, though, that this may cause MKE web UI users to be unexpectedly logged out.

  • Default: 20

  • Maximum: 5 minutes less than Lifetime Minutes

Per User Limit

The maximum number of sessions a user can have running simultaneously. If creating a new session would exceed this limit, MKE deletes the least recently used session. Specifically, every time you use a session token, the server marks it with the current time (lastUsed metadata). When a new session would exceed the per-user limit, the session with the oldest lastUsed time is deleted, which is not necessarily the oldest session.

To disable the Per User Limit setting, set the value to 0.

  • Default: 10

  • Minimum: 1 / Maximum: No limit

Configure an MKE cluster

Important

The MKE configuration file documentation is up-to-date for the latest MKE release. As such, if you are running an earlier version of MKE, you may encounter detail for configuration options and parameters that are not applicable to the version of MKE you are currently running.

Refer to the MKE Release Notes for specific version-by-version information on MKE configuration file additions and changes.

You configure an MKE cluster by applying a TOML file. You use this file, the MKE configuration file, to import and export MKE configurations, both to create new MKE instances and to modify existing ones.

Refer to example-config in the MKE CLI reference documentation to learn how to download an example MKE configuration file.

Use an MKE configuration file

Put the MKE configuration file to work for the following use cases:

  • Set the configuration file to run at the install time of new MKE clusters

  • Use the API to import the file back into the same cluster

  • Use the API to import the file into multiple clusters

To make use of an MKE configuration file, you edit the file using either the MKE web UI or the command line interface (CLI). Using the CLI, you can either export the existing configuration file for editing, or use the example-config command to view and edit an example TOML MKE configuration file.

docker container run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 \
  example-config
Modify an existing MKE configuration

Working as an MKE admin, use the config-toml API from within the directory of your client certificate bundle to export the current MKE settings to a TOML file.

As detailed herein, the command set exports the current configuration for the MKE hostname MKE_HOST to a file named mke-config.toml:

  1. Define the following environment variables:

    export MKE_USERNAME=<mke-username>
    export MKE_PASSWORD=<mke-password>
    export MKE_HOST=<mke-fqdn-or-ip-address>
    
  2. Obtain and define an AUTHTOKEN environment variable:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"'$MKE_USERNAME'","password":"'$MKE_PASSWORD'"}' https://$MKE_HOST/auth/login | jq --raw-output .auth_token)
    
  3. Download the current MKE configuration file.

    curl --silent --insecure -X GET "https://$MKE_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > mke-config.toml
    
  4. Edit the MKE configuration file, as needed. For comprehensive detail, refer to Configuration options.

  5. Upload the newly edited MKE configuration file:

    Note

    You may need to reacquire the AUTHTOKEN, if significant time has passed since you first acquired it.

    curl --silent --insecure -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file 'mke-config.toml' https://$MKE_HOST/api/ucp/config-toml
    
Apply an existing configuration at install time

To customize a new MKE instance using a configuration file, you must create the file prior to installation. Then, once the new configuration file is ready, you can configure MKE to import it during the installation process using Docker Swarm.

To import a configuration file at installation:

  1. Create a Docker Swarm Config object named com.docker.mke.config, with the contents of your TOML MKE configuration file as its value.

  2. When installing MKE on the cluster, specify the --existing-config flag to force the installer to use the new Docker Swarm Config object for its initial configuration.

  3. Following the installation, delete the com.docker.mke.config object.
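
The following is a minimal CLI sketch of this procedure, assuming that your configuration is saved locally as mke-config.toml and that you are installing MKE 3.7.16. The --existing-config flag is documented above; the remaining install flags are illustrative and may differ in your environment.

# Create the Swarm Config object from the TOML file
docker config create com.docker.mke.config mke-config.toml

# Install MKE, directing the installer to use the existing config object
docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.16 \
  install --existing-config --interactive

# Following the installation, delete the config object
docker config rm com.docker.mke.config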

Configuration options
auth table

Parameter

Required

Description

backend

no

The name of the authorization backend to use, managed or ldap.

Default: managed

default_new_user_role

no

The role assigned to new users for their private resource sets.

Valid values: admin, viewonly, scheduler, restrictedcontrol, or fullcontrol.

Default: restrictedcontrol

auth.sessions

Parameter

Required

Description

lifetime_minutes

no

The initial session lifetime, in minutes.

Default: 60

renewal_threshold_minutes

no

The length of time, in minutes, before session expiration during which, if the session is used, it is extended by the currently configured lifetime. A value of 0 disables session extension.

Default: 20

per_user_limit

no

The maximum number of sessions that a user can have simultaneously active. If creating a new session will put a user over this limit, the least recently used session is deleted.

A value of 0 disables session limiting.

Default: 10

store_token_per_session

no

If set, the user token is stored in sessionStorage instead of localStorage. Setting this option logs the user out and requires that they log back in, as they are actively changing the manner in which their authentication is stored.
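
For reference, a sketch of how the auth and auth.sessions settings might appear in the MKE configuration file, using the documented default values:

[auth]
  backend = "managed"
  default_new_user_role = "restrictedcontrol"

  [auth.sessions]
    lifetime_minutes = 60
    renewal_threshold_minutes = 20
    per_user_limit = 10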

auth.external_identity_provider (optional)

Configures MKE with an external OpenID Connect (OIDC) identity provider.

Parameter

Required

Description

wellKnownConfigUrl

yes

Sets the OpenID discovery endpoint, ending in .well-known/openid-configuration, for your identity provider.

clientID

yes

Sets the client ID, which you obtain from your identity provider.

clientSecret

no (recommended)

Sets the client secret, which you obtain from your identity provider.

usernameClaim

no

Sets the unique JWT ID token claim that contains the user names from your identity provider.

Default: sub

caBundle

no

Sets the PEM certificate bundle that MKE uses to authenticate the discovery, issuer, and JWKs endpoints.

httpProxy

no

Sets the HTTP proxy for your identity provider.

httpsProxy

no

Sets the HTTPS proxy for your identity provider.

issuer

no

Sets the ID token issuer. If left blank, the value is obtained automatically from the discovery endpoint.

userServiceId

no

Sets the MKE service ID with the JWK URI for the identity provider. If left blank, the service ID is generated automatically.

Warning

Do not remove or replace an existing value.

auth.external_identity_provider.signInCriteria array (optional)

An array of claims that ID tokens require for use with MKE.

Parameter

Required

Description

term

yes

Sets the name of the claim.

value

yes

Sets the value for the claim in the form of a string.

matchType

yes

Sets how MKE evaluates the JWT claim.

Valid values:

  • must - the JWT claim value must be the same as the configuration value.

  • contains - the JWT claim value must contain the configuration value.

auth.external_identity_provider.adminRoleCriteria array (optional)

An array of claims that admin user ID tokens require for use with MKE. Creating a new account using a token that satisfies the criteria determined by this array automatically produces an administrator account.

Parameter

Required

Description

term

yes

Sets the name of the claim.

value

yes

Sets the value for the claim in the form of a string.

matchType

yes

Sets how the JWT claim is evaluated.

Valid values:

  • must - the JWT claim value must be the same as the configuration value.

  • contains - the JWT claim value must contain the configuration value.

auth.account_lock (optional)

Parameter

Required

Description

enabled

no

Sets whether the MKE account lockout feature is enabled.

failureTrigger

no

Sets the number of failed log in attempts that can occur before an account is locked.

durationSeconds

no

Sets the desired lockout duration in seconds. A value of 0 indicates that the account will remain locked until it is unlocked by an administrator.

hardening_configuration (optional)

The hardening_enabled option must be set to true to enable all other hardening_configuration options.

Parameter

Required

Description

hardening_enabled

no

Parent option that when set to true enables security hardening configuration options: limit_kernel_capabilities, pid_limit, pid_limit_unspecified, and use_strong_tls_ciphers.

Default: false

limit_kernel_capabilities

no

The option can only be enabled when hardening_enabled is set to true.

Limits kernel capabilities to the minimum required by each container.

Components run using Docker default capabilities by default. When you enable limit_kernel_capabilities all capabilities are dropped, except those that are specifically in use by the component. Several components run as privileged, with capabilities that cannot be disabled.

Default: false

pid_limit

no

The option can only be enabled when hardening_enabled is set to true.

Sets the maximum number of PIDs that MKE allows for each of its orchestrators.

The pid_limit option must be set to the default 0 when it is not in use.

Default: 0

pid_limit_unspecified

no

The option can only be enabled when hardening_enabled is set to true.

When set to false, enables PID limiting, using the pid_limit option value for the associated orchestrator.

Default: true

use_strong_tls_ciphers

no

The option can only be enabled when hardening_enabled is set to true.

When set to true, in line with control 4.2.12 of the CIS Kubernetes Benchmark 1.7.0, the use_strong_tls_ciphers parameter limits the allowed ciphers for the cipher_suites_for_kube_api_server, cipher_suites_for_kubelet and cipher_suites_for_etcd_server parameters in the cluster_config table to the following:

  • TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256

  • TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

  • TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305

  • TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384

  • TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305

  • TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

  • TLS_RSA_WITH_AES_256_GCM_SHA384

  • TLS_RSA_WITH_AES_128_GCM_SHA256

Default: false
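
A sketch of how the hardening options might appear in the MKE configuration file; the pid_limit value shown is illustrative only:

[hardening_configuration]
  hardening_enabled = true
  limit_kernel_capabilities = true
  pid_limit = 100000
  pid_limit_unspecified = false
  use_strong_tls_ciphers = true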

registries array (optional)

An array of tables that specifies the MSR instances that are managed by the current MKE instance.

Parameter

Required

Description

host_address

yes

Sets the address for connecting to the MSR instance tied to the MKE cluster.

service_id

yes

Sets the MSR instance’s OpenID Connect Client ID, as registered with the Docker authentication provider.

ca_bundle

no

Specifies the root CA bundle for the MSR instance if you are using a custom certificate authority (CA). The value is a string with the contents of a ca.pem file.
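
A sketch of a registries entry; the host address and service ID shown are placeholders that you would replace with the values registered for your own MSR instance:

[[registries]]
  host_address = "msr.example.com"
  service_id = "msr.example.com"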

audit_log_configuration table (optional)

Configures audit logging options for MKE components.

Parameter

Required

Description

level

no

Specifies the audit logging level.

Valid values: empty (to disable audit logs), metadata, request.

Default: empty

support_dump_include_audit_logs

no

Sets support dumps to include audit logs in the logs of the ucp-controller container of each manager node.

Valid values: true, false.

Default: false
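
A sketch of an audit_log_configuration section that enables metadata-level audit logging:

[audit_log_configuration]
  level = "metadata"
  support_dump_include_audit_logs = false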

scheduling_configuration table (optional)

Specifies scheduling options and the default orchestrator for new nodes.

Note

If you run a kubectl command, such as kubectl describe nodes, to view scheduling rules on Kubernetes nodes, the results do not reflect the MKE admin settings configuration. MKE uses taints to control container scheduling on nodes, which is unrelated to the kubectl Unschedulable boolean flag.

Parameter

Required

Description

enable_admin_ucp_scheduling

no

Determines whether administrators can schedule containers on manager nodes.

Valid values: true, false.

Default: false

You can also set the parameter using the MKE web UI:

  1. Log in to the MKE web UI as an administrator.

  2. Click the user name drop-down in the left-side navigation panel.

  3. Click Admin Settings > Orchestration to view the Orchestration screen.

  4. Scroll down to the Container Scheduling section and toggle on the Allow administrators to deploy containers on MKE managers or nodes running MSR slider.

default_node_orchestrator

no

Sets the type of orchestrator to use for new nodes that join the cluster.

Valid values: swarm, kubernetes.

Default: swarm

tracking_configuration table (optional)

Specifies the analytics data that MKE collects.

Parameter

Required

Description

disable_usageinfo

no

Set to disable analytics of usage information.

Valid values: true, false.

Default: false

disable_tracking

no

Set to disable analytics of API call information.

Valid values: true, false.

Default: false

cluster_label

no

Set a label to be included with analytics.

ops_care

no

Set to enable OpsCare.

Valid values: true, false.

Default: false

trust_configuration table (optional)

Specifies whether MSR images require signing.

Parameter

Required

Description

require_content_trust

no

Set to require the signing of images by content trust.

Valid values: true, false.

Default: false

You can also set the parameter using the MKE web UI:

  1. Log in to the MKE web UI as an administrator.

  2. Click the user name drop-down in the left-side navigation panel.

  3. Click Admin Settings > Docker Content Trust to open the Content Trust Settings screen.

  4. Toggle on the Run only signed images slider.

require_signature_from

no

A string array that specifies which users or teams must sign images.

allow_repos

no

A string array that specifies repos that are to bypass the content trust check, for example, ["docker.io/mirantis/dtr-rethink", "docker.io/mirantis/dtr-registry", ...].
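
A sketch of a trust_configuration section; the team name is a placeholder, and the repository list reuses the example values above:

[trust_configuration]
  require_content_trust = true
  require_signature_from = ["qa-team"]
  allow_repos = ["docker.io/mirantis/dtr-rethink", "docker.io/mirantis/dtr-registry"]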

log_configuration table (optional)

Configures the logging options for MKE components.

Parameter

Required

Description

protocol

no

The protocol to use for remote logging.

Valid values: tcp, udp.

Default: tcp

host

no

Specifies a remote syslog server to receive sent MKE controller logs. If omitted, controller logs are sent through the default Docker daemon logging driver from the ucp-controller container.

level

no

The logging level for MKE components.

Valid values (syslog priority levels): debug, info, notice, warning, err, crit, alert, emerg.
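
A sketch of a log_configuration section that sends controller logs to a remote syslog server; the host value is a placeholder and its format is an assumption:

[log_configuration]
  protocol = "tcp"
  host = "syslog.example.com:514"
  level = "info"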

license_configuration table (optional)

Enables automatic renewal of the MKE license.

Parameter

Required

Description

auto_refresh

no

Set to enable attempted automatic license renewal when the license nears expiration. If disabled, you must manually upload a renewed license after expiration.

Valid values: true, false.

Default: true

custom headers (optional)

Included when you need to set custom API headers. You can repeat this section multiple times to specify multiple separate headers. If you include custom headers, you must specify both name and value.

[[custom_api_server_headers]]

Item

Description

name

Set to specify the name of the custom header with name = "X-Custom-Header-Name".

value

Set to specify the value of the custom header with value = "Custom Header Value".
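
Putting the two items together, a custom header entry in the MKE configuration file might look as follows:

[[custom_api_server_headers]]
  name = "X-Custom-Header-Name"
  value = "Custom Header Value"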

user_workload_defaults (optional)

A map describing default values to set on Swarm services at creation time if those fields are not explicitly set in the service spec.

[user_workload_defaults]

[user_workload_defaults.swarm_defaults]

Parameter

Required

Description

[tasktemplate.restartpolicy.delay]

no

Delay between restart attempts. The value is input in the <number><value type> format. Valid value types include:

  • ns = nanoseconds

  • us = microseconds

  • ms = milliseconds

  • s = seconds

  • m = minutes

  • h = hours

Default: value = "5s"

[tasktemplate.restartpolicy.maxattempts]

no

Maximum number of restarts before giving up.

Default: value = "3"
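
A sketch of how these defaults might be expressed in the MKE configuration file, assuming the nested TOML table layout suggested by the parameter names above; the values shown are the documented defaults:

[user_workload_defaults]

  [user_workload_defaults.swarm_defaults]

    [user_workload_defaults.swarm_defaults.tasktemplate.restartpolicy.delay]
      value = "5s"

    [user_workload_defaults.swarm_defaults.tasktemplate.restartpolicy.maxattempts]
      value = "3"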

cluster_config table (required)

Configures the cluster that the current MKE instance manages.

The dns, dns_opt, and dns_search settings configure the DNS settings for MKE components. These values, when assigned, override the settings in a container /etc/resolv.conf file.

Parameter

Required

Description

controller_port

yes

Sets the port that the ucp-controller monitors.

Default: 443

kube_apiserver_port

yes

Sets the port the Kubernetes API server monitors.

kube_protect_kernel_defaults

no

Protects kernel parameters from being overridden by kubelet.

Default: false.

Important

When enabled, kubelet can fail to start if the following kernel parameters are not properly set on the nodes before you install MKE or before adding a new node to an existing cluster:

vm.panic_on_oom=0
vm.overcommit_memory=1
kernel.panic=10
kernel.panic_on_oops=1
kernel.keys.root_maxkeys=1000000
kernel.keys.root_maxbytes=25000000

For more information, refer to Configure kernel parameters.

kube_api_server_auditing

no

Enables auditing to the log file in the kube-apiserver container.

Important

  • Prior to using kube_api_server_auditing you must first enable auditing in MKE. Refer to Enable MKE audit logging for detailed information.

  • Before you enable the kube_api_server_auditing option, verify that it does not conflict with MKE options that are already set.

For more information, refer to the official Kubernetes documentation Troubleshooting Clusters - Audit backends.

Default: false.

swarm_port

yes

Sets the port that the ucp-swarm-manager monitors.

Default: 2376

swarm_strategy

no

Sets placement strategy for container scheduling. Be aware that this does not affect swarm-mode services.

Valid values: spread, binpack, random.

dns

yes

Array of IP addresses that serve as nameservers.

dns_opt

yes

Array of options in use by DNS resolvers.

dns_search

yes

Array of domain names to search whenever a bare unqualified host name is used inside of a container.

profiling_enabled

no

Determines whether specialized debugging endpoints are enabled for profiling MKE performance.

Valid values: true, false.

Default: false

authz_cache_timeout

no

Sets the timeout in seconds for the RBAC information cache of MKE non-Kubernetes resource listing APIs. Setting changes take immediate effect and do not require a restart of the MKE controller.

Default: 0 (cache is not enabled)

Once you enable the cache, the result of non-Kubernetes resource listing APIs only reflects the latest RBAC changes for the user when the cached RBAC info times out.

kv_timeout

no

Sets the key-value store timeout setting, in milliseconds.

Default: 5000

kv_snapshot_count

no

Sets the key-value store snapshot count.

Default: 20000

external_service_lb

no

Specifies an optional external load balancer for default links to services with exposed ports in the MKE web interface.

cni_installer_url

no

Specifies the URL of a Kubernetes YAML file to use to install a CNI plugin. Only applicable during initial installation. If left empty, the default CNI plugin is put to use.

metrics_retention_time

no

Sets the metrics retention time.

metrics_scrape_interval

no

Sets the interval for how frequently managers gather metrics from nodes in the cluster.

metrics_disk_usage_interval

no

Sets the interval for the gathering of storage metrics, an operation that can become expensive when large volumes are present.

nvidia_device_plugin

no

Enables the nvidia-gpu-device-plugin, which is disabled by default.

rethinkdb_cache_size

no

Sets the size of the cache for MKE RethinkDB servers.

Default: 1GB

Leaving the field empty or specifying auto instructs RethinkDB to automatically determine the cache size.

exclude_server_identity_headers

no

Determines whether the X-Server-Ip and X-Server-Name headers are disabled.

Valid values: true, false.

Default: false

cloud_provider

no

Sets the cloud provider for the Kubernetes cluster.

pod_cidr

yes

Sets the subnet pool from which the CNI IPAM plugin allocates Pod IPs.

Default: 192.168.0.0/16

ipip_mtu

no

Sets the IPIP MTU size for the Calico IPIP tunnel interface.

azure_ip_count

yes

Sets the IP count for Azure allocator to allocate IPs per Azure virtual machine.

service_cluster_ip_range

yes

Sets the subnet pool from which the IP for Services should be allocated.

Default: 10.96.0.0/16

nodeport_range

yes

Sets the port range for Kubernetes services within which the type NodePort can be exposed.

Default: 32768-35535

custom_kube_api_server_flags

no

Sets the configuration options for the Kubernetes API server.

Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.

custom_kube_controller_manager_flags

no

Sets the configuration options for the Kubernetes controller manager.

Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.

custom_kubelet_flags

no

Sets the configuration options for kubelet.

Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.

custom_kubelet_flags_profiles Available since MKE 3.7.10

no

Sets a profile that can be applied to the kubelet agent on any node.

custom_kube_scheduler_flags

no

Sets the configuration options for the Kubernetes scheduler.

Be aware that this parameter function is only for development and testing. Arbitrary Kubernetes configuration parameters are not tested and supported under the MKE Software Support Agreement.

local_volume_collection_mapping

no

Set to store data about collections for volumes in the MKE local KV store instead of on the volume labels. The parameter is used to enforce access control on volumes.

manager_kube_reserved_resources

no

Reserves resources for MKE and Kubernetes components that are running on manager nodes.

worker_kube_reserved_resources

no

Reserves resources for MKE and Kubernetes components that are running on worker nodes.

kubelet_max_pods

yes

Sets the number of Pods that can run on a node.

Maximum: 250

Default: 110

kubelet_pods_per_core

no

Sets the maximum number of Pods per core.

0 indicates that there is no limit on the number of Pods per core. The number cannot exceed the kubelet_max_pods setting.

Recommended: 10

Default: 0

secure_overlay

no

Enables IPSec network encryption in Kubernetes.

Valid values: true, false.

Default: false

image_scan_aggregation_enabled

no

Enables image scan result aggregation. The feature displays image vulnerabilities in shared resource/containers and shared resources/images pages.

Valid values: true, false.

Default: false

swarm_polling_disabled

no

Determines whether resource polling is disabled for both Swarm and Kubernetes resources, which is recommended for production instances.

Valid values: true, false.

Default: false

oidc_client_id

no

Sets the OIDC client ID, using the eNZi service ID that is used in the OIDC authorization flow.

hide_swarm_ui

no

Determines whether the UI is hidden for all Swarm-only object types (has no effect on Admin Settings).

Valid values: true, false.

Default: false

You can also set the parameter using the MKE web UI:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, click the user name drop-down.

  3. Click Admin Settings > Tuning to open the Tuning screen.

  4. Toggle on the Hide Swarm Navigation slider located under the Configure MKE UI heading.

unmanaged_cni

yes

Sets Calico as the CNI provider, managed by MKE. Note that Calico is the default CNI provider.

calico_ebpf_enabled

yes

Enables Calico eBPF mode.

kube_default_drop_masq_bits

yes

Sets the use of Kubernetes default values for iptables drop and masquerade bits.

kube_proxy_mode

yes

Sets the operational mode for kube-proxy.

Valid values: iptables, ipvs, disabled.

Default: iptables

cipher_suites_for_kube_api_server

no

Sets the value for the kube-apiserver --tls-cipher-suites parameter.

cipher_suites_for_kubelet

no

Sets the value for the kubelet --tls-cipher-suites parameter.

cipher_suites_for_etcd_server

no

Sets the value for the etcd server --cipher-suites parameter.

image_prune_schedule

no

Sets the cron expression used for the scheduling of image pruning. The parameter accepts either full crontab specifications or descriptors, but not both.

  • Full crontab specifications, which include <seconds> <minutes> <hours> <day of month> <month> <day of week>. For example, "0 0 0 * * *".

  • Descriptors, which are textual in nature, with a preceding @ symbol. For example: "@midnight" or "@every 1h30m".

Refer to the cron documentation for more information.

cpu_usage_banner_threshold

no

Sets the CPU usage threshold, above which the MKE web UI displays a warning banner.

Default: 20.

cpu_usage_banner_scrape_interval

no

Sets the MKE CPU usage measurement interval, which enables the function of the cpu_usage_banner_threshold option.

Default: "10m".

etcd_storage_quota

no

Sets the etcd storage size limit.

Example values: 4GB, 8GB.

Default value: 2GB.

nvidia_device_partitioner

no

Enables the NVIDIA device partitioner.

Default: true.

kube_api_server_profiling_enabled

no

Enables profiling for the Kubernetes API server.

Default: true.

kube_controller_manager_profiling_enabled

no

Enables profiling for the Kubernetes controller manager.

Default: true.

kube_scheduler_profiling_enabled

no

Enables profiling for the Kubernetes scheduler.

Default: true.

kube_scheduler_bind_to_all

no

Enables kube scheduler to bind to all available network interfaces, rather than just localhost.

Default: false.

use_flex_volume_driver

no

Extends support of FlexVolume drivers, which have been deprecated since the release of MKE 3.4.13.

Default: false.

pubkey_auth_cache_enabled

no

Warning

Implement pubkey_auth_cache_enabled only in cases in which there are certain performance issues in high-load clusters, and only under the guidance of Mirantis Support personnel.

Enables public key authentication cache.

Note

ucp-controller must be restarted for setting changes to take effect.

Default: false.

prometheus_memory_limit

no

The maximum amount of memory that can be used by the Prometheus container.

Default: 2Gi.

prometheus_memory_request

no

The minimum amount of memory reserved for the Prometheus container.

Default: 1Gi.

shared_sans

no

Subject alternative names for manager nodes.

kube_manager_terminated_pod_gc_threshold

no

Allows users to set the threshold for the terminated Pod garbage collector in Kube Controller Manager according to their cluster-specific requirement.

Default: 12500

kube_api_server_request_timeout

no

Timeout for Kube API server requests.

Default: 1m

cadvisor_enabled

no

Enables the ucp-cadvisor component, which runs a standalone cAdvisor instance on each node to provide additional container-level metrics with all expected labels.

Default: false

calico_controller_probes_tuning

no

Enables the user to specify values for the Calico controller liveness and readiness probes.

Default: false.

calico_controller_liveness_probe_failure_threshold

no

Sets the Calico controller liveness probe failure threshold.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_liveness_probe_initial_delay_seconds

no

Sets the Calico controller liveness probe initial delay period in seconds.

Default: -1. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_liveness_probe_period_seconds

no

Sets the Calico controller liveness probe period in seconds.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_liveness_probe_success_threshold

no

Sets the Calico controller liveness probe success threshold.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_liveness_probe_timeout_seconds

no

Sets the Calico controller liveness probe timeout period in seconds.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_readiness_probe_failure_threshold

no

Sets the Calico controller readiness probe failure threshold.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_readiness_probe_initial_delay_seconds

no

Sets the Calico controller readiness probe initial delay period in seconds.

Default: -1. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_readiness_probe_period_seconds

no

Sets the Calico controller readiness probe period in seconds.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_readiness_probe_success_threshold

no

Sets the Calico controller readiness probe success threshold.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

calico_controller_readiness_probe_timeout_seconds

no

Sets the Calico controller readiness probe timeout period in seconds.

Default: 0. The default value is not valid and must be changed to a valid value when calico_controller_probes_tuning is set to true.

kube_api_server_audit_log_maxage

no

Sets the maximum number of days for which to retain old audit log files in Kubernetes API server.

Default: 30.

kube_api_server_audit_log_maxbackup

no

Sets the maximum number of audit log files to retain in the Kubernetes API server.

Default: 10.

kube_api_server_audit_log_maxsize

no

Sets the maximum size, in megabytes, that the audit log file can attain before it is rotated in the Kubernetes API server.

Default: 10.

KubeAPIServerCustomAuditPolicyYaml

no

Specifies a Kubernetes audit logging policy. Refer to https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/ for more information.

KubeAPIServerEnableCustomAuditPolicy

no

Enables the use of a specified custom audit policy yaml file.

Default: false.

node_exporter_port Available since MKE 3.7.10

yes

Sets the listening port for Prometheus Node Exporter.

Default: 9100.
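
To illustrate, a minimal sketch of a cluster_config section that uses the documented default values; the DNS arrays are shown empty, and environment-specific required values such as kube_apiserver_port are omitted:

[cluster_config]
  controller_port = 443
  swarm_port = 2376
  dns = []
  dns_opt = []
  dns_search = []
  pod_cidr = "192.168.0.0/16"
  service_cluster_ip_range = "10.96.0.0/16"
  nodeport_range = "32768-35535"
  kubelet_max_pods = 110
  kube_proxy_mode = "iptables"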

cluster_config.image_prune_whitelist (optional)

Configures the images that you do not want removed by MKE image pruning.

Note

Where possible, use the image ID to specify the image rather than the image name.

Parameter

Required

Description

key

yes

Sets the filter key.

Valid values: dangling, label, before, since, and reference.

For more information, refer to the Docker documentation on Filtering.

value

yes

Sets the filter value.

For more information, refer to the Docker documentation on Filtering.

cluster_config.ingress_controller (optional)

Set the configuration for the NGINX Ingress Controller to manage traffic that originates outside of your cluster (ingress traffic).

Note

Prior versions of MKE use Istio Ingress to manage traffic that originates from outside of the cluster, which employs many of the same parameters as NGINX Ingress Controller.

Parameter

Required

Description

enabled

No

Disables HTTP ingress for Kubernetes.

Valid values: true, false.

Default: false

ingress_num_replicas

No

Sets the number of NGINX Ingress Controller deployment replicas.

Default: 2

ingress_external_ips

No

Sets the list of external IPs for Ingress service.

Default: [] (empty)

ingress_enable_lb

No

Enables an external load balancer.

Valid values: true, false.

Default: false

ingress_preserve_client_ip

No

Enables preserving inbound traffic source IP.

Valid values: true, false.

Default: false

ingress_exposed_ports

No

Sets ports to expose.

For each port, provide arrays that contain the following port information (defaults as displayed):

  • name = http2

  • port = 80

  • target_port = 0

  • node_port = 33000


  • name = https

  • port = 443

  • target_port = 0

  • node_port = 33001


  • name = tcp

  • port = 31400

  • target_port = 0

  • node_port = 33002

ingress_node_affinity

No

Sets node affinity.

  • key = com.docker.ucp.manager

  • value = ""

  • target_port = 0

  • node_port = 0

ingress_node_toleration

No

Sets node toleration.

For each node, provide an array that contains the following information (defaults as displayed):

  • key = com.docker.ucp.manager

  • value = ""

  • operator = Exists

  • effect = NoSchedule

config_map

No

Sets advanced options for the NGINX proxy.

NGINX Ingress Controller uses ConfigMap to configure the NGINX proxy. For the complete list of available options, refer to the NGINX Ingress Controller documentation ConfigMap: configuration options.

Examples:

  • map-hash-bucket-size = "128"

  • ssl-protocols = "SSLv2"

ingress_extra_args.http_port

No

Sets the container port for servicing HTTP traffic.

Default: 80

ingress_extra_args.https_port

No

Sets the container port for servicing HTTPS traffic.

Default: 443

ingress_extra_args.enable_ssl_passthrough

No

Enables SSL passthrough.

Default: false

ingress_extra_args.default_ssl_certificate

No

Sets the Secret that contains an SSL certificate to be used as a default TLS certificate.

Valid value: <namespace>/<name>

cluster_config.metallb_config (optional)

Enable and disable MetalLB for load balancer services in bare metal clusters.

Parameter

Required

Description

enabled

No

Enables MetalLB load balancer for bare metal Kubernetes clusters.

Valid values: true, false.

Default: false.

metallb_ip_addr_pool

No

Adds a list of custom address pool resources. At least one entry is required to enable MetalLB.

Default: [] (empty).

cluster_config.policy_enforcement.gatekeeper (optional)

Enable and disable OPA Gatekeeper for policy enforcement.

Note

By design, when the OPA Gatekeeper is disabled using the configuration file, the Pods are deleted but the policies are not cleaned up. Thus, when the OPA Gatekeeper is re-enabled, the cluster can immediately adopt the existing policies.

The retention of the policies poses no risk, as they are just data on the API server and have no value outside of an OPA Gatekeeper deployment.

Parameter

Required

Description

enabled

No

Enables the Gatekeeper function.

Valid values: true, false.

Default: false.

excluded_namespaces

No

Excludes from the Gatekeeper admission webhook all of the resources that are contained in a list of namespaces. Specify as a comma-separated list.

For example: "kube-system", "gatekeeper-system"

cluster_config.core_dns_lameduck_config (optional)

Available since MKE 3.7.0

Enable and disable lameduck in CoreDNS.

Parameter

Required

Description

enabled

No

Enables the lameduck health function.

Valid values: true, false.

Default: false.

duration

No

Length of time during which lameduck will run, expressed with integers and time suffixes, such as s for seconds and m for minutes.

Note

  • The configured value for duration must be greater than 0s.

  • Default values are applied for any fields that are left blank.

Default: 7s.

Caution

Editing the CoreDNS config map outside of MKE to configure the lameduck function is not supported. Any such attempts will be superseded by the values that are configured in the MKE configuration file.
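
A sketch of a lameduck configuration that uses the documented default duration:

[cluster_config.core_dns_lameduck_config]
  enabled = true
  duration = "7s"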

iSCSI (optional)

Configures iSCSI options for MKE.

Parameter

Required

Description

--storage-iscsi=true

no

Enables iSCSI-based Persistent Volumes in Kubernetes.

Valid values: true, false.

Default: false

--iscsiadm-path=<path>

no

Specifies the path of the iscsiadm binary on the host.

Default: /usr/sbin/iscsiadm

--iscsidb-path=<path>

no

Specifies the path of the iscsi database on the host.

Default: /etc/iscsi

pre_logon_message

Configures a pre-logon message.

Parameter

Required

Description

pre_logon_message

no

Sets a pre-logon message to alert users prior to log in.

backup_schedule_config (optional)

Configures backup scheduling and notifications for MKE.

Parameter

Required

Description

notification-delay

yes

Sets the number of days that elapse before a user is notified that they have not performed a recent backup. Set to -1 to disable notifications.

Default: 7

enabled

yes

Enables backup scheduling.

Valid values: true, false.

Default: false

path

yes

Sets the storage path for scheduled backups. Use chmod o+w /<path> to ensure that other users have write privileges.

no_passphrase

yes

Sets whether a passphrase is necessary to encrypt the TAR file. A value of true negates the use of a passphrase. A non-empty value in the passphrase parameter requires that no_passphrase be set to false.

Default: false

passphrase

yes

Encrypts the TAR file with a passphrase for all scheduled backups. Must remain empty if no_passphrase is set to true.

Do not share the configuration file if a passphrase is used, as the passphrase displays in plain text.

cron_spec

yes

Sets the cron expression in use for scheduling backups. The parameter accepts either full crontab specifications or descriptors, but not both.

  • Full crontab specifications include <seconds> <minutes> <hours> <day of month> <month> <day of week>. For example: "0 0 0 * * *".

  • Descriptors, which are textual in nature, have a preceding @ symbol. For example: "@midnight" or "@every 1h30m".

For more information, refer to the cron documentation.

include_logs

yes

Determines whether a log file is generated in addition to the backup. Refer to backup for more information.

backup_limits

yes

Sets the number of backups to store. Once this number is reached, older backups are deleted. Set to -1 to disable backup rotation.
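
A sketch of a backup schedule that runs nightly, keeps the five most recent backups, and skips TAR file encryption; the storage path shown is a placeholder:

[backup_schedule_config]
  notification-delay = 7
  enabled = true
  path = "/var/mke-backups"
  no_passphrase = true
  passphrase = ""
  cron_spec = "@midnight"
  include_logs = true
  backup_limits = 5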

etcd_cleanup_schedule_config (optional)

Configures scheduling for etcd cleanup for MKE.

Parameter

Required

Description

cleanup_enabled

yes

Enables etcd cleanup scheduling.

Valid values: true, false.

Default: false

min_ttl_to_keep_seconds

no

Minimum Time To Live (TTL) for retaining certain events in etcd.

Default: 0

cron_expression

yes

Sets the cron expression to use for scheduling etcd cleanup operations.

cron_expression accepts either full crontab specifications or descriptors, but not both.

  • Full crontab specifications include <seconds> <minutes> <hours> <day-of-month> <month> <day-of-week>. For example, 0 0 0 * * MON

  • Descriptors, which are textual in nature, have a preceding @ symbol. For example: “@weekly”, “@monthly” or “@every 72h”.

The etcd cleanup operation starts with the deletion of events, followed by the compaction of the etcd revisions. The cleanup scheduling interval must be set to a minimum of 72 hours.

Refer to the official cron documentation for more information.

defrag_enabled

no

Enables defragmentation of the etcd cluster after successful cleanup.

Warning

The etcd cluster defragmentation process can cause temporary performance degradation. To minimize possible impact, schedule cron_expression to occur during off-peak hours or periods of low activity.

Valid values: true, false.

Default: false

defrag_pause_seconds

no

Sets the period of time, in seconds, to pause between issuing defrag commands to etcd members.

Default: 60

defrag_timeout_seconds

no

Sets the period of time, in seconds, that each etcd member is allotted to complete defragmentation. If the defragmentation of a member times out before the process is successfully completed, the entire cluster defragmentation is aborted.

Default: 300
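
As a sketch, an etcd cleanup schedule with defragmentation enabled might look like the following in the MKE configuration file. The section name mirrors the heading above and the values are illustrative:

[etcd_cleanup_schedule_config]
  cleanup_enabled = true
  min_ttl_to_keep_seconds = 0
  cron_expression = "@every 72h"
  defrag_enabled = true
  defrag_pause_seconds = 60
  defrag_timeout_seconds = 300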

windows_gmsa

Configures the use of Windows GMSA credential specifications.

Parameter

Required

Description

windows_gmsa

no

Allows the creation of GMSA credential specifications for the Kubernetes cluster, and automatically populates the full credential specification for any Pod that references a GMSA credential specification in its security context.

The schema for the GMSA credential specification that MKE uses is publicly documented at https://github.com/kubernetes-sigs/windows-gmsa/blob/master/charts/gmsa/templates/credentialspec.yaml.

For information on how to enable GMSA and how to obtain different components of the GMSA specification for one or more GMSA accounts in your domain, refer to the official Windows documentation.
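
As a minimal sketch, enabling the feature in the MKE configuration file amounts to setting the parameter to true. Its placement under the cluster configuration section is an assumption, so match it to where the parameter appears in your exported configuration file:

[cluster_config]
  # Placement of this key is assumed; verify against your exported configuration.
  windows_gmsa = true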

Scale an MKE cluster

By adding or removing nodes from the MKE cluster, you can horizontally scale MKE to fit your needs as your applications grow in size and use.

Scale using the MKE web UI

For detail on how to use the MKE web UI to scale your cluster, refer to Join Linux nodes or Join Windows worker nodes, depending on which operating system you use. In particular, these topics offer information on adding nodes to a cluster and configuring node availability.

Scale using the CLI

You can also use the command line to perform all scaling operations.

Scale operation

Command

Obtain the join token

Run the following command on a manager node to obtain the join token that is required for cluster scaling. Use either worker or manager for the <node-type>:

docker swarm join-token <node-type>

Configure a custom listen address

Specify the address and port where the new node listens for inbound cluster management traffic:

docker swarm join \
   --token  SWMTKN-1-2o5ra9t7022neymg4u15f3jjfh0qh3yof817nunoioxa9i7lsp-dkmt01ebwp2m0wce1u31h6lmj \
   --listen-addr 234.234.234.234 \
   192.168.99.100:2377

Verify node addition

Once your node is added, run the following command on a manager node to verify its presence:

docker node ls

Set node availability state

Use the --availability option to set node availability, indicating active, pause, or drain:

docker node update --availability <availability-state> <node-hostname>

Remove the node

docker node rm <node-hostname>

Configure KMS plugin for MKE

Mirantis Kubernetes Engine (MKE) offers support for a Key Management Service (KMS) plugin that allows access to third-party secrets management solutions, such as Vault. MKE uses this plugin to facilitate access from Kubernetes clusters.

MKE will not health check, clean up, or otherwise manage the KMS plugin. Thus, you must deploy KMS before a machine becomes an MKE manager, or else it may be considered unhealthy.

Configuration

Use MKE to configure the KMS plugin. MKE maintains ownership of the Kubernetes EncryptionConfig file, where the KMS plugin is configured for Kubernetes. MKE does not check the file contents following deployment.

MKE adds new configuration options to the cluster configuration table. Configuration of these options takes place through the API and not the MKE web UI.

The following table presents the configuration options for the KMS plugin, all of which are optional.

Parameter

Type

Description

kms_enabled

bool

Sets MKE to configure a KMS plugin.

kms_name

string

Name of the KMS plugin resource (for example, vault).

kms_endpoint

string

Path of the KMS plugin socket. The path must refer to a UNIX socket on the host (for example, /tmp/socketfile.sock). MKE bind mounts this file to make it accessible to the API server.

kms_cachesize

int

Number of data encryption keys (DEKs) to cache in the clear.
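
As a sketch, the KMS plugin options might be set in the cluster configuration section of the MKE configuration file as follows and then applied through the configuration API; the values shown are illustrative:

[cluster_config]
  kms_enabled = true
  kms_name = "vault"
  kms_endpoint = "/tmp/socketfile.sock"
  kms_cachesize = 1000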

Use a local node network in a swarm

Mirantis Kubernetes Engine (MKE) can use local network drivers to orchestrate your cluster. You can create a config network with a driver such as MAC VLAN, and use this network in the same way as any other named network in MKE. In addition, if the network is created as attachable, you can attach containers to it.

Warning

Encrypting communication between containers on different nodes only works with overlay networks.

Create node-specific networks with MKE

To create a node-specific network for use with MKE, always do so through MKE, using either the MKE web UI or the CLI with an admin bundle. If you create such a network without MKE, it will not have the correct access label and it will not be available in MKE.
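
As a sketch, assuming a loaded admin client bundle and eth0 as the parent interface, a node-specific MAC VLAN setup from the CLI involves creating a config-only network on the target node and then a swarm-scoped network that references it; the names, subnet, and interface below are illustrative:

docker network create --config-only \
  --subnet 192.168.20.0/24 --gateway 192.168.20.1 \
  -o parent=eth0 macvlan-config

docker network create -d macvlan --scope swarm --attachable \
  --config-from macvlan-config macvlan-net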

Create a MAC VLAN network
  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation menu, click Swarm > Networks.

  3. Click Create to call the Create Network screen.

  4. Select macvlan from the Drivers dropdown.

  5. Enter macvlan into the Name field.

  6. Select the type of network to create, Network or Local Config.

    • If you select Local Config, the SCOPE is automatically set to Local. You then select the nodes for which to create the Local Config from those listed. MKE prefixes the network name with the node name for each selected node to ensure consistent application of access labels, and you then select a Collection for the Local Configs to reside in. All Local Configs with the same name must be in the same collection, or MKE returns an error. If you do not select a Collection, the network is placed in your default collection, which is / in a new MKE installation.

    • If you select Network, the SCOPE is automatically set to Swarm. Choose an existing Local Config from which to create the network. The network and its labels and collection placement are inherited from the related Local Configs.

  7. Optional. Configure IPAM.

  8. Click Create.

Manage MKE certificate authorities

MKE deploys three certificate authority (CA) servers: MKE Cluster Root CA, MKE etcd Root CA, and MKE Client Root CA.

MKE Cluster Root CA

Available since MKE 3.7.0

The self-deployed MKE Cluster Root CA server issues certificates for MKE cluster nodes and internal components that enable the components to communicate with each other. The server also issues certificates that are used in admin client bundles.

To rotate the certificate material of the MKE Cluster Root CA or provide your own certificate and private key:

Caution

  • If there are unhealthy nodes in the cluster, CA rotation will be unable to complete. If rotation seems to be hanging, run docker node ls --format "{{.ID}} {{.Hostname}} {{.Status}} {{.TLSStatus}}" to determine whether any nodes are down or are otherwise unable to rotate TLS certificates.

  • MKE Cluster Root CA server is coupled with Docker Swarm Root CA, as MKE nodes are also swarm nodes. Thus, to rotate the Docker Swarm Root CA certificate, do not use the docker swarm ca command in any form, as doing so may break your MKE cluster.

  • Rotating MKE Cluster Root CA causes several MKE components to restart, which can result in cluster downtime. As such, Mirantis recommends performing such rotations outside of peak business hours.

  • Rotate the MKE Cluster Root CA certificate only for reasons of security, for example, if the certificate has been compromised. The MKE Cluster Root CA certificate is valid for 20 years, so rotation is typically not necessary.

You must use the MKE CLI to rotate the existing root CA certificate or to provide your own root CA certificate and private key:

  1. SSH into one of the manager nodes of your cluster.

  2. Make a backup prior to making changes to MKE Cluster Root CA.

    Warning

    In the event of a failure, and if troubleshooting procedures do not work, you may need to restore the cluster using the backup.

  3. Use the mirantis/ucp:3.x.y image to run the ca --cluster command with the desired options.

    docker container run -it --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.x.y \
      ca --cluster <command-options>
    

    Note

    You can use the --rotate flag to automatically regenerate root CA material.

    You can provide your own root CA certificate and private key by bind-mounting them to the CLI container at /ca/cert.pem and /ca/key.pem, respectively.

    • The certificate must be a self-signed root certificate, and intermediate certificates are not allowed.

    • The MKE Cluster Root CA certificate must have a common name that is equivalent to swarm-ca.

    • The certificate and key must be in PEM format without a passphrase.

    docker container run -it --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /path/to/cert.pem:/ca/cert.pem \
      -v /path/to/key.pem:/ca/key.pem \
      mirantis/ucp:3.x.y \
      ca --cluster
    
MKE etcd Root CA

Available since MKE 3.7.2

The self-deployed MKE etcd Root CA server issues certificates for MKE components that enable the components to communicate with the etcd cluster.

Important

If you upgraded your cluster from any version of MKE prior to MKE 3.7.2, the etcd root CA will not be unique. To ensure the uniqueness of the etcd root CA, rotate the etcd CA material using the instructions herein.

To rotate the certificate material of the MKE etcd Root CA or provide your own certificate and private key:

Caution

  • Rotating MKE etcd Root CA causes several MKE components to restart, which can result in cluster downtime. As such, Mirantis recommends performing such rotations outside of peak business hours.

  • Other than for the aforementioned purpose of ensuring the uniqueness of the etcd root CA, rotate the MKE etcd Root CA certificate only for reasons of security, for example, if the certificate has been compromised. The MKE etcd Root CA certificate is valid for 20 years, so rotation is typically not necessary.

You must use the MKE CLI to rotate the existing root CA certificate and private key:

  1. SSH into one of the manager nodes of your cluster.

  2. Make a backup prior to making changes to MKE etcd Root CA.

    Warning

    In the event of a failure, and if troubleshooting procedures do not work, you may need to restore the cluster using the backup.

  3. Use the mirantis/ucp:3.x.y image to run the ca --etcd command with the desired options.

    docker container run -it --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.x.y \
      ca --etcd <command-options>
    

    Note

    You can use the --rotate flag to automatically regenerate root CA material.

    You can provide your own root CA certificate and private key by bind-mounting them to the CLI container at /ca/cert.pem and /ca/key.pem, respectively.

    • The certificate must be a self-signed root certificate, and intermediate certificates are not allowed.

    • The MKE etcd Root CA certificate must have a common name that is equivalent to MKE etcd Root CA.

    • The certificate and key must be in PEM format without a passphrase.

    docker container run -it --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /path/to/cert.pem:/ca/cert.pem \
      -v /path/to/key.pem:/ca/key.pem \
      mirantis/ucp:3.x.y \
      ca --etcd
    
MKE Client Root CA

Available since MKE 3.7.0

MKE deploys the MKE Client Root CA server to act as the default signer of the Kubernetes Controller Manager, while also signing TLS certificates for non-admin client bundles. In addition, this CA server is used by default when accessing the MKE API over HTTPS.

Note

To replace the MKE Client Root CA server with an external CA for MKE API use only, refer to Use your own TLS certificates.

To rotate the existing root CA certificate or provide your own certificate and private key:

Caution

  • As rotating the MKE Client Root CA invalidates all previously created non-admin client bundles, you will need to recreate these bundles following the rotation.

  • Rotate the MKE Client Root CA certificate only for reasons of security, for example, if the certificate has been compromised. The MKE Client Root CA certificate is valid for 20 years, so rotation is typically not necessary.

You must use the MKE CLI to rotate the existing root CA certificate or to provide your own root CA certificate and private key:

  1. SSH into one of the manager nodes of your cluster.

  2. Make a backup prior to making changes to MKE Client Root CA.

    Warning

    In the event of a failure, and if troubleshooting procedures do not work, you may need to restore the cluster using the backup.

  3. Use the mirantis/ucp:3.x.y image to run the ca --client command with the desired options:

    docker container run -it --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.x.y \
      ca --client <command-options>
    

    Note

    You can use the --rotate flag to automatically regenerate root CA material.

    You can provide your own root CA certificate and private key by bind-mounting them to the CLI container at /ca/cert.pem and /ca/key.pem, respectively.

    • The certificate must be a self-signed root certificate, and intermediate certificates are not allowed.

    • The MKE Client Root CA certificate must have UCP Client Root CA as its common name.

    • The certificate and key must be in PEM format without a passphrase.

    docker container run -it --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /path/to/cert.pem:/ca/cert.pem \
      -v /path/to/key.pem:/ca/key.pem \
      mirantis/ucp:3.x.y \
      ca --client
    

Use your own TLS certificates

To ensure all communications between clients and MKE are encrypted, all MKE services are exposed using HTTPS. By default, this is done using self-signed TLS certificates that are not trusted by client tools such as web browsers. Thus, when you try to access MKE, your browser warns that it does not trust MKE or that MKE has an invalid certificate.

You can configure MKE to use your own TLS certificates. As a result, your browser and other client tools will trust your MKE installation.

Mirantis recommends that you make this change outside of peak business hours. Your applications will continue to run normally, but existing MKE client certificates will become invalid, and thus users will have to download new certificates to access MKE from the CLI.


To configure MKE to use your own TLS certificates and keys:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Certificates.

  3. Upload your certificates and keys based on the following table.

    Note

    All keys and certificates must be uploaded in PEM format.

    Type

    Description

    Private key

    The unencrypted private key for MKE. This key must correspond to the public key used in the server certificate. This key does not use a password.

    Click Upload Key to upload a PEM file.

    Server certificate

    The MKE public key certificate, which establishes a chain of trust up to the root CA certificate. It is followed by the certificates of any intermediate certificate authorities.

    Click Upload Certificate to upload a PEM file.

    CA certificate

    The public key certificate of the root certificate authority that issued the MKE server certificate. If you do not have a CA certificate, use the top-most intermediate certificate instead.

    Click Upload CA Certificate to upload a PEM file.

    Client CA

    This field may contain one or more Root CA certificates that the MKE controller uses to verify that client certificates are issued by a trusted entity.

    Click Upload CA Certificate to upload a PEM file.

    Click Download MKE Server CA Certificate to download the certificate as a PEM file.

    Note

    MKE is automatically configured to trust its internal CAs, which issue client certificates as part of generated client bundles. However, you may supply MKE with additional custom root CA certificates using this field to enable MKE to trust the client certificates issued by your corporate or trusted third-party certificate authorities. Note that your custom root certificates will be appended to MKE internal root CA certificates.

  4. Click Save.

After replacing the TLS certificates, your users will not be able to authenticate with their old client certificate bundles. Ask your users to access the MKE web UI and download new client certificate bundles.

Finally, Mirantis Secure Registry (MSR) deployments must be reconfigured to trust the new MKE TLS certificates. To do this, MSR 3.1.x users can refer to Add a custom TLS certificate, MSR 3.0.x users to Add a custom TLS certificate, and MSR 2.9.x users to Add a custom TLS certificate.

Manage and deploy private images

Mirantis offers its own image registry, Mirantis Secure Registry (MSR), which you can use to store and manage the images that you deploy to your cluster. This topic describes how to use MKE to push the official WordPress image to MSR and later deploy that image to your cluster.


To create an MSR image repository:

  1. Log in to the MKE web UI.

  2. From the left-side navigation panel, navigate to <user name> > Admin Settings > Mirantis Secure Registry.

  3. In the Installed MSRs section, capture the MSR URL for your cluster.

  4. In a new browser tab, navigate to the MSR URL captured in the previous step.

  5. From the left-side navigation panel, click Repositories.

  6. Click New repository.

  7. In the namespace field under New Repository, select the required namespace. The default namespace is your user name.

  8. In the name field under New Repository, enter the name wordpress.

  9. To create the repository, click Save.


To push an image to MSR:

In this example, you will pull the official WordPress image from Docker Hub, tag it, and push it to MSR. Once pushed to MSR, only authorized users will be able to make changes to the image. Pushing to MSR requires CLI access to a licensed MSR installation.

  1. Pull the public WordPress image from Docker Hub:

    docker pull wordpress
    
  2. Tag the image, using the IP address or DNS name of your MSR instance. For example:

    docker tag wordpress:latest <msr-url>:<port>/<namespace>/wordpress:latest
    
  3. Log in to an MKE manager node.

  4. Push the tagged image to MSR:

    docker image push <msr-url>:<port>/admin/wordpress:latest
    
  5. Verify that the image is stored in your MSR repository:

    1. Log in to the MSR web UI.

    2. In the left-side navigation panel, click Repositories.

    3. Click admin/wordpress to open the repo.

    4. Click the Tags tab to view the stored images.

    5. Verify that the latest tag is present.


To deploy the private image to MKE:

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, click Kubernetes.

  3. Click Create to open the Create Kubernetes Object page.

  4. In the Namespace dropdown, select default.

  5. In the Object YAML editor, paste the following Deployment object YAML:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: wordpress-deployment
    spec:
      selector:
        matchLabels:
          app: wordpress
      replicas: 2
      template:
        metadata:
          labels:
            app: wordpress
        spec:
          containers:
            - name: wordpress
              image: 52.10.217.20:444/admin/wordpress:latest
              ports:
                - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: wordpress-service
      labels:
        app: wordpress
    spec:
      type: NodePort
      ports:
        - port: 80
          nodePort: 32768
      selector:
        app: wordpress
    

    The Deployment object YAML specifies your MSR image in the Pod template spec: image: <msr-url>:<port>/admin/wordpress:latest. Also, the YAML file defines a NodePort service that exposes the WordPress application so that it is accessible from outside the cluster.

  6. Click Create. Creating the new Kubernetes objects will open the Controllers page.

  7. After a few seconds, verify that wordpress-deployment has a green status icon and is thus successfully deployed.

Set the node orchestrator

When you add a node to your cluster, by default its workloads are managed by Swarm. Changing the default orchestrator does not affect existing nodes in the cluster. You can also change the orchestrator type for individual nodes in the cluster.

Select the node orchestrator

The workloads on your cluster can be scheduled by Kubernetes, Swarm, or a combination of the two. If you choose to run a mixed cluster, be aware that different orchestrators are not aware of each other, and thus there is no coordination between them.

Mirantis recommends that you decide which orchestrator you will use when initially setting up your cluster. Once you start deploying workloads, avoid changing the orchestrator setting. If you do change the node orchestrator, your workloads will be evicted and you will need to deploy them again using the new orchestrator.

Caution

When you promote a worker node to be a manager, its orchestrator type automatically changes to Mixed. If you later demote that node to be a worker, its orchestrator type remains as Mixed.

Note

The default behavior for Mirantis Secure Registry (MSR) nodes is to run in the Mixed orchestration mode. If you change the MSR orchestrator type to Swarm or Kubernetes only, reconciliation will revert the node back to the Mixed mode.

Changing a node orchestrator

When you change the node orchestrator, existing workloads are evicted and they are not automatically migrated to the new orchestrator. You must manually migrate them to the new orchestrator. For example, if you deploy WordPress on Swarm, and you change the node orchestrator to Kubernetes, MKE does not migrate the workload, and WordPress continues running on Swarm. You must manually migrate your WordPress deployment to Kubernetes.

The following table summarizes the results of changing a node orchestrator.

Workload

Orchestrator-related change

Containers

Containers continue running on the node.

Docker service

The node is drained and tasks are rescheduled to another node.

Pods and other imperative resources

Imperative resources continue running on the node.

Deployments and other declarative resources

New declarative resources will not be scheduled on the node and existing ones will be rescheduled at a time that can vary based on resource details.

If a node is running containers and you change the node to Kubernetes, the containers will continue running and Kubernetes will not be aware of them. This is functionally the same as running the node in the Mixed mode.

Warning

The Mixed mode is not intended for production use and it may impact the existing workloads on the node.

This is because the two orchestrator types have different views of the node resources and they are not aware of the other orchestrator resources. One orchestrator can schedule a workload without knowing that the node resources are already committed to another workload that was scheduled by the other orchestrator. When this happens, the node can run out of memory or other resources.

Mirantis strongly recommends against using the Mixed mode in production environments.

Change the node orchestrator

This topic describes how to set the default orchestrator and change the orchestrator for individual nodes.

Set the default orchestrator

To set the default orchestrator using the MKE web UI:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Orchestration.

  3. Under Scheduler, select the required default orchestrator.

  4. Click Save.

New workloads will now be scheduled by the specified orchestrator type. Existing nodes in the cluster are not affected.

Once a node is joined to the cluster, you can change the orchestrator that schedules its workloads.


To set the default orchestrator using the MKE configuration file:

  1. Obtain the current MKE configuration file for your cluster.

  2. Set default_node_orchestrator to "swarm" or "kubernetes".

  3. Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.
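
For reference, the relevant line in the configuration file might look like the following minimal sketch; its placement under [cluster_config] is an assumption, so match it to where the parameter already appears in your exported file:

[cluster_config]
  default_node_orchestrator = "kubernetes"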

Change the node orchestrator

To change the node orchestrator using the MKE web UI:

  1. Log in to the MKE web UI as an administrator.

  2. From the left-side navigation panel, navigate to Shared Resources > Nodes.

  3. Click the node that you want to assign to a different orchestrator.

  4. In the upper right, click the Edit Node icon.

  5. In the Details pane, in the Role section under ORCHESTRATOR TYPE, select either Swarm, Kubernetes, or Mixed.

    Warning

    Mirantis strongly recommends against using the Mixed mode in production environments.

  6. Click Save to assign the node to the selected orchestrator.


To change the node orchestrator using the CLI:

Set the orchestrator on a node by setting the orchestrator labels com.docker.ucp.orchestrator.swarm or com.docker.ucp.orchestrator.kubernetes to true.

  1. Change the node orchestrator. Select from the following options:

    • Schedule Swarm workloads on a node:

      docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
      
    • Schedule Kubernetes workloads on a node:

      docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
      
    • Schedule both Kubernetes and Swarm workloads on a node:

      docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
      docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
      

      Warning

      Mirantis strongly recommends against using the Mixed mode in production environments.

    • Change the orchestrator type for a node from Swarm to Kubernetes:

      docker node update --label-add com.docker.ucp.orchestrator.kubernetes=true <node-id>
      docker node update --label-rm com.docker.ucp.orchestrator.swarm <node-id>
      
    • Change the orchestrator type for a node from Kubernetes to Swarm:

      docker node update --label-add com.docker.ucp.orchestrator.swarm=true <node-id>
      docker node update --label-rm com.docker.ucp.orchestrator.kubernetes <node-id>
      

    Note

    You must first add the target orchestrator label and then remove the old orchestrator label. Doing this in the reverse order can fail to change the orchestrator.

  2. Verify the value of the orchestrator label by inspecting the node:

    docker node inspect <node-id> | grep -i orchestrator
    

    Example output:

    "com.docker.ucp.orchestrator.kubernetes": "true"
    

Important

The com.docker.ucp.orchestrator label is not displayed in the MKE web UI Labels list, which appears in the Overview pane for each node.

View Kubernetes objects in a namespace

MKE administrators can filter the view of Kubernetes objects by the namespace that the objects are assigned to, specifying a single namespace or all available namespaces. This topic describes how to deploy services to two newly created namespaces and then view those services, filtered by namespace.


To create two namespaces:

  1. Log in to the MKE web UI as an administrator.

  2. From the left-side navigation panel, click Kubernetes.

  3. Click Create to open the Create Kubernetes Object page.

  4. Leave the Namespace drop-down blank.

  5. In the Object YAML editor, paste the following YAML code:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: blue
    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: green
    
  6. Click Create to create the blue and green namespaces.


To deploy services:

  1. Create a NodePort service in the blue namespace:

    1. From the left-side navigation panel, navigate to Kubernetes > Create.

    2. In the Namespace drop-down, select blue.

    3. In the Object YAML editor, paste the following YAML code:

      apiVersion: v1
      kind: Service
      metadata:
        name: app-service-blue
        labels:
          app: app-blue
      spec:
        type: NodePort
        ports:
          - port: 80
            nodePort: 32768
        selector:
          app: app-blue
      
    4. Click Create to deploy the service in the blue namespace.

  2. Create a NodePort service in the green namespace:

    1. From the left-side navigation panel, navigate to Kubernetes > Create.

    2. In the Namespace drop-down, select green.

    3. In the Object YAML editor, paste the following YAML code:

      apiVersion: v1
      kind: Service
      metadata:
        name: app-service-green
        labels:
          app: app-green
      spec:
        type: NodePort
        ports:
          - port: 80
            nodePort: 32769
        selector:
          app: app-green
      
    4. Click Create to deploy the service in the green namespace.


To view the newly created services:

  1. In the left-side navigation panel, click Namespaces.

  2. In the upper-right corner, click the Set context for all namespaces toggle. The indicator in the left-side navigation panel under Namespaces changes to All Namespaces.

  3. Click Services to view your services.


Filter the view by namespace:

  1. In the left-side navigation panel, click Namespaces.

  2. Hover over the blue namespace and click Set Context. The indicator in the left-side navigation panel under Namespaces changes to blue.

  3. Click Services to view the app-service-blue service. Note that the app-service-green service does not display.

Perform the foregoing steps on the green namespace to view only the services deployed in the green namespace.

Join Nodes

Set up high availability

MKE is designed to facilitate high availability (HA). You can join multiple manager nodes to the cluster, so that if one manager node fails, another one can automatically take its place without impacting the cluster.

Including multiple manager nodes in your cluster allows you to handle manager node failures and load-balance user requests across all manager nodes.

The following shows the relationship between the number of manager nodes used and the number of failures that your cluster can tolerate:

  • 1 manager node: 0 failures tolerated

  • 3 manager nodes: 1 failure tolerated

  • 5 manager nodes: 2 failures tolerated

For deployment into production environments, follow these best practices:

  • For HA with minimal network overhead, Mirantis recommends using three manager nodes and a maximum of five. Adding more manager nodes than this can lead to performance degradation, as configuration changes must be replicated across all manager nodes.

  • You should bring failed manager nodes back online as soon as possible, as each failed manager node decreases the number of failures that your cluster can tolerate.

  • You should distribute your manager nodes across different availability zones. This way your cluster can continue working even if an entire availability zone goes down.

Join Linux nodes

MKE allows you to add or remove nodes from your cluster as your needs change over time.

Because MKE leverages the clustering functionality provided by Mirantis Container Runtime (MCR), you use the docker swarm join command to add more nodes to your cluster. When you join a new node, MCR services start running on the node automatically.

You can add both Linux manager and worker nodes to your cluster.

Join a node to the cluster

Important

Prior to adding a node that was previously a part of the same MKE cluster or a different one, you must run the following command to remove any stale MKE volumes:

docker volume rm `docker volume list --filter name=ucp* -q`

Next, run the following command to verify the removal of the stale volumes:

docker volume list --filter name=ucp*

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes.

  3. Click Add Node.

  4. Select Linux for the node type.

  5. Select either Manager or Worker, as required.

  6. Optional. Select Use a custom listen address to specify the address and port where the new node listens for inbound cluster management traffic.

  7. Optional. Select Use a custom advertise address to specify the IP address that is advertised to all members of the cluster for API access.

  8. Copy the displayed command, which looks similar to the following:

    docker swarm join --token <token> <mke-node-ip>
    
  9. Use SSH to log in to the host that you want to join to the cluster.

  10. Run the docker swarm join command captured previously.

    The node will display in the Shared Resources > Nodes page.

Pause or drain a node

Note

You can pause or drain a node only with swarm workloads.

You can configure the availability of a node so that it is in one of the following three states:

Active

The node can receive and execute tasks.

Paused

The node continues running existing tasks, but does not receive new tasks.

Drained

Existing tasks are stopped, while replica tasks are launched in active nodes. The node does not receive new tasks.


To pause or drain a node:

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.

  3. In the Details pane, click Configure and select Details to open the Edit Node page.

  4. In the upper right, select the Edit Node icon.

  5. In the Availability section, click Active, Pause, or Drain.

  6. Click Save.

Promote or demote a node

You can promote worker nodes to managers to make MKE fault tolerant. You can also demote a manager node into a worker node.

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.

  3. In the upper right, select the Edit Node icon.

  4. In the Role section, click Manager or Worker.

  5. Click Save and wait until the operation completes.

  6. Navigate to Shared Resources > Nodes and verify the new node role.

Note

If you are load balancing user requests to MKE across multiple manager nodes, you must remove these nodes from the load-balancing pool when demoting them to workers.

Remove a node from the cluster

To remove an inaccessible worker node or one that is down:

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.

  3. In the upper right, select the vertical ellipsis and click Remove.

  4. When prompted, click Confirm.


To remove an inactive worker node:

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.

  3. Drain the node, to ensure that the workload is scheduled to another node.

  4. Click the vertical ellipsis in the upper right and select Force Remove.

  5. When prompted, click Confirm.


To remove a manager node:

  1. Verify that all nodes in the cluster are healthy.

    Warning

    Do not remove a manager node unless all nodes in the cluster are healthy.

  2. Demote the manager to a worker node.

  3. Remove the newly-demoted worker from the cluster, as described in the preceding steps.

Join Windows worker nodes

MKE allows you to add or remove nodes from your cluster as your needs change over time.

Because MKE leverages the clustering functionality provided by Mirantis Container Runtime (MCR), you use the docker swarm join command to add more nodes to your cluster. When you join a new node, MCR services start running on the node automatically.

MKE supports running worker nodes on Windows Server. You must run all manager nodes on Linux.

Windows nodes limitations

The following features are not yet supported using Windows Server:

Category

Feature

Networking

Encrypted networks are not supported. If you have upgraded from a previous version of MKE, you will need to recreate an unencrypted version of the ucp-hrm network.

Secrets

  • When using secrets with Windows services, Windows stores temporary secret files on your disk. You can use BitLocker on the volume containing the Docker root directory to encrypt the secret data at rest.

  • When creating a service that uses Windows containers, the options to specify UID, GID, and mode are not supported for secrets. Secrets are only accessible by administrators and users with system access within the container.

Mounts

On Windows, Docker cannot listen on a Unix socket. Use TCP or a named pipe instead.

Configure the Docker daemon for Windows nodes

Note

If the cluster is deployed in a site that is offline, sideload MKE images onto the Windows Server nodes. For more information, refer to Install MKE offline.

  1. On a manager node, list the images that are required on Windows nodes:

    docker container run --rm -v /var/run/docker.sock:/var/run/docker.sock mirantis/ucp:3.7.16 images --list --enable-windows
    

    Example output:

    mirantis/ucp-agent-win:3.7.16
    mirantis/ucp-dsinfo-win:3.7.16
    
  2. Pull the required images. For example:

    docker image pull mirantis/ucp-agent-win:3.7.16
    docker image pull mirantis/ucp-dsinfo-win:3.7.16
    
Join Windows nodes to the cluster
  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes.

  3. Click Add Node.

  4. Select Windows for the node type.

  5. Optional. Select Use a custom listen address to specify the address and port where the new node listens for inbound cluster management traffic.

  6. Optional. Select Use a custom advertise address to specify the IP address that is advertised to all members of the cluster for API access.

  7. Copy the displayed command, which looks similar to the following:

    docker swarm join --token <token> <mke-worker-ip>
    

    Alternatively, you can use the command line to obtain the join token. Using your MKE client bundle, run:

    docker swarm join-token worker
    
  8. Run the docker swarm join command captured in the previous step on each instance of Windows Server that will be a worker node.

Use a load balancer

After joining multiple manager nodes for high availability (HA), you can configure your own load balancer to balance user requests across all manager nodes.

Use of a load balancer allows users to access MKE using a centralized domain name. The load balancer can detect when a manager node fails and stop forwarding requests to that node, so that users are unaffected by the failure.

Configure load balancing on MKE
  1. Because MKE uses TLS, do the following when configuring your load balancer:

    • Load-balance TCP traffic on ports 443 and 6443.

    • Do not terminate HTTPS connections.

    • On each manager node, use the /_ping endpoint to verify whether the node is healthy and whether or not it should remain in the load balancing pool.

  2. Use the following examples to configure your load balancer for MKE:

    user  nginx;
       worker_processes  1;
    
       error_log  /var/log/nginx/error.log warn;
       pid        /var/run/nginx.pid;
    
       events {
          worker_connections  1024;
       }
    
       stream {
          upstream ucp_443 {
             server <UCP_MANAGER_1_IP>:443 max_fails=2 fail_timeout=30s;
             server <UCP_MANAGER_2_IP>:443 max_fails=2 fail_timeout=30s;
             server <UCP_MANAGER_N_IP>:443 max_fails=2 fail_timeout=30s;
          }
          server {
             listen 443;
             proxy_pass ucp_443;
          }
          upstream ucp_6443 {
             server <UCP_MANAGER_1_IP>:6443 max_fails=2 fail_timeout=30s;
             server <UCP_MANAGER_2_IP>:6443 max_fails=2 fail_timeout=30s;
             server <UCP_MANAGER_N_IP>:6443 max_fails=2 fail_timeout=30s;
          }
          server {
             listen 6443;
             proxy_pass ucp_6443;
          }
       }
    
    global
          log /dev/log    local0
          log /dev/log    local1 notice
    
       defaults
             mode    tcp
             option  dontlognull
             timeout connect     5s
             timeout client      50s
             timeout server      50s
             timeout tunnel      1h
             timeout client-fin  50s
       ### frontends
       # Optional HAProxy Stats Page accessible at http://<host-ip>:8181/haproxy?stats
       frontend ucp_stats
             mode http
             bind 0.0.0.0:8181
             default_backend ucp_stats
       frontend ucp_443
             mode tcp
             bind 0.0.0.0:443
             default_backend ucp_upstream_servers_443
       frontend ucp_6443
             mode tcp
             bind 0.0.0.0:6443
             default_backend ucp_upstream_servers_6443
       ### backends
       backend ucp_stats
             mode http
             option httplog
             stats enable
             stats admin if TRUE
             stats refresh 5m
       backend ucp_upstream_servers_443
             mode tcp
             option httpchk GET /_ping HTTP/1.1\r\nHost:\ <UCP_FQDN>
             server node01 <UCP_MANAGER_1_IP>:443 weight 100 check check-ssl verify none
             server node02 <UCP_MANAGER_2_IP>:443 weight 100 check check-ssl verify none
             server node03 <UCP_MANAGER_N_IP>:443 weight 100 check check-ssl verify none
       backend ucp_upstream_servers_6443
             mode tcp
             option httpchk GET /_ping HTTP/1.1\r\nHost:\ <UCP_FQDN>
             server node01 <UCP_MANAGER_1_IP>:6443 weight 100 check check-ssl verify none
             server node02 <UCP_MANAGER_2_IP>:6443 weight 100 check check-ssl verify none
             server node03 <UCP_MANAGER_N_IP>:6443 weight 100 check check-ssl verify none
    
    {
          "Subnets": [
             "subnet-XXXXXXXX",
             "subnet-YYYYYYYY",
             "subnet-ZZZZZZZZ"
          ],
          "CanonicalHostedZoneNameID": "XXXXXXXXXXX",
          "CanonicalHostedZoneName": "XXXXXXXXX.us-west-XXX.elb.amazonaws.com",
          "ListenerDescriptions": [
             {
                   "Listener": {
                      "InstancePort": 443,
                      "LoadBalancerPort": 443,
                      "Protocol": "TCP",
                      "InstanceProtocol": "TCP"
                   },
                   "PolicyNames": []
             },
             {
                   "Listener": {
                      "InstancePort": 6443,
                      "LoadBalancerPort": 6443,
                      "Protocol": "TCP",
                      "InstanceProtocol": "TCP"
                   },
                   "PolicyNames": []
             }
          ],
          "HealthCheck": {
             "HealthyThreshold": 2,
             "Interval": 10,
             "Target": "HTTPS:443/_ping",
             "Timeout": 2,
             "UnhealthyThreshold": 4
          },
          "VPCId": "vpc-XXXXXX",
          "BackendServerDescriptions": [],
          "Instances": [
             {
                   "InstanceId": "i-XXXXXXXXX"
             },
             {
                   "InstanceId": "i-XXXXXXXXX"
             },
             {
                   "InstanceId": "i-XXXXXXXXX"
             }
          ],
          "DNSName": "XXXXXXXXXXXX.us-west-2.elb.amazonaws.com",
          "SecurityGroups": [
             "sg-XXXXXXXXX"
          ],
          "Policies": {
             "LBCookieStickinessPolicies": [],
             "AppCookieStickinessPolicies": [],
             "OtherPolicies": []
          },
          "LoadBalancerName": "ELB-UCP",
          "CreatedTime": "2017-02-13T21:40:15.400Z",
          "AvailabilityZones": [
             "us-west-2c",
             "us-west-2a",
             "us-west-2b"
          ],
          "Scheme": "internet-facing",
          "SourceSecurityGroup": {
             "OwnerAlias": "XXXXXXXXXXXX",
             "GroupName":  "XXXXXXXXXXXX"
          }
       }
    
  3. Create either the nginx.conf or haproxy.cfg file, as required.

    For instructions on deploying with an AWS load balancer, refer to Getting Started with Network Load Balancers in the AWS documentation.

  4. Deploy the load balancer:

    docker run --detach \
    --name ucp-lb \
    --restart=unless-stopped \
    --publish 443:443 \
    --publish 6443:6443 \
    --volume ${PWD}/nginx.conf:/etc/nginx/nginx.conf:ro \
    nginx:stable-alpine
    
    docker run --detach \
    --name ucp-lb \
    --publish 443:443 \
    --publish 6443:6443 \
    --publish 8181:8181 \
    --restart=unless-stopped \
    --volume ${PWD}/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro \
    haproxy:1.7-alpine haproxy -d -f /usr/local/etc/haproxy/haproxy.cfg
    
Load balancing MKE and MSR together

By default, both MKE and Mirantis Secure Registry (MSR) use port 443. If you plan to deploy MKE and MSR together, your load balancer must distinguish traffic between the two by IP address or port number.

If you want MKE and MSR both to use port 443, then you must either use separate load balancers for each or use two virtual IPs. Otherwise, you must configure your load balancer to expose MKE or MSR on a port other than 443.

Use two-factor authentication

Two-factor authentication (2FA) adds an extra layer of security when logging in to the MKE web UI. Once enabled, 2FA requires the user to submit an additional authentication code generated on a separate mobile device along with their user name and password at login.

Configure 2FA

MKE 2FA requires the use of a time-based one-time password (TOTP) application installed on a mobile device to generate a time-based authentication code for each login to the MKE web UI. Examples of such applications include 1Password, Authy, and LastPass Authenticator.

To configure 2FA:

  1. Install a TOTP application to your mobile device.

  2. In the MKE web UI, navigate to My Profile > Security.

  3. Toggle the Two-factor authentication control to enabled.

  4. Open the TOTP application and scan the offered QR code. The device will display a six-digit code.

  5. Enter the six-digit code in the offered field and click Register. The TOTP application will save your MKE account.

    Important

    A set of recovery codes displays in the MKE web UI when two-factor authentication is enabled. Save these codes in a safe location, as they can be used to access the MKE web UI if for any reason the configured mobile device becomes unavailable. Refer to Recover 2FA for details.

Access MKE using 2FA

Once 2FA is enabled, you will need to provide an authentication code each time you log in to the MKE web UI. Typically, the TOTP application installed on your mobile device generates the code and refreshes it every 30 seconds.

Access the MKE web UI with 2FA enabled:

  1. In the MKE web UI, click Sign in. The Sign in page will display.

  2. Enter a valid user name and password.

  3. Access the MKE code in the TOTP application on your mobile device.

  4. Enter the current code in the 2FA Code field in the MKE web UI.

Note

Multiple authentication failures may indicate a lack of synchronization between the mobile device clock and the mobile provider.

Disable 2FA

Mirantis strongly recommends using 2FA to secure MKE accounts. If you need to temporarily disable 2FA, re-enable it as soon as possible.

To disable 2FA:

  1. In the MKE web UI, navigate to My Profile > Security.

  2. Toggle the Two-factor authentication control to disabled.

Recover 2FA

If the mobile device with authentication codes is unavailable, you can re-access MKE using any of the recovery codes that display in the MKE web UI when 2FA is first enabled.

To recover 2FA:

  1. Enter one of the recovery codes when prompted for the two-factor authentication code upon login to the MKE web UI.

  2. Navigate to My Profile > Security.

  3. Disable 2FA and then re-enable it.

  4. Open the TOTP application and scan the offered QR code. The device will display a six-digit code.

  5. Enter the six-digit code in the offered field and click Register. The TOTP application will save your MKE account.

If there are no recovery codes to draw from, ask your system administrator to disable 2FA in order to regain access to the MKE web UI. Once done, repeat the Configure 2FA procedure to reinstate 2FA protection.

MKE administrators are not able to re-enable 2FA for users.

Account lockout

You can configure MKE so that a user account is temporarily blocked from logging in following a series of unsuccessful login attempts. The account lockout feature applies only to login attempts made using basic authorization or LDAP. Login attempts using either SAML or OIDC do not trigger the account lockout feature. Admin accounts are never locked.

Account lockouts expire after a set amount of time, after which the affected user can log in as normal. Subsequent login attempts on a locked account do not extend the lockout period. Login attempts against a locked account always return a standard incorrect credentials error, providing no indication to the user that the account is locked. Only MKE admins can see account lockout status.

Configure account lockout functionality
  1. Obtain the current MKE configuration file for your cluster.

  2. Set the following parameters in the auth.account_lock section of the MKE configuration file:

    • Set the value of enabled to true.

    • Set the value of failureTriggers to the number of failed log in attempts that can be made before an account is locked.

    • Set the value of durationSeconds to the desired lockout duration. A value of 0 indicates that the account will remain locked until it is unlocked by an administrator.

  3. Upload the new MKE configuration file.
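
As a sketch, the resulting section of the MKE configuration file might look like the following, in which an account locks for 30 minutes after five failed attempts; the numeric values are illustrative:

[auth.account_lock]
  enabled = true
  failureTriggers = 5
  durationSeconds = 1800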

Note

You can verify the lockout status of your organization accounts by issuing a GET request to the /accounts endpoint.
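
As a sketch, assuming admin credentials and the jq utility, you might obtain a session token and query the endpoint as follows, with <mke-host>, <admin-user>, and <password> as placeholders:

AUTHTOKEN=$(curl -sk -d '{"username":"<admin-user>","password":"<password>"}' \
  https://<mke-host>/auth/login | jq -r .auth_token)
curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://<mke-host>/accounts/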

Unlock an account

A locked account remains locked until the specified amount of time has elapsed. To restore access sooner, either have an administrator unlock the account or globally disable the account lockout feature.


To unlock a locked account:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to Access Control > Users and select the user who is locked out of their account.

  3. Click the gear icon in the upper right corner.

  4. Navigate to the Security tab.

    Note

    An expired account lock only resets once a new log in attempt is made. Until such time, the account will present as locked to administrators.

  5. Click the Unlock account button.


To globally disable the account lockout feature:

  1. Obtain the current MKE configuration file for your cluster.

  2. In the auth.account_lock section of the MKE configuration file, set the value of enabled to false.

  3. Upload the new MKE configuration file.

Custom kubelet profiles

Available since MKE 3.7.10

Using kubelet node profiles, you can customize your kubelet settings at a node-by-node level, rather than setting cluster-wide flags that apply to all of your kubelet agents.

Note

MKE does not currently support kubelet node profiles on Windows nodes.

Add custom kubelet profiles

Custom kubelet profiles are set up through the MKE configuration file.

  1. Download the MKE configuration file.

  2. Add a section called [cluster_config.custom_kubelet_flags_profiles] to the MKE configuration file.

    Note

    You can define the profile name with one or more valid kubelet flags, with each flag separated by a space.

    Example:

[cluster_config.custom_kubelet_flags_profiles]
  high = "--kube-reserved=cpu=200m,memory=512Mi --kube-reserved-cgroup=/high"
  low = "--kube-reserved=cpu=100m,memory=256Mi --kube-reserved-cgroup=/low"
Apply kubelet node profiles

Once you have added the new kubelet node profiles to the MKE configuration file and uploaded the file to MKE, you can apply the profiles to your nodes.

  1. Download and configure the client bundle.

  2. View the nodes in the Kubernetes cluster:

    kubectl get node
    

    Example output:

    NAME     STATUS   ROLES    AGE   VERSION
    node-0   Ready    master   23h   v1.27.10-mirantis-1
    node-1   Ready    <none>   23h   v1.27.10-mirantis-1
    node-2   Ready    <none>   23h   v1.27.10-mirantis-1
    ...
    
  3. Apply a profile to a target node:

    kubectl label node node-1 custom-kubelet-profile=low
    

    Example output:

    node/node-1 labeled
    

    Note

    You can apply profiles to any number of nodes; however, each node can have only one active profile.

Modify kubelet node profiles

Any changes you make to a kubelet node profile instantly affect the nodes on which the profile is in use. As such, Mirantis strongly recommends that you first test any modifications in a limited scope, by creating a new profile with the modifications and applying it to a small number of nodes.

Warning

Misconfigured modifications made to a kubelet node profile that is in use by a large number of cluster nodes can result in those nodes becoming nonoperational.

Example scenario:

You have defined the following kubelet node profiles in the MKE configuration file:

[cluster_config.custom_kubelet_flags_profiles]
  high = "--kube-reserved=cpu=200m,memory=512Mi --kube-reserved-cgroup=/high"
  low = "--kube-reserved=cpu=100m,memory=256Mi --kube-reserved-cgroup=/low"

Modify the profile definition as follows:

[cluster_config.custom_kubelet_flags_profiles]
  high = "--kube-reserved=cpu=200m,memory=512Mi --kube-reserved-cgroup=/high"
  low = "--kube-reserved=cpu=100m,memory=256Mi --kube-reserved-cgroup=/low"
  lowtest = "--kube-reserved=cpu=150m,memory=256Mi --kube-reserved-cgroup=/low"

Apply the new lowtest label to a small set of test nodes.

Once the profile is verified on your test nodes, remove lowtest from the profile definition and update low to use the updated --kube-reserved=cpu value.

Configure Graceful Node Shutdown with kubelet node profiles

Available since MKE 3.7.12

To configure Graceful Node Shutdown grace periods in MKE cluster, set the following flags in the [cluster_config.custom_kubelet_flags_profiles] section of the MKE configuration file:

  • --shutdown-grace-period=0s

  • --shutdown-grace-period-critical-pods=0s

The GracefulNodeShutdown feature gate is enabled by default, with shutdown grace period parameters both set to 0s.

  1. When you add your custom kubelet profiles, insert and set the GracefulNodeShutdown flags in the MKE configuration file. For example:

    [cluster_config.custom_kubelet_flags_profiles]
      manager = "--shutdown-grace-period=30s --shutdown-grace-period-critical-pods=20s"
      worker = "--shutdown-grace-period=60s
      --shutdown-grace-period-critical-pods=50s"
    
  2. Apply your kubelet node profiles.

  3. From a labeled node with GracefulNodeShutdown enabled, verify that the inhibitor lock is taken by the kubelet:

    systemd-inhibit --list
        Who: kubelet (UID 0/root, PID 337097/kubelet)
        What: shutdown
        Why: Kubelet needs time to handle node shutdown
        Mode: delay
    
    1 inhibitors listed.
    
Troubleshooting

The Graceful Node Shutdown feature may present various issues.

Missing kubelet inhibitors and ucp-kubelet errors

A Graceful Node Shutdown configuration of --shutdown-grace-period=60s --shutdown-grace-period-critical-pods=50s can result in the following error message:

Failed to start node shutdown manager" err="node shutdown manager was unable
to update logind InhibitDelayMaxSec to 60s (ShutdownGracePeriod), current
value of InhibitDelayMaxSec (30s) is less than requested ShutdownGracePeriod

The error indicates that the kubelet inhibitor is missing and that ucp-kubelet is reporting errors, because the operating system default InhibitDelayMaxSec setting (30s) is lower than the requested shutdown grace period.

You can resolve the issue either by changing the InhibitDelayMaxSec parameter setting to a larger value or by removing it.

The configuration file that contains the InhibitDelayMaxSec parameter setting may reside in any of the following locations (a configuration sketch follows the list):

  • /etc/systemd/logind.conf

  • /etc/systemd/logind.conf.d/*.conf

  • /run/systemd/logind.conf.d/*.conf

  • /usr/lib/systemd/logind.conf.d/*.conf

  • /usr/lib/systemd/logind.conf.d/unattended-upgrades-logind-maxdelay.conf
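
As a minimal sketch, assuming the drop-in file approach and a 60-second grace period, you might raise the limit as follows. The file name 99-kubelet-shutdown.conf and the 60s value are illustrative, and restarting systemd-logind may briefly affect user sessions on some distributions:

# Create a logind drop-in that raises InhibitDelayMaxSec to match the
# configured shutdown grace period, then restart systemd-logind to apply it.
sudo mkdir -p /etc/systemd/logind.conf.d
cat <<'EOF' | sudo tee /etc/systemd/logind.conf.d/99-kubelet-shutdown.conf
[Login]
InhibitDelayMaxSec=60
EOF
sudo systemctl restart systemd-logind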

Graceful node drain does not occur and the pods are not terminated

In some operating system distributions, the systemd PrepareForShutdown signal is not sent to dbus, and as a result graceful node drain does not occur and the Pods are not terminated.

Currently, in the following cases, the PrepareForShutdown signal is triggered and the Graceful Node Shutdown feature works as intended:

  • systemctl reboot

  • systemctl poweroff

  • shutdown -h

  • shutdown -h +0

  • shutdown -h +5

Delete kubelet node profiles

If you delete a kubelet node profile from the MKE configuration file, any nodes that are still using that profile enter an erroneous state. For this reason, MKE prevents users from deleting any kubelet node profile that is in use by a cluster node. As a best practice, verify that a profile is not in use before you delete it.

Example scenario:

To check whether any nodes are using a previously defined low profile, run:

kubectl get nodes -l=custom-kubelet-profile=low

Example output:

NAME     STATUS   ROLES    AGE   VERSION
node-1   Ready    <none>   24h   v1.27.10-mirantis-1

The result indicates that one node is using the low kubelet node profile. Clear the profile from that node before you delete the profile.

To clear the profile, run:

kubectl label node node-1 custom-kubelet-profile-

Example output:

node/node-1 unlabeled

Configure and use OpsCare

Any time there is an issue with your cluster, OpsCare routes notifications from your MKE deployment to Mirantis support engineers, who then either resolve the problem directly or arrange to troubleshoot the matter with you.

For more information, refer to Mirantis OpsCare Plus, OpsCare & LabCare.

Configure OpsCare

To configure OpsCare you must first obtain a Salesforce username, password, and environment ID from your Mirantis Customer Success Manager. You then store these credentials as Swarm secrets using the following naming convention:

  • User name: sfdc_opscare_api_username

  • Password: sfdc_opscare_api_password

  • Environment ID: sfdc_environment_id

Note

  • Every cluster that uses OpsCare must have its own unique sfdc_environment_id.

  • OpsCare requires that MKE has access to mirantis.my.salesforce.com on port 443.

  • Any custom certificates in use must contain all of the manager node private IP addresses.

  • The provided Salesforce credentials are not associated with the Mirantis support portal login, but are used for OpsCare alerting only.


To configure OpsCare using the CLI:

  1. Download and configure the client bundle.

  2. Create secrets for your Salesforce login credentials:

    printf "<username-obtained-from-csm>" | docker secret create sfdc_opscare_api_username -
    printf "<password-obtained-from-csm>" | docker secret create sfdc_opscare_api_password -
    printf "<environment-id-obtained-from-csm>" | docker secret create sfdc_environment_id -
    
  3. Enable OpsCare:

    MKE_USERNAME=<mke-username>
    MKE_PASSWORD=<mke-password>
    MKE_HOST=<mke-host>
    
    AUTHTOKEN=$(curl --silent --insecure --data "{\"username\":\"$MKE_USERNAME\",\"password\":\"$MKE_PASSWORD\"}" https://$MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --silent --insecure -X GET "https://$MKE_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > ucp-config.toml
    sed -i 's/ops_care = false/ops_care = true/' ucp-config.toml
    curl --silent --insecure -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file './ucp-config.toml' https://$MKE_HOST/api/ucp/config-toml
    

To configure OpsCare using the MKE web UI:

  1. Log in to the MKE web UI.

  2. Using the left-side navigation panel, navigate to <username> > Admin Settings > Usage.

  3. In the Salesforce Username field, enter your Salesforce user name.

  4. Next, enter your Salesforce password and Salesforce environment ID.

  5. Click Create Secrets.

  6. Under OpsCare Settings, toggle the Enable OpsCare slider to the right.

  7. Click Save.

Manage Salesforce alerts

OpsCare uses a predefined group of MKE alerts to notify your Customer Success Manager of problems with your deployment. This alert group is identical to the one used in any MKE cluster that is provisioned by Mirantis Container Cloud. A single watchdog alert serves to verify the proper function of the OpsCare alert pipeline as a whole.

To verify that the OpsCare alerts are functioning properly:

  1. Log in to Salesforce.

  2. Navigate to Cases and verify that the watchdog alert is present. It displays as Watchdog alert and is always firing.

Disable OpsCare

You must disable OpsCare before you can delete the three secrets in use.

To disable OpsCare:

  1. Log in to the MKE web UI.

  2. Using the left-side navigation panel, navigate to <username> > Admin Settings > Usage.

  3. Toggle the Enable Ops Care slider to the left.

Alternatively, you can disable OpsCare by changing the ops_care entry in the MKE configuration file to false.
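
The following sketch illustrates the configuration file approach, mirroring the CLI flow used above to enable OpsCare; the MKE_USERNAME, MKE_PASSWORD, and MKE_HOST values are placeholders:

MKE_USERNAME=<mke-username>
MKE_PASSWORD=<mke-password>
MKE_HOST=<mke-host>

AUTHTOKEN=$(curl --silent --insecure --data "{\"username\":\"$MKE_USERNAME\",\"password\":\"$MKE_PASSWORD\"}" https://$MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --silent --insecure -X GET "https://$MKE_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > ucp-config.toml
# Flip the OpsCare flag and upload the modified configuration.
sed -i 's/ops_care = true/ops_care = false/' ucp-config.toml
curl --silent --insecure -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file './ucp-config.toml' https://$MKE_HOST/api/ucp/config-toml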

Configure cluster and service networking in an existing cluster

On systems that use the managed CNI, you can switch existing clusters to either kube-proxy with ipvs proxier or eBPF mode.

MKE does not support switching kube-proxy in an existing cluster from ipvs proxier to iptables proxier, nor does it support disabling eBPF mode after it has been enabled. Using a CNI that supports both cluster and service networking requires that you disable kube-proxy.

Refer to Cluster and service networking options in the MKE Installation Guide for information on how to configure cluster and service networking at install time.

Caution

The configuration changes described here cannot be reversed. As such, Mirantis recommends that you make a cluster backup, drain your workloads, and take your cluster offline prior to performing any of these changes.

Caution

Swarm workloads that require the use of encrypted overlay networks must use iptables proxier. Be aware that the other networking options detailed here automatically disable Docker Swarm encrypted overlay networks.


To switch an existing cluster to kube-proxy with ipvs proxier while using the managed CNI:

  1. Obtain the current MKE configuration file for your cluster.

  2. Set kube_proxy_mode to "ipvs".

  3. Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

  4. Verify that the following values are set in your MKE configuration file:

    unmanaged_cni = false
    calico_ebpf_enabled = false
    kube_default_drop_masq_bits = false
    kube_proxy_mode = "ipvs"
    kube_proxy_no_cleanup_on_start = false
    
  5. Verify that the ucp-kube-proxy container logs on all nodes contain the following:

    KUBE_PROXY_MODE (ipvs) CLEANUP_ON_START_DISABLED false
    Performing cleanup
    kube-proxy cleanup succeeded
    Actually starting kube-proxy....
    
  6. Obtain the current MKE configuration file for your cluster.

  7. Set kube_proxy_no_cleanup_on_start to true.

  8. Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

  9. Reboot all nodes.

  10. Verify that the following values are set in your MKE configuration file and that your cluster is in a healthy state with all nodes ready:

    unmanaged_cni = false
    calico_ebpf_enabled = false
    kube_default_drop_masq_bits = false
    kube_proxy_mode = "ipvs"
    kube_proxy_no_cleanup_on_start = true
    
  11. Verify that the ucp-kube-proxy container logs on all nodes contain the following:

    KUBE_PROXY_MODE (ipvs) CLEANUP_ON_START_DISABLED true
    Actually starting kube-proxy....
    .....
    I1111 02:41:05.559641     1 server_others.go:274] Using ipvs Proxier.
    W1111 02:41:05.559951     1 proxier.go:445] IPVS scheduler not specified, use rr by default
    
  12. Optional. Configure the following ipvs-related parameters in the MKE configuration file, as illustrated in the sketch that follows this procedure. Otherwise, MKE uses the Kubernetes default parameter settings:

    • ipvs_exclude_cidrs = ""

    • ipvs_min_sync_period = ""

    • ipvs_scheduler = ""

    • ipvs_strict_arp = false

    • ipvs_sync_period = ""

    • ipvs_tcp_timeout = ""

    • ipvs_tcpfin_timeout = ""

    • ipvs_udp_timeout = ""

    For more information on using these parameters, refer to kube-proxy in the Kubernetes documentation.
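
The following sketch shows how such parameters might look in the MKE configuration file. The values are illustrative only; merge the keys you need into the existing [cluster_config] table of your downloaded configuration file rather than adding a second table:

[cluster_config]
  kube_proxy_mode = "ipvs"
  ipvs_scheduler = "rr"
  ipvs_strict_arp = false
  ipvs_sync_period = "30s"
  ipvs_min_sync_period = "10s"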


To switch an existing cluster to eBPF mode while using the managed CNI:

  1. Verify that the prerequisites for eBPF use have been met, including kernel compatibility, for all Linux manager and worker nodes. Refer to the Calico documentation Enable the eBPF dataplane for more information.

  2. Obtain the current MKE configuration file for your cluster.

  3. Set kube_default_drop_masq_bits to true.

  4. Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

  5. Verify that the ucp-kube-proxy container started on all nodes, that the kube-proxy cleanup took place, and that ucp-kube-proxy launched kube-proxy.

    for cont in $(docker ps -a|rev | cut -d' ' -f 1 | rev|grep ucp-kube-proxy); \
    do nodeName=$(echo $cont|cut -d '/' -f1); \
    docker logs $cont 2>/dev/null|grep -q 'kube-proxy cleanup succeeded'; \
    if [ $? -ne 0 ]; \
    then echo $nodeName; \
    fi; \
    done|sort
    

    Expected output in the ucp-kube-proxy logs:

    KUBE_PROXY_MODE (iptables) CLEANUP_ON_START_DISABLED false
    Performing cleanup
    kube-proxy cleanup succeeded
    Actually starting kube-proxy....
    

    Note

    If the list of nodes that the command returns does not quickly shrink to none, check the ucp-kube-proxy logs on the listed nodes, where either of the following took place:

    • The ucp-kube-proxy container did not launch.

    • The kube-proxy cleanup did not happen.

  6. Reboot all nodes.

  7. Obtain the current MKE configuration file for your cluster.

  8. Verify that the following values are set in your MKE configuration file:

    unmanaged_cni = false
    calico_ebpf_enabled = false
    kube_default_drop_masq_bits = true
    kube_proxy_mode = "iptables"
    kube_proxy_no_cleanup_on_start = false
    
  9. Verify that the ucp-kube-proxy container logs on all nodes contain the following:

    KUBE_PROXY_MODE (iptables) CLEANUP_ON_START_DISABLED false
    Performing cleanup
    ....
    kube-proxy cleanup succeeded
    Actually starting kube-proxy....
    ....
    I1111 03:29:25.048458     1 server_others.go:212] Using iptables Proxier.
    
  10. Set kube_proxy_mode to "disabled".

  11. Set calico_ebpf_enabled to true.

  12. Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

  13. Verify that the ucp-kube-proxy container started on all nodes, that the kube-proxy cleanup took place, and that ucp-kube-proxy did not launch kube-proxy.

    for cont in $(docker ps -a|rev | cut -d' ' -f 1 | rev|grep ucp-kube-proxy); \
    do nodeName=$(echo $cont|cut -d '/' -f1); \
    docker logs $cont 2>/dev/null|grep -q 'Sleeping forever'; \
    if [ $? -ne 0 ]; \
    then echo $nodeName; \
    fi; \
    done|sort
    

    Expected output in the ucp-kube-proxy logs:

    KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED false
    Performing cleanup
    kube-proxy cleanup succeeded
    Sleeping forever....
    

    Note

    If the list of nodes that the command returns does not quickly shrink to none, check the ucp-kube-proxy logs on the listed nodes, where either of the following took place:

    • The ucp-kube-proxy container did not launch.

    • The ucp-kube-proxy container launched kube-proxy.

  14. Obtain the current MKE configuration file for your cluster.

  15. Verify that the following values are set in your MKE configuration file:

    unmanaged_cni = false
    calico_ebpf_enabled = true
    kube_default_drop_masq_bits = true
    kube_proxy_mode = "disabled"
    kube_proxy_no_cleanup_on_start = false
    
  16. Set kube_proxy_no_cleanup_on_start to true.

  17. Upload the new MKE configuration file. Be aware that this will require a wait time of approximately five minutes.

  18. Verify that the following values are set in your MKE configuration file and that your cluster is in a healthy state with all nodes ready:

    unmanaged_cni = false
    calico_ebpf_enabled = true
    kube_default_drop_masq_bits = true
    kube_proxy_mode = "disabled"
    kube_proxy_no_cleanup_on_start = true
    
  19. Verify that eBPF mode is operational by confirming the presence of the following lines in the ucp-kube-proxy container logs:

    KUBE_PROXY_MODE (disabled) CLEANUP_ON_START_DISABLED true
    "Sleeping forever...."
    
  20. Verify that you can SSH into all nodes.

Schedule image pruning

MKE administrators can schedule the cleanup of unused images, whitelisting which images to keep. To determine which images will be removed, they can perform a dry run prior to setting the image-pruning schedule.

Schedule image pruning using the CLI

To perform a dry run without whitelisting any images:

Perform a dry run to determine which images will be pruned:

AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/images/prune/dry

Example response:

[
   {
      "Containers":-1,
      "Created":1647029986,
      "Id":"sha256:2fb6fc2d97e10c79983aa10e013824cc7fc8bae50630e32159821197dda95fe3",
      "Labels":null,
      "ParentId":"",
      "RepoDigests":[
         "busybox@sha256:caa382c432891547782ce7140fb3b7304613d3b0438834dce1cad68896ab110a"
      ],
      "RepoTags":[
         "busybox:latest"
      ],
      "SharedSize":-1,
      "Size":1239748,
      "VirtualSize":1239748
   }
]

To perform a dry run with whitelisted images:

  1. Obtain the current MKE configuration file for your cluster.

  2. Whitelist the images that should not be removed.

    Note

    Where possible, use the image ID to specify the image rather than the image name.

    For example:

    [[cluster_config.image_prune_whitelist]]
      key = "label"
      value = "<label-value>"
    
    [[cluster_config.image_prune_whitelist]]
      key = "before"
      value = "<image-id>"
    

    Refer to cluster_config.image_prune_whitelist (optional) for more information.

  3. Upload the new MKE configuration file.

  4. Perform a dry run to determine which images will be pruned:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/images/prune/dry
    

    Example response:

    [
       {
          "Containers":-1,
          "Created":1647029986,
          "Id":"sha256:2fb6fc2d97e10c79983aa10e013824cc7fc8bae50630e32159821197dda95fe3",
          "Labels":null,
          "ParentId":"",
          "RepoDigests":[
             "busybox@sha256:caa382c432891547782ce7140fb3b7304613d3b0438834dce1cad68896ab110a"
          ],
          "RepoTags":[
             "busybox:latest"
          ],
          "SharedSize":-1,
          "Size":1239748,
          "VirtualSize":1239748
       }
    ]
    

To schedule image pruning:

  1. Obtain the current MKE configuration file for your cluster.

  2. Optional. Whitelist the images that should not be removed, if you have not already done so.

    Note

    Where possible, use the image ID to specify the image rather than the image name.

    For example:

    [[cluster_config.image_prune_whitelist]]
      key = "label"
      value = "<label-value>"
    
    [[cluster_config.image_prune_whitelist]]
      key = "before"
      value = "<image-id>"
    

    Refer to cluster_config.image_prune_whitelist (optional) for more information.

  3. Set the value of image_prune_schedule to the desired cron schedule. Refer to cluster_config table (required) for more information.

    The following example schedules image pruning for every day at midnight:

    [cluster_config]
    
        image_prune_schedule = "0 0 0 * * *"
    
  4. Upload the new MKE configuration file.

Schedule image pruning using the MKE web UI
  1. Log in to the MKE web UI as an administrator.

  2. From the left-side navigation panel, navigate to <user name> > Admin Settings > Tuning and scroll to Image pruning config.

  3. Enter the desired pruning schedule.

  4. Optional. Select the desired whitelist rules.

  5. Optional. Test your image pruning configuration by clicking Start a dry run under Test configuration.

Manage etcd

etcd is a consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It handles leader elections during network partitions and can tolerate machine failure, even in the leader node.

For MKE, etcd serves as the Kubernetes backing store for all cluster data, with an etcd replica deployed on each MKE manager node. This is a primary reason why Mirantis recommends that you deploy an odd number of MKE manager nodes: etcd uses the Raft consensus algorithm and thus requires that a quorum of nodes agree on any update to the cluster state. For example, a cluster with five managers has a quorum of three and can therefore tolerate the loss of two managers.

Configure etcd storage quota

You can control the etcd distributed key-value storage quota using the etcd_storage_quota parameter in the MKE configuration file. By default, the value of the parameter is 2GB. For information on how to adjust the parameter, refer to Configure an MKE cluster.

If you choose to increase the etcd quota, be aware that the quota has an upper limit, and that increasing it should be combined with other strategies, such as decreasing the events TTL, to ensure that the etcd database does not run out of space.
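
As a sketch of the adjustment itself, assuming the same configuration API flow shown elsewhere in this guide, you might proceed as follows. MKE_HOST and the credentials are placeholders, and the 4GB value is illustrative only:

AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
curl --silent --insecure -X GET "https://MKE_HOST/api/ucp/config-toml" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" > ucp-config.toml
# Edit ucp-config.toml and set etcd_storage_quota in the [cluster_config] table,
# for example: etcd_storage_quota = "4GB"
curl --silent --insecure -X PUT -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file './ucp-config.toml' https://MKE_HOST/api/ucp/config-toml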

Important

If a manager node virtual machine runs out of disk space, or if all of its system memory is depleted, etcd can cause the MKE cluster to move into an irrecoverable state. To prevent this from happening, configure the disk space and the memory of the manager node VMs to levels that are well in excess of the set etcd storage quota. Be aware, though, that warning banners will display in the MKE web UI if the etcd storage quota is set at an amount in excess of 40% of system memory.


Cleanse etcd of Kubernetes events

Kubernetes events are generated in response to changes within Kubernetes resources, such as nodes, Pods, or containers. These events are created with a time to live (TTL), after which they are automatically cleaned up. If, however, a large number of Kubernetes events is generated or other cluster issues arise, you may need to manually clean up the Kubernetes events to prevent etcd from exceeding its quota. MKE offers an API that you can use to directly clean up event objects within your cluster, with which you can specify whether to delete all events or only those with a certain TTL.

Note

The etcd cleanup API is a preventative measure only. If etcd already exceeds the established quota, MKE may no longer be operational, in which case the API will not work.

To trigger etcd cleanup:

  1. Issue a POST to the https://MKE_HOST/api/ucp/etcd/cleanup endpoint.

    You can specify two parameters:

    dryRun

    Sets whether to issue a dry cleanup run instead of a production run. A dry run returns the list of etcd keys (Kubernetes events) that would be deleted, without actually deleting them. Defaults to false.

    MinTTLToKeepSeconds

    Sets the minimum TTL to retain, meaning that only events with a lower TTL are deleted. By default, all events are deleted regardless of TTL.

    Mirantis recommends that you adjust these parameters based on the size of the etcd database and the amount of time that has elapsed since the last cleanup.

    Example command (dry run):

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/cleanup --data '{"dryRun": true}'
    

    Command response (dry run):

    [
        {
            "key": "/registry/events/default/eventkey1",
            "ttl": 3638
        },
        {
            "key": "/registry/events/default/eventkey2",
            "ttl": 3639
        }
        ...
    ]
    

    Example command (live):

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/cleanup --data '{"dryRun": false}'
    

    Example response (live):

    "Etcd Cleanup Initiated"
    
  2. Review the etcd cleanup state:

    Example command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/info
    

    Example response:

    {
        "CleanupInProgress": false,
        "CleanupResult": "Cluster Cleanup finished & Revisions Compacted. Issue a cluster defrag to permanently clear up space.",
        "DefragInProgress": false,
        "DefragResult": "",
        "MemberInfo": [
            {
                "MemberID": 16494148364752423721,
                "Endpoint": "<https://172.31.47.35:12379",>
                "EtcdVersion": "3.5.6",
                "DbSize": "1 MB",
                "IsLeader": true,
                "Alarms": null
            }
        ]
    }
    

The CleanupResult field in the response indicates any issues that arise. It also indicates when the cleanup is finished.

Note

Although the etcd cleanup process deletes the keys, you must run an etcd defragmentation to release the storage space those keys used back to the filesystem. Because defragmentation is a blocking operation, MKE does not run it automatically; you must run it yourself to complete the cleanup.

Apply etcd defragmentation

The etcd distributed key-value store retains a history of its keyspace. That history is compacted after a specified number of revisions; however, the space it used is only released back to the host filesystem following defragmentation. For more information, refer to the etcd documentation.

With MKE you can defragment the etcd cluster while avoiding cluster outages. To do this, you apply defragmentation to etcd members one at a time. MKE will defragment the current etcd leader last, to prevent the triggering of multiple leader elections.

Important

In a High Availability (HA) cluster, the defragmentation process subtly affects cluster dynamics, because when a node undergoes defragmentation it temporarily leaves the pool of active nodes. This subsequent reduction in the active node count results in a proportional increase of the load on the remaining nodes, which can lead to performance degradation if the remaining nodes do not have the capacity to handle the additional load. In addition, at the end of the process, when the leader node is undergoing defragmentation, there is a brief period during which cluster write operations do not take place. This pause occurs when the system initiates and completes the leader election process, and though it is automated and brief it does result in a momentary write block on the cluster.

With these factors in mind, Mirantis recommends a cautious approach to scheduling defragmentation of HA clusters. Ideally, run defragmentation during planned maintenance windows rather than through a recurring cron job, as during such windows you can closely monitor potential impacts on performance and availability and mitigate them as necessary.


To defragment the etcd cluster:

  1. Trigger the etcd cluster defragmentation by issuing a POST to the https://MKE_HOST/api/ucp/etcd/defrag endpoint.

    You can specify two parameters:

    timeoutSeconds

    Sets how long MKE waits for each member to finish defragmentation. Default: 60 seconds. MKE will cancel the defragmentation if the timeout occurs before the member defragmentation completes.

    pauseSeconds

    Sets how long MKE waits between each member defragmentation. Default: 60 seconds.

    Mirantis recommends that you adjust these parameters based on the size of the etcd database and the amount of time that has elapsed since the last defragmentation.

    Example command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/defrag --data '{"timeoutSeconds": 60, "pauseSeconds": 60}'
    

    Example response:

    "Cluster Defragmentation Initiated"
    
  2. Review the state of individual etcd cluster members and the state of the cluster defragmentation by running the following command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/info
    

    Example output:

    {
        "DefragInProgress": true,
        "DefragResult": "Cluster Defrag Initiated",
        "MemberInfo": [
            {
                "MemberID": 5051939019959384922,
                "Endpoint": "https://172.31.21.33:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": true,
                "Alarms": null
            },
            {
                "MemberID": 10749614093923491478,
                "Endpoint": "https://172.31.30.179:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": false,
                "Alarms": null
            },
            {
                "MemberID": 7837950661722744517,
                "Endpoint": "https://172.31.30.44:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": false,
                "Alarms": null
            }
        ]
    }
    

    You can monitor this endpoint until the defragmentation is complete. The information is also available in the ucp-controller logs.
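
    For example, assuming the AUTHTOKEN from the previous command is still set, you might poll the endpoint until DefragInProgress returns to false; the 30-second interval is arbitrary:

    while [ "$(curl --silent --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/info | jq --raw-output .DefragInProgress)" = "true" ]; do
      echo "Defragmentation still in progress..."
      sleep 30
    done
    echo "Defragmentation complete. Review DefragResult for details."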


To manually remove the etcd defragmentation lock file:

To maintain etcd cluster availability, MKE uses a lock file that prevents multiple defragmentations from being simultaneously implemented. MKE removes the lock file at the conclusion of defragmentation, however you can manually remove it as necessary.

Manually remove the lock file by running the following command:

docker exec ucp-controller rm /var/lock/etcd-defrag

etcd alarms response

Available since MKE 3.7.5

etcd issues alarms to indicate problems that need to be quickly addressed to ensure uninterrupted function.

NOSPACE alarm

A NOSPACE alarm is issued in the event that etcd runs low on storage space, to protect the cluster from further writes. Once this low storage space state is reached, etcd will respond to all write requests with the mvcc: database space exceeded error message until the issue is rectified.

When MKE detects the NOSPACE alarm condition, it displays a critical banner to inform administrators. In addition, MKE restarts etcd with an increased value for the etcd datastore quota, thus allowing administrators to resolve the NOSPACE alarm without interference.

To resolve the NOSPACE alarm:

  1. Identify the data that occupies most of the storage space, as illustrated in the sketch that follows this procedure. Be aware that in MKE the recommended etcdctl commands must be run in the ucp-kv container, instructions for which are available in Troubleshoot the etcd key-value store with the CLI.

    If a bug-ridden application is the cause of the unexpected use of storage space, stop that application.

  2. Manually delete the unused data from etcd, if possible.

  3. Apply etcd defragmentation.

  4. If necessary, increase the etcd_storage_quota setting in the cluster_config table of the MKE configuration file.
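
For step 1, the following is a minimal sketch of how you might inspect etcd from inside the ucp-kv container. The ETCDCTL_FLAGS variable is a placeholder for the --endpoints, --cacert, --cert, and --key options described in Troubleshoot the etcd key-value store with the CLI:

# Show overall etcd database size and status.
docker exec ucp-kv sh -c "ETCDCTL_API=3 etcdctl $ETCDCTL_FLAGS endpoint status --write-out=table"
# Count the keys under a given prefix, for example Kubernetes events,
# to see which resources occupy the keyspace.
docker exec ucp-kv sh -c "ETCDCTL_API=3 etcdctl $ETCDCTL_FLAGS get /registry/events --prefix --keys-only" | wc -l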

Note

Contact Mirantis Support if you require assistance in resolving the etcd NOSPACE alarm.

CORRUPT alarm

The CORRUPT alarm is issued when a cluster corruption is detected by etcd. MKE cluster administrators are informed of the condition by way of a critical banner. To resolve such an issue, contact Mirantis Support and refer to the official etcd documentation regarding data corruption recovery.

Operate a hybrid Windows cluster

Hybrid Windows clusters concurrently run two versions of Windows Server, with one version deployed on one set of nodes and the second version deployed on a different set of nodes. The Windows versions that MKE supports are:

  • Windows Server 2019, build number 10.0.17763

  • Windows Server 2022, build number 10.0.20348

For more information on Windows releases and build numbers, refer to Windows container version compatibility.

To learn how to upgrade to Windows Server 2022, refer to Upgrade nodes to Windows Server 2022.

Limitations
  • A Windows Server 2019 node cannot run a container that uses a Windows Server 2022 image.

  • For a Windows Server 2022 node to run a container that uses a Windows Server 2019 image, you must run the container with Hyper-V isolation. Refer to the Microsoft documentation Hyper-V isolation for containers for more information.

Mirantis recommends that you use the same version of Windows Server for both your container images and for the node on which the containers run. For reference purposes, in both Kubernetes and Swarm clusters, MKE assigns a label to Windows nodes that includes the Windows Server version.
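
For reference, you can display the Windows build label on each Kubernetes node with kubectl; the node.kubernetes.io/windows-build key shown here is the one used in the node selector later in this section:

kubectl get nodes -L node.kubernetes.io/windows-build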

Run hybrid workloads in Kubernetes

To run Windows workloads in a hybrid Windows Kubernetes cluster, you must target your workloads to nodes that are running the correct Windows version. Failure to correctly target your workloads may result in an error when Kubernetes schedules the Pod on an incompatible node:

Error response from daemon: hcsshim::CreateComputeSystem win2019-deployment-no-nodeselect: The container operating system does not match the host operating system.
  1. Note the Windows version associated with each of the nodes in your cluster:

    kubectl get node
    

    Example output:

    NAME                         STATUS   ROLES    AGE   VERSION
    manager-node                 Ready    master   51m   v1.23.4-mirantis-1
    win2019-node                 Ready    <none>   44m   v1.23.4
    win2022-node                 Ready    <none>   38m   v1.23.4
    
  2. Create a deployment with the appropriate node selectors. Use 10.0.17763 for Windows Server 2019 workloads and 10.0.20348 for Windows Server 2022 workloads.

    For example purposes, paste the following content into a file called win2019-deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: win2019-deployment
      name: win2019-deployment
    spec:
      replicas: 5
      selector:
        matchLabels:
          app: win2019-deployment
      template:
        metadata:
          labels:
            app: win2019-deployment
          name: win2019-deployment
        spec:
          containers:
          - name: win2019-deployment
            image: mcr.microsoft.com/windows/nanoserver:1809
            command: ["cmd", "/c", "ping -t localhost"]
            ports:
            - containerPort: 80
          nodeSelector:
            kubernetes.io/os: windows
            node.kubernetes.io/windows-build: 10.0.17763
    
  3. Apply the deployment:

    kubectl apply -f win2019-deployment.yaml
    
  4. Verify that the Pods are scheduled on the required node:

    kubectl get pods -o wide
    

    Example output:

    NAME                                                READY   STATUS             RESTARTS      AGE     IP              NODE                      NOMINATED NODE   READINESS GATES
    win2019-deployment-57d75f6f9f-ldsqf                 1/1     Running            0             6m39s   192.168.50.76   win2019-node              <none>           <none>
    win2019-deployment-57d75f6f9f-n5b25                 1/1     Running            0             6m39s   192.168.50.79   win2019-node              <none>           <none>
    win2019-deployment-57d75f6f9f-r5mz6                 1/1     Running            0             6m39s   192.168.50.78   win2019-node              <none>           <none>
    win2019-deployment-57d75f6f9f-xggmt                 1/1     Running            0             6m39s   192.168.50.77   win2019-node              <none>           <none>
    win2019-deployment-57d75f6f9f-zltk2                 1/1     Running            0             7m7s    192.168.50.73   win2019-node              <none>           <none>
    
Run hybrid workloads in Swarm

To run Windows workloads in a hybrid Windows Swarm cluster, you must target your workloads to nodes that are running the correct Windows version. Failure to correctly target your workloads may result in an operating system mismatch error.

  1. Verify that nodes running the appropriate Windows version are present in the cluster. Use an OsVersion label of 10.0.17763 for Windows Server 2019 and 10.0.20348 for Windows Server 2022. For example:

    docker node ls -f "node.label=OsVersion=10.0.20348"
    

    Example output:

    ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
    yft1t1mnytt524y03zmdevzuk     win2022-node-1            Ready     Active                          20.10.12
    
  2. Create a service that runs the required version of Windows Server, in this case Windows Server 2022. The service requires the inclusion of various constraints, to ensure that it is scheduled on the correct node. For example:

    docker service create --name windows2022-example-service \
    --constraint "node.platform.OS == windows" \
    --constraint "node.labels.OsVersion == 10.0.20348" \
    mcr.microsoft.com/windows/nanoserver:ltsc2022 cmd "/c ping -t localhost"
    
  3. Verify that the service is scheduled on the required node:

    docker service ps windows2022-example-service
    

    Example output:

    ID             NAME                            IMAGE                                           NODE                      DESIRED STATE   CURRENT STATE           ERROR     PORTS
    uqrosib62602   windows2022-example-service.1   mcr.microsoft.com/windows/nanoserver:ltsc2022   win2022-node-1            Running         Running 9 minutes ago
    

Manage NodeLocalDNS

With NodeLocalDNS you can run a local instance of the DNS caching agent on each node in the MKE cluster. This can significantly improve cluster performance, compared to relying on a centralized CoreDNS instance to resolve external DNS records, as a local NodeLocalDNS instance can cache DNS results and eliminate the network latency factor.

Enable and disable NodeLocalDNS

The NodeLocalDNS feature is enabled and disabled through the MKE configuration file, comprehensive information for which is available at Use an MKE configuration file.

Prerequisites

Before installing NodeLocalDNS in an MKE cluster, verify the following MKE settings:

  • The unmanaged CNI plugin is not enabled.

  • kube-proxy is running in iptables mode.

Note

If you are running MKE on RHEL, CentOS, or Rocky Linux, review Troubleshoot NodeLocalDNS to learn about the issues that NodeLocalDNS has with these operating systems and their corresponding fixes.

Enable NodeLocalDNS

To install NodeLocalDNS in your MKE cluster:

  1. Modify the MKE configuration file to set the cluster_config.node_local_dns parameter to true.

    [cluster_config]
      node_local_dns = true
    
  2. Upload the modified MKE configuration file. Be aware that it may take up to five minutes for the changes to propagate through your cluster.

  3. Check for the following NodeLocalDNS resources to verify the presence of NodeLocalDNS in the cluster:

    • DaemonSet

      kubectl get ds -l k8s-app=node-local-dns -n kube-system
      

      Example output:

      NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
      node-local-dns   2         2         2       2            2           <none>          79s
      
    • Service

      kubectl get svc -l k8s-app=node-local-dns -n kube-system
      

      Example output:

      NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
      node-local-dns   ClusterIP   None         <none>        9253/TCP   100s
      
    • Pods

      kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns
      

      Example output:

      NAME                   READY   STATUS    RESTARTS   AGE    IP              NODE                         NOMINATED NODE   READINESS GATES
      node-local-dns-k9gns   1/1     Running   0          116s   172.31.44.29    ubuntu-18-ubuntu-1   <none>           <none>
      node-local-dns-zskp9   1/1     Running   0          116s   172.31.32.242   ubuntu-18-ubuntu-0   <none>           <none>
      
Disable NodeLocalDNS

To uninstall NodeLocalDNS:

  1. Modify the MKE configuration file to set the cluster_config.node_local_dns parameter to false.

    [cluster_config]
      node_local_dns = false
    
  2. Upload the modified MKE configuration file. Be aware that it may take up to five minutes for the changes to propagate through your cluster.

  3. Check for the following NodeLocalDNS resources to verify that NodeLocalDNS has been removed from the cluster:

    • DaemonSet

      kubectl get ds -l k8s-app=node-local-dns -n kube-system
      

      Example output:

      No resources found in kube-system namespace.
      
    • Service

      kubectl get svc -l k8s-app=node-local-dns -n kube-system
      

      Example output:

      No resources found in kube-system namespace.
      
    • Pods

      kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns
      

      Example output:

      No resources found in kube-system namespace.
      
Run DNS queries

With NodeLocalDNS installed on your MKE cluster, you can run queries from a Pod that has the DNS diagnostic utilities.

  1. Run the Pod:

    kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
    

    Example output:

    pod/dnsutils created
    
  2. Verify that Pod status has changed to Running:

    kubectl get pods dnsutils
    

    Example output:

    NAME      READY  STATUS   RESTARTS  AGE
    dnsutils  1/1    Running  0         26m
    
  3. Run the nslookup and dig DNS queries multiple times. With each repetition, the NodeLocalDNS cache hit count should increase.

    1. DNS query using nslookup:

      kubectl exec -i -t dnsutils -- nslookup example.com
      

      Example output:

      kubectl exec -i -t dnsutils -- nslookup example.com
      Server:                10.96.0.10
      Address:       10.96.0.10#53
      
      Non-authoritative answer:
      Name:  example.com
      Address: 93.184.215.14
      
    2. DNS query using dig:

      kubectl exec -i -t dnsutils -- dig +short @169.254.0.10 example.com
      

      Example output:

      93.184.215.14
      
  4. From a cluster node, check the NodeLocalDNS metrics for cache hits:

    curl http://localhost:9253/metrics | grep coredns_cache_hits_total
    

    Example output:

     curl http://localhost:9253/metrics | grep coredns_cache_hits_total
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP coredns_cache_hits_total The count of cache hits.
    # TYPE coredns_cache_hits_total counter
    coredns_cache_hits_total{server="dns://10.96.0.10:53",type="denial",view="",zones="."} 8
    coredns_cache_hits_total{server="dns://10.96.0.10:53",type="denial",view="",zones="cluster.local."} 71
    coredns_cache_hits_total{server="dns://10.96.0.10:53",type="success",view="",zones="."} 8
    coredns_cache_hits_total{server="dns://10.96.0.10:53",type="success",view="",zones="cluster.local."} 53
    coredns_cache_hits_total{server="dns://169.254.0.10:53",type="success",view="",zones="."} 6
    100 64055    0 64055    0     0  30.5M      0 --:--:-- --:--:-- --:--:-- 30.5M
    

Authorize role-based access

MKE allows administrators to authorize users to view, edit, and use cluster resources by granting role-based permissions for specific resource sets. This section describes how to configure all the relevant components of role-based access control (RBAC).

Refer to Role-based access control for detailed reference information.

Create organizations, teams, and users

This topic describes how to create organizations, teams, and users.

Note

  • Individual users can belong to multiple teams but a team can belong to only one organization.

  • New users have a default permission level that you can extend by adding the user to a team and creating grants. Alternatively, you can make the user an administrator to extend their permission level.

  • In addition to integrating with LDAP services, MKE provides built-in authentication. You must manually create users to use MKE built-in authentication.

Create an organization
  1. Log in to the MKE web UI as an administrator.

  2. Navigate to Access Control > Orgs & Teams > Create.

  3. Enter a unique organization name that is 1-100 characters in length and which does not contain any of the following:

    • Capital letters

    • Spaces

    • The following non-alphabetic characters: *+[]:;|=,?<>"'

  4. Click Create.

Create a team in the organization
  1. Log in to the MKE web UI as an administrator.

  2. Navigate to the required organization and click the plus icon in the top right corner to call the Create Team dialog.

  3. Enter a team name with a maximum of 100 characters.

  4. Optional. Enter a description for the team. Maximum: 140 characters.

  5. Click Create.

Add an existing user to a team
  1. Log in to the MKE web UI as an administrator.

  2. Navigate to the required team and click the plus sign in the top right corner.

  3. Select the users you want to include and click Add Users.

Create a user
  1. Log in to the MKE web UI as an administrator.

  2. Navigate to Access Control > Users > Create.

  3. Enter a unique user name that is 1-100 characters in length and which does not contain any of the following:

    • Capital letters

    • Spaces

    • The following non-alphabetic characters: *+[]:;|=,?<>"'

  4. Enter a password that contains at least 8 characters.

  5. Enter the full name of the user.

  6. Optional. Toggle IS A MIRANTIS KUBERNETES ENGINE ADMIN to Yes to give the user administrator privileges.

  7. Click Create.

Enable LDAP and sync teams and users

Once you enable LDAP you can sync your LDAP directory to the teams and users that are present in MKE.


To enable LDAP:

  1. Log in to the MKE web UI as an MKE administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Authentication & Authorization.

  3. Scroll down to the Identity Provider Integration section.

  4. Toggle LDAP to Enabled. A list of LDAP settings displays.

  5. Enter the values that correspond with your LDAP server installation.

  6. Use the built-in MKE LDAP Test login tool to confirm that your LDAP settings are correctly configured.


To synchronize LDAP users into MKE teams:

  1. In the left-side navigation panel, navigate to Access Control > Orgs & Teams and select an organization.

  2. Click + to create a team.

  3. Enter a team name and description.

  4. Toggle ENABLE SYNC TEAM MEMBERS to Yes.

  5. Choose between the following two methods for matching group members from an LDAP directory. Refer to the table below for more information.

    • Keep the default Match Search Results method and fill out the Search Base DN, Search filter, and Search subtree instead of just one level fields, as required.

    • Toggle LDAP MATCH METHOD to change the method for matching group members in the LDAP directory to Match Group Members.

  6. Optional. Select Immediately Sync Team Members to run an LDAP sync operation after saving the configuration for the team.

  7. Optional. To allow non-LDAP team members to sync the LDAP directory, select Allow non-LDAP members.

    Note

    If you do not select Allow non-LDAP members, manually-added and SAML users are removed during the LDAP sync.

  8. Click Create.

  9. Repeat the preceding steps to synchronize LDAP users into additional teams.


There are two methods for matching group members from an LDAP directory:

Bind method

Description

Match Search Results (search bind)

Specifies that team members are synced using a search query against the LDAP directory of your organization. The team membership is synced to match the users in the search results.

Search Base DN

The distinguished name of the node in the directory tree where the search starts looking for users.

Search filter

Filter to find users. If empty, existing users in the search scope are added as members of the team.

Search subtree instead of just one level

Defines search through the full LDAP tree, not just one level, starting at the base DN.

Match Group Members (direct bind)

Specifies that team members are synced directly with members of a group in your LDAP directory. The team membership syncs to match the membership of the group.

Group DN

The distinguished name of the group from which you select users.

Group Member Attribute

The value of this attribute corresponds to the distinguished names of the members of the group.

Define roles with authorized API operations

Roles define a set of API operations permitted for a resource set. You apply roles to users and teams by creating grants. Roles have the following important characteristics:

  • Roles are always enabled.

  • Roles cannot be edited. To change a role, you must delete it and create a new role with the changes you want to implement.

  • To delete roles used within a grant, you must first delete the grant.

  • Only administrators can create and delete roles.

This topic explains how to create custom Swarm roles and describes default and Swarm operations roles.

Default roles

The following describes the built-in roles:

Role

Description

None

Users have no access to Swarm or Kubernetes resources. Maps to No Access role in UCP 2.1.x.

View Only

Users can view resources but cannot create them.

Restricted Control

Users can view and edit resources but cannot run a service or container in a way that affects the node where it is running. Users cannot mount a node directory, exec into containers, or run containers in privileged mode or with additional kernel capabilities.

Scheduler

Users can view worker and manager nodes and schedule, but not view, workloads on these nodes. By default, all users are granted the Scheduler role for the Shared collection. To view workloads, users need Container View permissions.

Full Control

Users can view and edit all granted resources. They can create containers without any restriction, but cannot see the containers of other users.

To learn how to apply a default role using a grant, refer to Create grants.

Create a custom Swarm role

You can use default or custom roles.

To create a custom Swarm role:

  1. Log in to the MKE web UI.

  2. Click Access Control > Roles.

  3. Select the Swarm tab and click Create.

  4. On the Details tab, enter the role name.

  5. On the Operations tab, select the permitted operations for each resource type. For the operation descriptions, refer to Swarm operations roles.

  6. Click Create.

Note

  • The Roles page lists all applicable default and custom roles in the organization.

  • You can apply a role with the same name to different resource sets.

To learn how to apply a custom role using a grant, refer to Create grants.

Swarm operations roles

The following describes the set of operations (calls) that you can execute to the Swarm resources. Each permission corresponds to a CLI command and enables the user to execute that command. Refer to the Docker CLI documentation for a complete list of commands and examples.

Operation

Command

Description

Config

docker config

Manage Docker configurations.

Container

docker container

Manage Docker containers.

Container

docker container create

Create a new container.

Container

docker create [OPTIONS] IMAGE [COMMAND] [ARG...]

Create new containers.

Container

docker update [OPTIONS] CONTAINER [CONTAINER...]

Update configuration of one or more containers. Using this command can also prevent containers from consuming too many resources from their Docker host.

Container

docker rm [OPTIONS] CONTAINER [CONTAINER...]

Remove one or more containers.

Image

docker image COMMAND

Manage images.

Image

docker image remove

Remove one or more images.

Network

docker network

Manage networks. You can use child commands to create, inspect, list, remove, prune, connect, and disconnect networks.

Node

docker node COMMAND

Manage Swarm nodes.

Secret

docker secret COMMAND

Manage Docker secrets.

Service

docker service COMMAND

Manage services.

Volume

docker volume create [OPTIONS] [VOLUME]

Create a new volume that containers can consume and store data in.

Volume

docker volume rm [OPTIONS] VOLUME [VOLUME...]

Remove one or more volumes. Users cannot remove a volume that is in use by a container.

Use collections and namespaces

MKE enables access control to cluster resources by grouping them into two types of resource sets: Swarm collections (for Swarm workloads) and Kubernetes namespaces (for Kubernetes workloads). Refer to Role-based access control for a description of the difference between Swarm collections and Kubernetes namespaces. Administrators use grants to combine resource sets, giving users permission to access specific cluster resources.

Swarm collection labels

Users assign resources to collections with labels. The following resource types have editable labels and thus you can assign them to collections: services, nodes, secrets, and configs. For these resource types, change com.docker.ucp.access.label to move a resource to a different collection. Collections have generic names by default, but you can assign them meaningful names as required (such as dev, test, and prod).

Note

The following resource types do not have editable labels and thus you cannot assign them to collections: containers, networks, and volumes.

Groups of resources identified by a shared label are called stacks. You can place one stack of resources in multiple collections. MKE automatically places resources in the default collection. Users can change this using a specific com.docker.ucp.access.label in the stack/compose file.
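
As an illustrative sketch, the following creates a service directly in a specific collection by setting the access label at creation time. The service name, image, and collection path are examples only; use the full path of your collection as it appears in the MKE web UI:

docker service create --name web-dev \
  --label com.docker.ucp.access.label="/Shared/dev" \
  nginx:latest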

The system uses com.docker.ucp.collection.* to enable efficient resource lookup. You do not need to manage these labels, as MKE controls them automatically. Nodes have the following labels set to true by default:

  • com.docker.ucp.collection.root

  • com.docker.ucp.collection.shared

  • com.docker.ucp.collection.swarm

Default and built-in Swarm collections

This topic describes both MKE default and built-in Swarm collections.


Default Swarm collections

Each user has a default collection, which can be changed in the MKE preferences.

Every resource must belong to a collection in order to be deployed. When a user deploys a resource without using an access label to specify its collection, MKE automatically places the resource in the default collection.

Default collections are useful for the following types of users:

  • Users who work only on a well-defined portion of the system

  • Users who deploy stacks but do not want to edit the contents of their compose files

Custom collections are appropriate for users with more complex roles in the system, such as administrators.

Note

For those using Docker Compose, the system applies default collection labels across all resources in the stack unless you explicitly set com.docker.ucp.access.label.

Built-in Swarm collections

MKE includes the following built-in Swarm collections:

Built-in Swarm collection

Description

/

Path to all resources in the Swarm cluster. Resources not in a collection are put here.

/System

Path to MKE managers, MSR nodes, and MKE/MSR system services. By default, only administrators have access to this collection.

/Shared

Path to a user’s private collection. Private collections are not created until the user logs in for the first time.

/Shared/Private

Path to a user’s private collection. Private collections are not created until the user logs in for the first time.

/Shared/Legacy

Path to the access control labels of legacy versions (UCP 2.1 and earlier).

Group and isolate cluster resources

This topic describes how to group and isolate cluster resources into swarm collections and Kubernetes namespaces.

Log in to the MKE web UI as an administrator and complete the following steps:

To create a Swarm collection:

  1. Navigate to Shared Resources > Collections.

  2. Click View Children next to Swarm.

  3. Click Create Collection.

  4. Enter a collection name and click Create.


To move a resource to a different collection:

  1. In the left-side navigation panel, navigate to the resource type you want to move and click it. As an example, navigate to and click on Shared Resources > Nodes.

  2. Click the node you want to move to display the information window for that node.

  3. Click the slider icon at the top right of the information window to display the edit dialog for the node.

  4. Scroll down to Labels and change the com.docker.ucp.access.label swarm label to the name of your collection.

    Note

    Optionally, you can navigate to Collection in the left-side navigation panel and select the collection to which you want to move the resource.


To create a Kubernetes namespace:

  1. Navigate to Kubernetes > Namespaces and click Create.

  2. Leave the Namespace drop-down blank.

  3. Paste the following in the Object YAML editor:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: namespace-name
    
  4. Click Create.

Note

For more information on assigning resources to a particular namespace, refer to Kubernetes Documentation: Namespaces Walkthrough.

See also

Kubernetes


Create grants

MKE administrators create grants to control how users and organizations access resource sets. A grant defines user permissions to access resources. Each grant associates one subject with one role and one resource set. For example, you can grant the Prod Team Restricted Control over services in the /Production collection.

The following is a common workflow for creating grants:

  1. Manually create the subjects: users, teams, and organizations (refer to Create organizations, teams, and users).

  2. Define custom roles (or use defaults) by adding permitted API operations per type of resource.

  3. Group cluster resources into Swarm collections or Kubernetes namespaces.

  4. Create grants by combining subject, role, and resource set.

Note

This section assumes that you have created the relevant objects for the grant, including the subject, role, and resource set (Kubernetes namespace or Swarm collection).

To create a Kubernetes grant:

  1. Log in to the MKE web UI.

  2. Navigate to Access Control > Grants.

  3. Select the Kubernetes tab and click Create Role Binding.

  4. Under Subject, select Users, Organizations, or Service Account.

    • For Users, select the user from the pull-down menu.

    • For Organizations, select the organization and, optionally, the team from the pull-down menu.

    • For Service Account, select the namespace and service account from the pull-down menu.

  5. Click Next to save your selections.

  6. Under Resource Set, select the target namespace or toggle the switch labeled Apply Role Binding to all namespaces (Cluster Role Binding).

  7. Click Next.

  8. Under Role, select a cluster role.

  9. Click Create.
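Because the MKE web UI surfaces Kubernetes grants as role bindings, you can create an equivalent binding with kubectl through an administrator client bundle. The following is a sketch only; the binding name, user, and namespace are hypothetical:

# Bind the built-in "view" cluster role to a single user in one namespace.
kubectl create rolebinding example-binding \
--clusterrole=view \
--user=jdoe \
--namespace=example-namespace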


To create a Swarm grant:

  1. Log in to the MKE web UI.

  2. Navigate to Access Control > Grants.

  3. Select the Swarm tab and click Create Grant.

  4. Under Subject, select Users or Organizations.

    • For Users, select a user from the pull-down menu.

    • For Organizations, select the organization and, optionally, the team from the pull-down menu.

  5. Click Next to save your selections.

  6. Under Resource Set, click View Children until the required collection displays.

  7. Click Select Collection next to the required collection.

  8. Click Next.

  9. Under Role, select a role type from the drop-down menu.

  10. Click Create.

Note

MKE places new users in the docker-datacenter organization by default. To apply permissions to all MKE users, create a grant with the docker-datacenter organization as a subject.

Grant users permission to pull images

By default, only administrators can pull images into a cluster managed by MKE. This topic describes how to give non-administrator users permission to pull images.

Images are always in the Swarm collection, as they are a shared resource. Grant users the Image Create permission for the Swarm collection to allow them to pull images.

To grant a user permission to pull images:

  1. Log in to the MKE web UI as an administrator.

  2. Navigate to Access Control > Roles.

  3. Select the Swarm tab and click Create.

  4. On the Details tab, enter Pull images for the role name.

  5. On the Operations tab, select Image Create from the IMAGE OPERATIONS drop-down.

  6. Click Create.

  7. Navigate to Access Control > Grants.

  8. Select the Swarm tab and click Create Grant.

  9. Under Subject, click Users and select the required user from the drop-down.

  10. Click Next.

  11. Under Resource Set, select the Swarm collection and click Next.

  12. Under Role, select Pull images from the drop-down.

  13. Click Create.

Reset passwords

This topic describes how to reset passwords for users and administrators.

To change a user password in MKE:

  1. Log in to the MKE web UI with administrator credentials.

  2. Click Access Control > Users.

  3. Select the user whose password you want to change.

  4. Click the gear icon in the top right corner.

  5. Select Security from the left navigation.

  6. Enter the new password, confirm that it is correct, and click Update Password.

Note

For users managed with an LDAP service, you must change user passwords on the LDAP server.

To change an administrator password in MKE:

  1. SSH to an MKE manager node and run:

    docker run --net=host -v ucp-auth-api-certs:/tls -it \
    "$(docker inspect --format \
    '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' \
    ucp-auth-api)" \
    "$(docker inspect --format \
    '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' \
    ucp-auth-api)" \
    passwd -i
    
  2. Optional. If you have DEBUG set as your global log level within MKE, running docker inspect --format '{{ index .Spec.TaskTemplate.ContainerSpec.Args 0 }}' ucp-auth-api returns --debug instead of --db-addr.

    Pass Args 1 to docker inspect instead to reset your administrator password:

    docker run --net=host -v ucp-auth-api-certs:/tls -it \
    "$(docker inspect --format \
    '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' \
    ucp-auth-api)" \
    "$(docker inspect --format \
    '{{ index .Spec.TaskTemplate.ContainerSpec.Args 1 }}' \
    ucp-auth-api)" \
    passwd -i
    

Note

Alternatively, ask another administrator to change your password.

RBAC tutorials

This section contains a collection of tutorials that explain how to use RBAC in a variety of scenarios.

Deploy a simple stateless app with RBAC

This topic describes how to deploy an NGINX web server, limiting access to one team using role-based access control (RBAC).

You are the MKE system administrator and will configure permissions to company resources using a four-step process:

  1. Build the organization with teams and users.

  2. Define roles with allowable operations per resource type, such as permission to run containers.

  3. Create collections or namespaces for accessing actual resources.

  4. Create grants that join team, role, and resource set.


To deploy a simple stateless app with RBAC:

  1. Build the organization:

    1. Log in to the MKE web UI.

    2. Add an organization called company-datacenter.

    3. Create three teams according to the following structure:

      Team

      Users

      DBA

      Alex

      Dev

      Bett

      Ops

      Alex, Chad

  2. Deploy NGINX with Kubernetes:

    1. Create a namespace:

      1. Click Kubernetes > Create.

      2. Paste the following manifest in the Object YAML editor and click Create.

        apiVersion: v1
        kind: Namespace
        metadata:
          name: nginx-namespace
        
    2. Create a role for the Ops team called kube-deploy:

      1. Click Kubernetes > Create.

      2. Select nginx-namespace from the Namespace drop-down.

      3. Paste the following manifest in the Object YAML editor and click Create.

        apiVersion: rbac.authorization.k8s.io/v1
        kind: Role
        metadata:
          name: kube-deploy
        rules:
        - apiGroups: ["*"]
          resources: ["*"]
          verbs: ["*"]
        
    3. Create a role binding, to allow the Ops team to deploy applications to nginx-namespace:

      1. Click Access Control > Grants.

      2. Select the Kubernetes tab and click Create Role Binding.

      3. Under Subject, select Organizations and configure Organization as company-datacenter and Team as Ops.

      4. Click Next.

      5. Under Resource Set, select nginx-namespace and click Next.

      6. Under Role, select the kube-deploy role and click Create.

    4. Deploy an application as a member of the Ops team:

      1. Log in to the MKE web UI as Chad, a member of the Ops team.

      2. Click Kubernetes > Create.

      3. Select nginx-namespace from the Namespace drop-down.

      4. Paste the following manifest in the Object YAML editor and click Create.

        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: nginx-deployment
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: nginx
          template:
            metadata:
              labels:
                app: nginx
            spec:
              containers:
              - name: nginx
                image: nginx:latest
                ports:
                - containerPort: 80
        
  3. Verify that Ops team members can view the nginx-deployment resources:

    1. Log in to the MKE web UI as Alex, a member of the Ops team.

    2. Click Kubernetes > Controllers.

    3. Confirm the presence of NGINX deployment and ReplicaSet.

  4. Verify that Dev team members cannot view the nginx-deployment resources:

    1. Log in to the MKE web UI as Bett, who is not a member of the Ops team.

    2. Click Kubernetes > Controllers.

    3. Confirm that NGINX deployment and ReplicaSet are not present.

  5. Deploy NGINX as a Swarm service:

    1. Create a collection for NGINX resources called nginx-collection nested under the Shared collection. To view child collections, click View Children.

    2. Create a simple role for the Ops team called Swarm Deploy.

    3. Create a grant for the Ops team to access the nginx-collection with the Swarm Deploy custom role.

    4. Log in to the MKE web UI as Chad on the Ops team.

    5. Click Swarm > Services > Create.

    6. On the Details tab, enter the following:

      • Name: nginx-service

      • Image: nginx:latest

    7. On the Collection tab, click View Children next to Swarm and then next to Shared.

    8. Click nginx-collection, then click Create.

    9. Sign in as each user and verify that the following users cannot see nginx-collection:

      • Alex on the DBA team

      • Bett on the Dev team
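To spot-check the namespace permissions from steps 3 and 4 above on the command line, run the following with each user's client bundle loaded. This is a sketch only and assumes the users have downloaded client bundles:

# With Alex's (Ops) bundle, the deployment is listed.
# With Bett's (Dev) bundle, the request is denied.
kubectl get deployments --namespace nginx-namespace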

Isolate volumes to specific teams

This topic describes how to grant two teams access to separate volumes in two different resource collections such that neither team can see the volumes of the other team. MKE allows you to do this even if the volumes are on the same nodes.

To create two teams:

  1. Log in to the MKE web UI.

  2. Navigate to Orgs & Teams.

  3. Create two teams in the engineering organization named Dev and Prod.

  4. Add a non-admin MKE user to the Dev team.

  5. Add a non-admin MKE user to the Prod team.

To create two resource collections:

  1. Create a Swarm collection called dev-volumes nested under the Shared collection.

  2. Create a Swarm collection called prod-volumes nested under the Shared collection.

To create grants for controlling access to the new volumes:

  1. Create a grant for the Dev team to access the dev-volumes collection with the Restricted Control built-in role.

  2. Create a grant for the Prod team to access the prod-volumes collection with the Restricted Control built-in role.

To create a volume as a team member:

  1. Log in as one of the users on the Dev team.

  2. Navigate to Swarm > Volumes and click Create.

  3. On the Details tab, name the new volume dev-data.

  4. On the Collection tab, navigate to the dev-volumes collection and click Create.

  5. Log in as one of the users on the Prod team.

  6. Navigate to Swarm > Volumes and click Create.

  7. On the Details tab, name the new volume prod-data.

  8. On the Collection tab, navigate to the prod-volumes collection and click Create.

As a result, the user on the Prod team cannot see the Dev team volumes, and the user on the Dev team cannot see the Prod team volumes. MKE administrators can see all of the volumes created by either team.
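The equivalent volume creation from the command line looks like the following. This is a sketch only, run with a Dev team member client bundle, and it assumes that your environment assigns volumes to collections through the com.docker.ucp.access.label label:

# Create the Dev team volume directly in the dev-volumes collection.
docker volume create \
--label com.docker.ucp.access.label="/Shared/dev-volumes" \
dev-data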

Isolate nodes

You can use MKE to physically isolate resources by organizing nodes into collections and granting Scheduler access for different users. Control access to nodes by moving them to dedicated collections where you can grant access to specific users, teams, and organizations.

The following tutorials explain how to isolate nodes using Swarm and Kubernetes.

Isolate cluster nodes with Swarm

This tutorial explains how to give a team access to a node collection and a resource collection. MKE access control ensures that team members cannot view or use Swarm resources that are not in their collection.

Note

You need an MKE license and at least two worker nodes to complete this tutorial.

The following is a high-level overview of the steps you will take to isolate cluster nodes:

  1. Create an Ops team and assign a user to it.

  2. Create a Prod collection for the team node.

  3. Assign a worker node to the Prod collection.

  4. Grant the Ops teams access to its collection.


To create a team:

  1. Log in to the MKE web UI.

  2. Create a team named Ops in your organization.

  3. Add a user to the team who is not an administrator.


To create the team collections:

In this example, the Ops team uses a collection for its assigned nodes and another for its resources.

  1. Create a Swarm collection called Prod nested under the Swarm collection.

  2. Create a Swarm collection called Webserver nested under the Prod collection.

The Prod collection is for the worker nodes and the Webserver sub-collection is for an application that you will deploy on the corresponding worker nodes.


To move a worker node to a different collection:

Note

MKE places worker nodes in the Shared collection by default, and it places those running MSR in the System collection.

  1. Navigate to Shared Resources > Nodes to view all of the nodes in the swarm.

  2. Find a node located in the Shared collection. You cannot move worker nodes that are assigned to the System collection.

  3. Click the slider icon on the node details page.

  4. In the Labels section on the Details tab, change com.docker.ucp.access.label from /Shared to /Prod.

  5. Click Save to move the node to the Prod collection.


To create two grants for team access to the two collections:

  1. Create a grant for the Ops team to access the Webserver collection with the built-in Restricted Control role.

  2. Create a grant for the Ops team to access the Prod collection with the built-in Scheduler role.

The cluster is now set up for node isolation. Users with access to nodes in the Prod collection can deploy Swarm services and Kubernetes apps. They cannot, however, schedule workloads on nodes that are not in the collection.


To deploy a Swarm service as a team member:

When a user deploys a Swarm service, MKE assigns its resources to the default collection. As a user on the Ops team, set Webserver to be your default collection.

Note

From the resource target collection, MKE walks up the ancestor collections until it finds the highest ancestor that the user has Scheduler access to. MKE schedules tasks on any nodes in the tree below this ancestor. In this example, MKE assigns the user service to the Webserver collection and schedules tasks on nodes in the Prod collection.

  1. Log in as a user on the Ops team.

  2. Navigate to Shared Resources > Collections.

  3. Navigate to the Webserver collection.

  4. Under the vertical ellipsis menu, select Set to default.

  5. Navigate to Swarm > Services and click Create to create a Swarm service.

  6. Name the service NGINX, enter nginx:latest in the Image* field, and click Create.

  7. Click the NGINX service when it turns green.

  8. Scroll down to TASKS, click the NGINX container, and confirm that it is in the Webserver collection.

  9. Navigate to the Metrics tab on the container page, select the node, and confirm that it is in the Prod collection.

Note

An alternative approach is to use a grant instead of changing the default collection. An administrator can create a grant for a role that has the Service Create permission for the Webserver collection or a child collection. In this case, the user sets the value of com.docker.ucp.access.label to the new collection or one of its children that has a Service Create grant for the required user.
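For example, the grant-based alternative might look like the following from the command line. This is a sketch only and assumes the user holds a Service Create grant on the Webserver collection:

# Deploy the service directly into the Webserver collection.
docker service create \
--name nginx \
--label com.docker.ucp.access.label="/Prod/Webserver" \
nginx:latest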

Isolate cluster nodes with Kubernetes

This topic describes how to use a Kubernetes namespace to deploy a Kubernetes workload to worker nodes using the MKE web UI.

MKE uses the scheduler.alpha.kubernetes.io/node-selector annotation key to assign node selectors to namespaces. Setting this annotation on a namespace pins every application deployed in that namespace to the nodes that carry the matching label.

To isolate cluster nodes with Kubernetes:

  1. Create a Kubernetes namespace.

    Note

    You can also associate nodes with a namespace by providing the namespace definition information in a configuration file.

    1. Log in to the MKE web UI as an administrator.

    2. In the left-side navigation panel, navigate to Kubernetes and click Create to open the Create Kubernetes Object page.

    3. Paste the following in the Object YAML editor:

      apiVersion: v1
      kind: Namespace
      metadata:
        name: namespace-name
      
    4. Click Create to create the namespace-name namespace.

  2. Grant access to the Kubernetes namespace:

    1. Create a role binding for a user of your choice to access the namespace-name namespace with the built-in cluster-admin Cluster Role.

  3. Associate nodes with the namespace:

    1. From the left-side navigation panel, navigate to Shared Resources > Nodes.

    2. Select the required node.

    3. Click the Edit Node icon in the upper-right corner.

    4. Scroll down to the Kubernetes Labels section and click Add Label.

    5. In the Key field, enter zone.

    6. In the Value field, enter example-zone.

    7. Click Save.

    8. Add a scheduler node selector annotation as part of the namespace definition:

      apiVersion: v1
      kind: Namespace
      metadata:
        annotations:
          scheduler.alpha.kubernetes.io/node-selector: zone=example-zone
        name: ops-nodes
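If you prefer the command line, you can perform the same label and annotation steps with kubectl through an administrator client bundle. A minimal sketch; the node name is a placeholder:

# Label the node so that the namespace node selector can match it.
kubectl label node <node-name> zone=example-zone

# Attach the node selector annotation to the existing namespace.
kubectl annotate namespace ops-nodes \
scheduler.alpha.kubernetes.io/node-selector=zone=example-zone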
      
Set up access control architecture

This tutorial explains how to set up a complete access architecture for a fictitious company called OrcaBank.

OrcaBank is reorganizing their application teams by product with each team providing shared services as necessary. Developers at OrcaBank perform their own DevOps and deploy and manage the lifecycle of their applications.

OrcaBank has four teams with the following resource needs:

  • Security needs view-only access to all applications in the cluster.

  • DB (database) needs full access to all database applications and resources.

  • Mobile needs full access to their mobile applications and limited access to shared DB services.

  • Payments needs full access to their payments applications and limited access to shared DB services.

OrcaBank is taking advantage of the flexibility in the MKE grant model by applying two grants to each application team. One grant allows each team to fully manage the apps in their own collection, and the second grant gives them the (limited) access they need to networks and secrets within the db collection.

The resulting access architecture has applications connecting across collection boundaries. By assigning multiple grants per team, the Mobile and Payments applications teams can connect to dedicated database resources through a secure and controlled interface, leveraging database networks and secrets.

Note

MKE deploys all resources across the same group of worker nodes while providing the option to segment nodes.


To set up a complete access control architecture:

  1. Set up LDAP/AD integration and create the required teams.

    OrcaBank will standardize on LDAP for centralized authentication to help their identity team scale across all the platforms they manage.

    To implement LDAP authentication in MKE, OrcaBank is using the MKE native LDAP/AD integration to map LDAP groups directly to MKE teams. You can add or remove users from MKE teams via LDAP, which the OrcaBank identity team will centrally manage.

    1. Enable LDAP in MKE and sync your directory.

    2. Create the following teams: Security, DB, Mobile, and Payments.

  2. Define the required roles:

    1. Define an Ops role that allows users to perform all operations against configs, containers, images, networks, nodes, secrets, services, and volumes.

    2. Define a View & Use Networks + Secrets role that enables users to view and connect to networks and view and use secrets used by DB containers, but that prevents them from seeing or impacting the DB applications themselves.

    Note

    You will also use the built-in View Only role that allows users to see all resources, but not edit or use them.

  3. Create the required Swarm collections.

    All OrcaBank applications share the same physical resources, so all nodes and applications are configured in collections that nest under the built-in Shared collection.

    Create the following collections:

    • /Shared/mobile to host all mobile applications and resources.

    • /Shared/payments to host all payments applications and resources.

    • /Shared/db to serve as a top-level collection for all db resources.

    • /Shared/db/mobile to hold db resources for mobile applications.

    • /Shared/db/payments to hold db resources for payments applications.

    Note

    The OrcaBank grant composition will ensure that the Swarm collection architecture gives the DB team access to all db resources and restricts app teams to shared db resources.

  4. Create the required grants:

    1. For the Security team, create grants to access the following collections with the View Only built-in role: /Shared/mobile, /Shared/payments, /Shared/db, /Shared/db/mobile, and /Shared/db/payments.

    2. For the DB team, create grants to access the /Shared/db, /Shared/db/mobile, and /Shared/db/payments collections with the Ops custom role.

    3. For the Mobile team, create a grant to access the /Shared/mobile collection with the Ops custom role.

    4. For the Mobile team, create a grant to access the /Shared/db/mobile collection with the View & Use Networks + Secrets custom role.

    5. For the Payments team, create a grant to access the /Shared/payments collection with the Ops custom role.

    6. For the Payments team, create a grant to access the /Shared/db/payments collection with the View & Use Networks + Secrets custom role.

Set up access control architecture with additional security requirements

Caution

Complete the Set up access control architecture tutorial before you attempt this advanced tutorial.

In the previous tutorial, you assigned multiple grants to resources across collection boundaries on a single platform. In this tutorial, you will implement the following stricter security requirements for the fictitious company, OrcaBank:

  • OrcaBank is adding a staging zone to their deployment model, deploying applications first from development, then from staging, and finally from production.

  • OrcaBank will no longer permit production applications to share any physical infrastructure with non-production infrastructure. They will use node access control to segment application scheduling and access.

    Note

    Node access control is an MKE feature that provides secure multi-tenancy with node-based isolation. Use it to place nodes in different collections so that you can schedule and isolate resources on disparate physical or virtual hardware. For more information, refer to Isolate nodes.

OrcaBank will still use its three application teams from the previous tutorial (DB, Mobile, and Payments) but with varying levels of segmentation between them. The new access architecture will organize the MKE cluster into staging and production collections with separate security zones on separate physical infrastructure.

The four OrcaBank teams now have the following production and staging needs:

  • Security needs view-only access to all applications in production and no access to staging.

  • DB needs full access to all database applications and resources in production and no access to staging.

  • In both production and staging, Mobile needs full access to their applications and limited access to shared DB services.

  • In both production and staging, Payments needs full access to their applications and limited access to shared DB services.

The resulting access architecture will provide physical segmentation between production and staging using node access control.

Applications are scheduled only on MKE worker nodes in the dedicated application collection. Applications use shared resources across collection boundaries to access the databases in the /prod/db collection.


To set up a complete access control architecture with additional security requirements:

  1. Verify LDAP, teams, and roles are set up properly:

    1. Verify LDAP is enabled and syncing. If it is not, configure that now.

    2. Verify the following teams are present in your organization: Security, DB, Mobile, and Payment, and if they are not, create them.

    3. Verify that there is a View & Use Networks + Secrets role. If there is not, define a View & Use Networks + Secrets role that enables users to view and connect to networks and view and use secrets used by DB containers. Configure the role so that it prevents those who use it from seeing or impacting the DB applications themselves.

    Note

    You will also use the following built-in roles:

    • View Only allows users to see but not edit all cluster resources.

    • Full Control allows users complete control of all collections granted to them. They can also create containers without restriction but cannot see the containers of other users. This role will replace the custom Ops role from the previous tutorial.

  2. Create the required Swarm collections.

    In the previous tutorial, OrcaBank created separate collections for each application team and nested them all under /Shared.

    To meet their new security requirements for production, OrcaBank will add top-level prod and staging collections with mobile and payments application collections nested underneath. The prod collection (but not the staging collection) will also include a db collection with a second set of mobile and payments collections nested underneath.

    OrcaBank will also segment their nodes such that the production and staging zones will have dedicated nodes, and in production each application will be on a dedicated node.

    Create the following collections:

    • /prod

    • /prod/mobile

    • /prod/payments

    • /prod/db

    • /prod/db/mobile

    • /prod/db/payments

    • /staging

    • /staging/mobile

    • /staging/payments

  3. Create the required grants as described in Create grants:

    1. For the Security team, create grants to access the following collections with the View Only built-in role: /prod, /prod/mobile, /prod/payments, /prod/db, /prod/db/mobile, and /prod/db/payments.

    2. For the DB team, create grants to access the following collections with the Full Control built-in role: /prod/db, /prod/db/mobile, and /prod/db/payments.

    3. For the Mobile team, create grants to access the /prod/mobile and /staging/mobile collections with the Full Control built-in role.

    4. For the Mobile team, create a grant to access the /prod/db/mobile collection with the View & Use Networks + Secrets custom role.

    5. For the Payments team, create grants to access the /prod/payments and /staging/payments collections with the Full Control built-in role.

    6. For the Payments team, create a grant to access the /prod/db/payments collection with the View & Use Networks + Secrets custom role.

Upgrades and migrations

Upgrade an MKE installation

Note

Prior to upgrading MKE, review the MKE release notes for information that may be relevant to the upgrade process.

In line with your MKE upgrade, you should plan to upgrade the Mirantis Container Runtime (MCR) instance on each cluster node to version 20.10.0 or later. Mirantis recommends that you schedule the upgrade for non-business hours to ensure minimal user impact.

Important

Do not make changes to your MKE configuration while upgrading, as doing so can cause misconfiguration.

Semantic versioning

MKE uses semantic versioning. While downgrades are not supported, Mirantis supports upgrades according to the following rules:

  • When you upgrade from one patch version to another, you can skip patch versions as no data migration takes place between patch versions.

  • When you upgrade between minor releases, you cannot skip releases. You can, however, upgrade from any patch version from the previous minor release to any patch version of the subsequent minor release.

  • When you upgrade between major releases, you cannot skip releases.

Warning

Upgrading from one MKE minor version to another minor version can result in a downgrading of MKE middleware components. For more information, refer to the component listings in the release notes of both the source and target MKE versions.

Supported upgrade paths

Description

From

To

Supported

Patch upgrade

x.y.0

x.y.1

Yes

Skip patch version

x.y.0

x.y.2

Yes

Patch downgrade

x.y.2

x.y.1

No

Minor upgrade

x.y.*

x.y+1.*

Yes

Skip minor version

x.y.*

x.y+2.*

No

Minor downgrade

x.y.*

x.y-1.*

No

Major upgrade

x.y.z

x+1.0.0

Yes

Major upgrade skipping minor version

x.y.z

x+1.y+1.z

No

Skip major version

x.*.*

x+2.*.*

No

Major downgrade

x.*.*

x-1.*.*

No

Automated rollbacks

As of MKE 3.7.0, MKE supports automated rollbacks: if an MKE upgrade fails for any reason, the system automatically reverts to the previously running MKE version, thus ensuring that the cluster remains in a usable state.

Note

  • Rollback will be automatically initiated in the event that any step of the upgrade process does not progress within 20 minutes.

  • The automated rollbacks feature is enabled by default. To opt out of the function, refer to the MKE upgrade CLI command documentation.

Verify your environment

Before you perform the environment verifications necessary to ensure a smooth upgrade, Mirantis recommends that you run upgrade checks:

docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp \
upgrade checks [command options]

This process confirms:

  • Port availability

  • Sufficient memory and disk space

  • Supported OS version is in use

  • Existing backup availability


To perform system verifications:

  1. Verify time synchronization across all nodes and assess time daemon logs for any large time drifting.

  2. Verify that PROD=4vCPU/16GB system requirements are met for MKE managers and MSR replicas.

  3. Verify that your port configurations meet all MKE, MSR, and MCR port requirements.

  4. Verify that your cluster nodes meet the minimum requirements.

  5. Verify that you meet all minimum hardware and software requirements.

Note

Azure installations have additional prerequisites. Refer to Install MKE on Azure for more information.


To perform storage verifications:

  1. Verify that no more than 70% of /var/ storage is used. If more than 70% is used, allocate enough storage to meet this requirement. Refer to MKE hardware requirements for the minimum and recommended storage requirements.

  2. Verify whether any node local file systems have disk storage issues, including MSR backend storage, for example, NFS.

  3. Verify that you are using Overlay2 storage drivers, as they are more stable. If you are not, you should transition to Overlay2 at this time. Transitioning from device mapper to Overlay2 is a destructive rebuild.
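A quick way to spot-check both items on each node is shown below; this is a sketch only:

# Confirm that /var usage is below 70 percent.
df -h /var

# Confirm the storage driver in use (overlay2 is recommended).
docker info --format '{{ .Driver }}'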


To perform operating system verifications:

  1. Patch all relevant packages to the most recent cluster node operating system version, including the kernel.

  2. Perform a rolling restart of each node to confirm that the in-memory settings match the startup scripts.

  3. After performing the rolling restarts, run check-config.sh on each cluster node to check for kernel compatibility issues.


To perform procedural verifications:

  1. Perform Swarm, MKE, and MSR backups.

  2. Gather Compose, service, and stack files.

  3. Generate an MKE support bundle for this specific point in time.

  4. Preinstall MKE, MSR, and MCR images. If your cluster does not have an Internet connection, Mirantis provides tarballs containing all the required container images. If your cluster does have an Internet connection, pull the required container images onto your nodes:

    $ docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.7.16 images \
    --list | xargs -L 1 docker pull
    
  5. Load troubleshooting packages, for example, netshoot.


To upgrade MCR:

The MKE upgrade requires MCR 20.10.0 or later to be running on every cluster node. If it is not, perform the following steps first on manager and then on worker nodes:

  1. Log in to the node using SSH.

  2. Upgrade MCR to version 20.10.0 or later.

  3. Using the MKE web UI, verify that the node is in a healthy state:

    1. Log in to the MKE web UI.

    2. Navigate to Shared Resources > Nodes.

    3. Verify that the node is healthy and a part of the cluster.

Caution

Mirantis recommends upgrading in the following order: MCR, MKE, MSR. This topic is limited to the upgrade instructions for MKE.


To perform cluster verifications:

  1. Verify that your cluster is in a healthy state, as it will be easier to troubleshoot should a problem occur.

  2. Create a backup of your cluster, thus allowing you to recover should something go wrong during the upgrade process.

  3. Verify that the Docker engine is running on all MKE cluster nodes.

Note

You cannot use the backup archive during the upgrade process, as it is version specific. For example, if you create a backup archive for an MKE 3.4.2 cluster, you cannot use the archive file after you upgrade to MKE 3.4.4.

Perform the upgrade

Note

  • If the MKE Interlock configuration is customized, the Interlock component is managed by the user and thus cannot be upgraded using the upgrade command. In such cases, Interlock must be manually upgraded using Docker, as follows:

    docker service update --image mirantis/ucp-interlock:<upgrade-target-mke-version> ucp-interlock
    docker service update --image mirantis/ucp-interlock-extension:<upgrade-target-mke-version> ucp-interlock-extension
    docker service update --image mirantis/ucp-interlock-proxy:<upgrade-target-mke-version> ucp-interlock-proxy
    
  • To upgrade MKE on machines that are not connected to the Internet, refer to Install MKE offline to learn how to download the MKE package for offline installation.

  • To manually interrupt the upgrade process, enter Control-C on the terminal upon which you have initiated the upgrade bootstrapper. Doing so will trigger an automatic rollback to the previous MKE version.

  • If no upgrade progress is made within 20 minutes, MKE will initiate a rollback to the original version.

MKE supports the following upgrade methods, described in the table below.

With all of these upgrade methods, manager nodes are automatically upgraded in place. You cannot control the order of manager node upgrades. For each worker node that requires an upgrade, you can upgrade that node in place or you can replace the node with a new worker node. The type of upgrade you perform depends on what is needed for each node.

Automated rollbacks are only supported when MKE is in control of the upgrade process, which is while the upgrade containers are running. As such, the feature scope is limited in terms of any failures encountered during Phased in-place cluster upgrade and Replace existing worker nodes using blue-green deployment upgrade methods.

Upgrade method

Description

Automated rollback support

Automated in-place cluster upgrade

Performed on any manager node. This method automatically upgrades the entire cluster.

Yes.

Phased in-place cluster upgrade

Automatically upgrades manager nodes and allows you to control the upgrade order of worker nodes. This type of upgrade is more advanced than the automated in-place cluster upgrade.

Only if the failure occurs before or during manager node upgrade.

Replace existing worker nodes using blue-green deployment

This type of upgrade allows you to stand up a new cluster in parallel to the current one and switch over when the upgrade is complete. It requires that you join new worker nodes, schedule workloads to run on them, pause, drain, and remove old worker nodes in batches (rather than one at a time), and shut down servers to remove worker nodes. This is the most advanced upgrade method.

Only if the failure occurs before or during manager node upgrade.

Automated in-place cluster upgrade method

Automated in-place cluster upgrade is the standard method for upgrading MKE. It updates all MKE components on all nodes within the MKE cluster one-by-one until the upgrade is complete, and thus it is not ideal for those who need to upgrade their worker nodes in a particular order.

  1. Verify that all MCR instances have been upgraded to the corresponding new version.

  2. SSH into one MKE manager node and run the following command (do not run this command on a workstation with a client bundle):

    docker container run --rm -it \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.7.16 \
    upgrade \
    --interactive \
    --debug
    

    The upgrade command will print messages as it automatically upgrades MKE on all nodes in the cluster.

Phased in-place cluster upgrade

The Phased in-place cluster upgrade method allows for granular control of the MKE upgrade process by first upgrading a manager node and thereafter allowing you to upgrade worker nodes manually in your preferred order. This allows you to migrate workloads and control traffic while upgrading. You can temporarily run MKE worker nodes with different versions of MKE and MCR.

This method allows you to handle failover by adding additional worker node capacity during an upgrade. You can add worker nodes to a partially-upgraded cluster, migrate workloads, and finish upgrading the remaining worker nodes.

  1. Verify that all MCR instances have been upgraded to the corresponding new version.

  2. SSH into one MKE manager node and run the following command (do not run this command on a workstation with a client bundle):

    docker container run --rm -it \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.7.16 \
    upgrade \
    --manual-worker-upgrade \
    --interactive \
    --debug
    

    The --manual-worker-upgrade flag allows MKE to upgrade only the manager nodes. It adds an upgrade-hold label to all worker nodes, which prevents MKE from upgrading each worker node until you remove the label.

  3. Optional. Join additional worker nodes to your cluster:

    docker swarm join --token SWMTKN-<swarm-token> <manager-ip>:2377
    

    For more information, refer to Join Linux nodes.

    Note

    New worker nodes will already have the newer version of MCR and MKE installed when they join the cluster.

  4. Remove the upgrade-hold label from each worker node to upgrade:

    docker node update --label-rm com.docker.ucp.upgrade-hold \
    <node-name-or-id>
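To confirm whether a worker node still carries the hold label before you remove it, you can inspect the node labels. A sketch:

# Print the node labels; look for com.docker.ucp.upgrade-hold.
docker node inspect --format '{{ json .Spec.Labels }}' <node-name-or-id>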
    
Replace existing worker nodes using blue-green deployment

The Replace existing worker nodes using blue-green deployment upgrade method creates a parallel environment for a new deployment, which reduces downtime, upgrades worker nodes without disrupting workloads, and allows you to migrate traffic to the new environment with worker node rollback capability.

Note

You do not have to replace all worker nodes in the cluster at one time, but can instead replace them in groups.

  1. Verify that all MCR instances have been upgraded to the corresponding new version.

  2. SSH into one MKE manager node and run the following command (do not run this command on a workstation with a client bundle):

    docker container run --rm -it \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.7.16 \
    upgrade \
    --manual-worker-upgrade \
    --interactive \
    --debug
    

    The --manual-worker-upgrade flag allows MKE to upgrade only the manager nodes. It adds an upgrade-hold label to all worker nodes, which prevents MKE from upgrading each worker node until the label is removed.

  3. Join additional worker nodes to your cluster:

    docker swarm join --token SWMTKN-<swarm-token> <manager-ip>:2377
    

    For more information, refer to Join Linux nodes.

    Note

    New worker nodes will already have the newer version of MCR and MKE installed when they join the cluster.

  4. Join MCR to the cluster:

    docker swarm join --token SWMTKN-<your-token> <manager-ip>:2377
    
  5. Pause all existing worker nodes to ensure that MKE does not deploy new workloads on existing nodes:

    docker node update --availability pause <node-name>
    
  6. Drain the paused nodes in preparation for migrating your workloads:

    docker node update --availability drain <node-name>
    

    Note

    MKE automatically reschedules workloads onto new nodes while existing nodes are paused.

  7. Remove each fully-drained node:

    docker swarm leave <node-name>
    
  8. From a manager node, remove each departed worker node once it becomes unresponsive:

    docker node rm <node-name>
    
  9. From any manager node, remove old MKE agents after the upgrade is complete, including s390x and Windows agents carried over from the previous install:

    docker service rm ucp-agent
    docker service rm ucp-agent-win
    docker service rm ucp-agent-s390x
    
Troubleshoot the upgrade process

This topic describes common problems and errors that occur during the upgrade process and how to identify and resolve them.


To check for multiple conflicting upgrades:

The upgrade command automatically checks for multiple ucp-worker-agents, the existence of which can indicate that the cluster is still undergoing a prior manual upgrade. You must resolve the conflicting node labels before proceeding with the upgrade.
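One way to check for lingering agent services from a manager node is shown below; this is a sketch only:

# List all worker agent services; more than one per platform suggests a
# prior upgrade is still in progress.
docker service ls --filter name=ucp-worker-agent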


To check Kubernetes errors:

For more information on anything that might have gone wrong during the upgrade process, check Kubernetes errors in node state messages after the upgrade is complete.


To circumvent the SLES 12 SP5 Calico CNI error:

Beginning with MKE 3.7.12, MKE cluster upgrades on SUSE Linux Enterprise Server 12 SP5 result in a Calico CNI Plugin Pod is Unhealthy error. You can bypass this error by manually starting cri-dockerd:

sudo systemctl start cri-dockerd-mke

Upgrade nodes to Windows Server 2022

You can upgrade your cluster to use Windows Server 2022 nodes in one of two ways. The approach that Mirantis recommends is to join nodes that have a fresh installation of Windows Server 2022, whereas the alternative is to perform an in-place upgrade of existing Windows Server 2019 nodes.

Approach #2: Upgrade existing Windows Server nodes

While it is not recommended, you can upgrade to Windows Server 2022 by performing an in-place upgrade of the existing Windows Server 2019 nodes.

Upgrade existing Windows Server nodes
  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required Windows Server 2019 node.

  3. In the upper right, select the Edit Node icon.

  4. In the Availability section, click Drain.

  5. Click Save to evict the workloads from the node.

  6. Upgrade the node from Windows Server 2019 to Windows Server 2022.

    • Windows full version nodes:

      Connect to the node and use the Windows UI to perform the upgrade. For instructions, refer to Perform an in-place upgrade of Windows Server in the Microsoft documentation.

    • Windows core version nodes:

      1. Mount the ISO for Windows Server 2022.

        If you are using a physical server, insert a drive that has the Windows Server 2022 installation media installed. Otherwise, upload the ISO to the server and mount the image.

        Note

        Windows core version users can mount the ISO in PowerShell using Mount-DiskImage -ImagePath "path".

      2. Navigate to the drive where the ISO is mounted and run setup.exe to launch the setup wizard.

      3. Follow the steps offered in the Microsoft documentation, Perform an in-place upgrade of Windows Server.

  7. Once the upgrade completes, remove all the MKE images on the node and re-pull them. Docker will automatically pull the image versions that are built for Windows Server 2022.

    Note

    To obtain the list of required images, refer to Configure the Docker daemon for Windows nodes.

  8. If ucp-worker-agent-win is not running on the node, refer to the Troubleshoot the upgrade process section below.

  9. Return to the MKE web UI.

  10. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.

  11. In the upper right, select the Edit Node icon.

  12. In the Availability section, click Active.

  13. Click Save.

Troubleshoot the upgrade process
  1. If ucp-worker-agent-win is not running on the node, use Docker Swarm to rerun the service on the node:

    docker service update ucp-worker-agent-win-x
    

    If ucp-worker-agent-win is still not running on the node, the cause may be an operating system mismatch, which can occur when registry keys fail to update during the Windows upgrade process.

    1. Review the output of the following command, looking for references to Windows Server 2019 or build number 17763:

      Get-ItemProperty "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion"
      
    2. Update any out-of-date registry keys:

      Set-Itemproperty -path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\' -Name CurrentBuildNumber -value 20348
      
  2. Return to the MKE web UI.

  3. In the left-side navigation panel, navigate to Shared Resources > Nodes and select the required node.

  4. In the upper right, select the Edit Node icon.

  5. In the Availability section, click Active.

  6. Click Save.

Migrate an MKE cluster to a new OS

MKE supports a node-replacement strategy for migrating an active cluster to any supported Linux OS.

Note

If you are running MKE on SUSE Linux Enterprise Server 12 SP5, review To circumvent the SLES 12 SP5 Calico CNI error.

Migrate manager nodes

When migrating manager nodes, Mirantis recommends that you replace one manager node at a time to preserve fault tolerance and minimize performance impact.

  1. Add a node that is running the new OS to your MKE cluster.

  2. Promote the new node to an MKE manager and wait until the node becomes healthy.

  3. Demote a manager node that is running the old OS.

  4. Remove the demoted node from the cluster.

  5. Repeat the previous steps until all manager nodes are running the new OS.
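The promotion and demotion steps map onto standard Swarm commands that you can run from any healthy manager node. The following is a sketch; node names are placeholders:

# Promote the newly joined node to manager.
docker node promote <new-node-name>

# After the new manager reports healthy, demote the old manager.
docker node demote <old-node-name>

# Once the demoted node has left the swarm (docker swarm leave, run on that
# node), remove it from the node inventory.
docker node rm <old-node-name>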

Migrate worker nodes

It is not necessary to migrate worker nodes one at a time.

  1. Add the required worker nodes that are running the new OS to your MKE cluster.

  2. Remove the worker nodes that are running the old OS.

Deploy applications with Swarm

Deploy a single-service application

This topic describes how to use both the MKE web UI and the CLI to deploy an NGINX web server and make it accessible on port 8000.


To deploy a single-service application using the MKE web UI:

  1. Log in to the MKE web UI.

  2. Navigate to Swarm > Services and click Create a service.

  3. In the Service Name field, enter nginx.

  4. In the Image Name field, enter nginx:latest.

  5. Navigate to Network > Ports and click Publish Port.

  6. In the Target port field, enter 80.

  7. In the Protocol field, enter tcp.

  8. In the Publish mode field, enter Ingress.

  9. In the Published port field, enter 8000.

  10. Click Confirm to map the ports for the NGINX service.

  11. Click Create to deploy the service into the MKE cluster.


To view the default NGINX page through the MKE web UI:

  1. Navigate to Swarm > Services.

  2. Click nginx.

  3. Click Published Endpoints.

  4. Click the link to open a new tab with the default NGINX home page.


To deploy a single service using the CLI:

  1. Verify that you have downloaded and configured the client bundle.

  2. Deploy the single-service application:

    docker service create --name nginx \
    --publish mode=ingress,target=80,published=8000 \
    --label com.docker.ucp.access.owner=<your-username> \
    nginx
    
  3. View the default NGINX page by visiting http://<node-ip>:8000.

See also

NGINX

Deploy a multi-service application

This topic describes how to use both the MKE web UI and the CLI to deploy a multi-service application for voting on whether you prefer cats or dogs.


To deploy a multi-service application using the MKE web UI:

  1. Log in to the MKE web UI.

  2. Navigate to Shared Resources > Stacks and click Create Stack.

  3. In the Name field, enter voting-app.

  4. Under ORCHESTRATOR MODE, select Swarm Services and click Next.

  5. In the Add Application File editor, paste the following application definition written in the docker-compose.yml format:

    version: "3"
    services:
    
      # A Redis key-value store to serve as message queue
      redis:
        image: redis:alpine
        ports:
          - "6379"
        networks:
          - frontend
    
      # A PostgreSQL database for persistent storage
      db:
        image: postgres:9.4
        volumes:
          - db-data:/var/lib/postgresql/data
        networks:
          - backend
    
      # Web UI for voting
      vote:
        image: dockersamples/examplevotingapp_vote:before
        ports:
          - 5000:80
        networks:
          - frontend
        depends_on:
          - redis
    
      # Web UI to count voting results
      result:
        image: dockersamples/examplevotingapp_result:before
        ports:
          - 5001:80
        networks:
          - backend
        depends_on:
          - db
    
      # Worker service to read from message queue
      worker:
        image: dockersamples/examplevotingapp_worker
        networks:
          - frontend
          - backend
    
    networks:
      frontend:
      backend:
    
    volumes:
      db-data:
    
  6. Click Create to deploy the stack.

  7. On the Shared Resources > Stacks page, verify that voting-app displays in the list, which indicates that the application is deployed.

  8. To view the individual application services, click voting-app and navigate to the Services tab.

  9. Cast votes by accessing the service on port 5000.

Caution

  • MKE does not support referencing external files when using the MKE web UI to deploy applications, and thus does not support the following keywords:

    • build

    • dockerfile

    • env_file

  • You must use a version control system to store the stack definition used to deploy the stack, as MKE does not store the stack definition.


To deploy a multi-service application using the MKE CLI:

  1. Download and configure the client bundle.

  2. Create a file named docker-compose.yml with the following content:

    version: "3"
    services:
    
      # A Redis key-value store to serve as message queue
      redis:
        image: redis:alpine
        ports:
          - "6379"
        networks:
          - frontend
    
      # A PostgreSQL database for persistent storage
      db:
        image: postgres:9.4
        volumes:
          - db-data:/var/lib/postgresql/data
        networks:
          - backend
        environment:
          - POSTGRES_PASSWORD=<password>
    
      # Web UI for voting
      vote:
        image: dockersamples/examplevotingapp_vote:before
        ports:
          - 5000:80
        networks:
          - frontend
        depends_on:
          - redis
    
      # Web UI to count voting results
      result:
        image: dockersamples/examplevotingapp_result:before
        ports:
          - 5001:80
        networks:
          - backend
        depends_on:
          - db
    
      # Worker service to read from message queue
      worker:
        image: dockersamples/examplevotingapp_worker
        networks:
          - frontend
          - backend
    
    networks:
      frontend:
      backend:
    
    volumes:
      db-data:
    
  3. Create the application, using either docker stack deploy or docker-compose:

    docker stack deploy --compose-file docker-compose.yml voting-app
      
    docker-compose --file docker-compose.yml --project-name voting-app up -d
      
  4. Verify that the application is deployed:

    docker stack ps voting-app
    
  5. Cast votes by accessing the service on port 5000.

Deploy services to a Swarm collection

This topic describes how to use both the CLI and a Compose file to deploy application resources to a particular Swarm collection. Attach the Swarm collection path to the service access label to assign the service to the required collection. MKE automatically assigns new services to the default collection unless you use either of the methods presented here to assign a different Swarm collection.

Caution

To assign services to Swarm collections, an administrator must first create the Swarm collection and grant the user access to the required collection. Otherwise the deployment will fail.

Note

If required, you can place application resources into multiple collections.


To deploy a service to a Swarm collection using the CLI:

Use docker service create to deploy your service to a collection:

docker service create \
--name <service-name> \
--label com.docker.ucp.access.label="</collection/path>" \
<app-name>:<version>

To deploy a service to a Swarm collection using a Compose file:

  1. Use a labels: dictionary in a Compose file and add the Swarm collection path to the com.docker.ucp.access.label key.

    The following example specifies two services, WordPress and MySQL, and assigns /Shared/wordpress to their access labels:

    version: '3.1'
    
    services:
    
      wordpress:
        image: wordpress
        networks:
          - wp
        ports:
          - 8080:80
        environment:
          WORDPRESS_DB_PASSWORD: example
        deploy:
          labels:
            com.docker.ucp.access.label: /Shared/wordpress
      mysql:
        image: mysql:5.7
        networks:
          - wp
        environment:
          MYSQL_ROOT_PASSWORD: example
        deploy:
          labels:
            com.docker.ucp.access.label: /Shared/wordpress
    
    networks:
      wp:
        driver: overlay
        labels:
          com.docker.ucp.access.label: /Shared/wordpress
    
  2. Log in to the MKE web UI.

  3. Navigate to the Shared Resources > Stacks and click Create Stack.

  4. Name the application wordpress.

  5. Under ORCHESTRATOR MODE, select Swarm Services and click Next.

  6. In the Add Application File editor, paste the Compose file.

  7. Click Create to deploy the application.

  8. Click Done when the deployment completes.

Note

MKE reports an error if the /Shared/wordpress collection does not exist or if you do not have a grant for accessing it.


To confirm that the service deployed to the correct Swarm collection:

  1. Navigate to Shared Resources > Stacks and select your application.

  2. Navigate to the Services tab and select the required service.

  3. On the details page, verify that the service is assigned to the correct Swarm collection.
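You can also confirm the assignment from the command line by inspecting the access label on the service. A sketch; the service name is a placeholder:

# Print the collection path assigned to the service.
docker service inspect \
--format '{{ index .Spec.Labels "com.docker.ucp.access.label" }}' \
<service-name>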

Note

MKE creates a default overlay network for your stack that attaches to each container you deploy. This works well for administrators and those assigned full control roles. If you have lesser permissions, define a custom network with the same com.docker.ucp.access.label label as your services and attach this network to each service. This correctly groups your network with the other resources in your stack.

Use secrets in Swarm deployments

This topic describes how to create and use secrets with MKE by showing you how to deploy a WordPress application that uses a secret for storing a plaintext password. Other sensitive information you might use a secret to store includes TLS certificates and private keys. MKE allows you to securely store secrets and configure who can access and manage them using role-based access control (RBAC).

The application you will create in this topic includes the following two services:

  • wordpress

    Apache, PHP, and WordPress

  • wordpress-db

    MySQL database

The following example stores a password in a secret, and the secret is stored in a file inside the container that runs the services you will deploy. The services have access to the file, but no one else can see the plaintext password. To make things simple, you will not configure the database to persist data, and thus when the service stops, the data is lost.


To create a secret:

  1. Log in to the MKE web UI.

  2. Navigate to Swarm > Secrets and click Create.

    Note

    After you create the secret, you will not be able to edit or see the secret again.

  3. Name the secret wordpress-password-v1.

  4. In the Content field, assign a value to the secret.

  5. Optional. Define a permission label so that other users can be given permission to use this secret.

    Note

    To use services and secrets together, they must either have the same permission label or no label at all.
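If you prefer the command line, you can create the same secret through a client bundle by piping in its value. A sketch:

# Create the secret from stdin; printf avoids a trailing newline in the value.
printf '%s' 'my-secret-password' | docker secret create wordpress-password-v1 -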


To create a network for your services:

  1. Navigate to Swarm > Networks and click Create.

  2. Create a network called wordpress-network with the default settings.


To create the MySQL service:

  1. Navigate to Swarm > Services and click Create.

  2. Under Service Details, name the service wordpress-db.

  3. Under Task Template, enter mysql:5.7.

  4. In the left-side menu, navigate to Network, click Attach Network +, and select wordpress-network from the drop-down.

  5. In the left-side menu, navigate to Environment, click Use Secret +, and select wordpress-password-v1 from the drop-down.

  6. Click Confirm to associate the secret with the service.

  7. Scroll down to Environment variables and click Add Environment Variable +.

  8. Enter the following string to create an environment variable that contains the path to the password file in the container:

    MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v1
    
  9. If you specified a permission label on the secret, you must set the same permission label on this service.

  10. Click Create to deploy the MySQL service.

This creates a MySQL service that is attached to the wordpress-network network and that uses the wordpress-password-v1 secret. By default, this creates a file with the same name in /run/secrets/<secret-name> inside the container running the service.

The MYSQL_ROOT_PASSWORD_FILE environment variable configures MySQL to use the content of the /run/secrets/wordpress-password-v1 file as the root password.
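
For reference, a hedged CLI sketch that is roughly equivalent to the MySQL service created in the web UI above, assuming the network and secret created earlier:

docker service create \
--name wordpress-db \
--network wordpress-network \
--secret wordpress-password-v1 \
--env MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v1 \
mysql:5.7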


To create the WordPress service:

  1. Navigate to Swarm > Services and click Create.

  2. Under Service Details, name the service wordpress.

  3. Under Task Template, enter wordpress:latest.

  4. In the left-side menu, navigate to Network, click Attach Network +, and select wordpress-network from the drop-down.

  5. In the left-side menu, navigate to Environment, click Use Secret +, and select wordpress-password-v1 from the drop-down.

  6. Click Confirm to associate the secret with the service.

  7. Scroll down to Environment variables and click Add Environment Variable +.

  8. Enter the following string to create an environment variable that contains the path to the password file in the container:

    WORDPRESS_DB_PASSWORD_FILE=/run/secrets/wordpress-password-v1
    
  9. Add another environment variable and enter the following string:

    WORDPRESS_DB_HOST=wordpress-db:3306
    
  10. If you specified a permission label on the secret, you must set the same permission label on this service.

  11. Click Create to deploy the WordPress service.

This creates a WordPress service that is attached to the same network as the MySQL service so that they can communicate, and maps port 80 of the service to port 8000 of the cluster routing mesh.

Once you deploy this service, you will be able to access it on port 8000 using the IP address of any node in your MKE cluster.
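
For reference, a hedged CLI sketch that is roughly equivalent to the WordPress service described above, including the port mapping mentioned in the preceding paragraphs:

docker service create \
--name wordpress \
--network wordpress-network \
--secret wordpress-password-v1 \
--env WORDPRESS_DB_PASSWORD_FILE=/run/secrets/wordpress-password-v1 \
--env WORDPRESS_DB_HOST=wordpress-db:3306 \
--publish 8000:80 \
wordpress:latest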


To update a secret:

If the secret is compromised, you need to change it, update the services that use it, and delete the old secret.

  1. Create a new secret named wordpress-password-v2.

  2. From Swarm > Secrets, select the wordpress-password-v1 secret to view all the services that you need to update. In this example, it is straightforward, but that will not always be the case.

  3. Update wordpress-db to use the new secret.

  4. Update the MYSQL_ROOT_PASSWORD_FILE environment variable with either of the following methods:

    • Update the environment variable directly with the following:

      MYSQL_ROOT_PASSWORD_FILE=/run/secrets/wordpress-password-v2
      
    • Mount the secret file in /run/secrets/wordpress-password-v1 by setting the Target Name field with wordpress-password-v1. This mounts the file with the wordpress-password-v2 content in /run/secrets/wordpress-password-v1.

  5. Delete the wordpress-password-v1 secret and click Update.

  6. Repeat the foregoing steps for the WordPress service.
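
As a sketch, the second method in step 4 (mounting the new secret under the old target name) can also be performed from the CLI:

docker service update \
--secret-rm wordpress-password-v1 \
--secret-add source=wordpress-password-v2,target=wordpress-password-v1 \
wordpress-db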

Interlock

Layer 7 routing

MKE includes a system for application-layer (layer 7) routing that offers both application routing and load balancing (ingress routing) for Swarm orchestration. The Interlock architecture leverages Swarm components to provide scalable layer 7 routing and Layer 4 VIP mode functionality.

Swarm mode provides MCR with a routing mesh, which enables users to access services using the IP address of any node in the swarm. Layer 7 routing enables you to access services through any node in the swarm by using a domain name, with Interlock routing the traffic to the node that runs the relevant container.

Interlock uses the Docker remote API to automatically configure extensions such as NGINX and HAProxy for application traffic. Interlock is designed for:

  • Full integration with MCR, including Swarm services, secrets, and configs

  • Enhanced configuration, including context roots, TLS, zero downtime deployment, and rollback

  • Support through extensions for external load balancers, such as NGINX, HAProxy, and F5

  • Least privilege for extensions, such that they have no Docker API access

Note

Interlock and layer 7 routing are used for Swarm deployments. Refer to NGINX Ingress Controller for information on routing traffic to your Kubernetes applications.

Terminology
Cluster

A group of compute resources running MKE

Swarm

An MKE cluster running in Swarm mode

Upstream

An upstream container that serves an application

Proxy service

A service, such as NGINX, that provides load balancing and proxying

Extension service

A secondary service that configures the proxy service

Service cluster

A combined Interlock extension and proxy service

gRPC

A high-performance RPC framework

Interlock services
Interlock

The central piece of the layer 7 routing solution. The core service is responsible for interacting with the Docker remote API and building an upstream configuration for the extensions. Interlock uses the Docker API to monitor events and to manage the extension and proxy services, and it serves the upstream configuration over a gRPC API that the extensions are configured to access.

Interlock manages extension and proxy service updates for both configuration changes and application service deployments. There is no operator intervention required.

The Interlock service starts a single replica on a manager node. The Interlock extension service runs a single replica on any available node, and the Interlock proxy service starts two replicas on any available node. Interlock prioritizes replica placement in the following order:

  • Replicas on the same worker node

  • Replicas on different worker nodes

  • Replicas on any available nodes, including managers

Interlock extension

A secondary service that queries the Interlock gRPC API for the upstream configuration. The extension service configures the proxy service according to the upstream configuration. For proxy services that use files such as NGINX or HAProxy, the extension service generates the file and sends it to Interlock using the gRPC API. Interlock then updates the corresponding Docker configuration object for the proxy service.

Interlock proxy

A proxy and load-balancing service that handles requests for the upstream application services. Interlock configures these using the data created by the corresponding extension service. By default, this service is a containerized NGINX deployment.

Features and benefits
High availability

All layer 7 routing components are failure-tolerant and leverage Docker Swarm for high availability.

Automatic configuration

Interlock uses the Docker API for automatic configuration, without needing you to manually update or restart anything to make services available. MKE monitors your services and automatically reconfigures proxy services.

Scalability

Interlock uses a modular design with a separate proxy service, allowing an operator to individually customize and scale the proxy layer to handle user requests and meet service demands, with transparency and no downtime for users.

TLS

You can leverage Docker secrets to securely manage TLS certificates and keys for your services. Interlock supports both TLS termination and TCP passthrough.

Context-based routing

Interlock supports advanced application request routing by context or path.

Host mode networking

Layer 7 routing leverages the Docker Swarm routing mesh by default, but Interlock also supports running proxy and application services in host mode networking, allowing you to bypass the routing mesh completely, thus promoting maximum application performance.

Security

The layer 7 routing components that are exposed to the outside world run on worker nodes, thus your cluster will not be affected if they are compromised.

SSL

Interlock leverages Docker secrets to securely store and use SSL certificates for services, supporting both SSL termination and TCP passthrough.

Blue-green and canary service deployment

Interlock supports blue-green service deployment, allowing an operator to deploy a new application version while the current version continues serving traffic. Once the new version is verified with live traffic, the operator can scale the older version to zero. If there is a problem, the operation is easy to reverse.

Service cluster support

Interlock supports multiple extension and proxy service combinations, allowing operators to partition load balancing resources, for example, for region- or organization-based load balancing.

Least privilege

Interlock supports being deployed where the load balancing proxies do not need to be colocated with a Swarm manager. This is a more secure approach to deployment as it ensures that the extension and proxy services do not have access to the Docker API.

Single Interlock deployment

When an application image is updated, the following actions occur:

  1. The service is updated with a new version of the application.

  2. The default “stop-first” policy stops the first replica before scheduling the second. The Interlock proxies remove ip1.0 from the backend pool as the app.1 task is removed.

  3. The first application task is rescheduled with the new image after the first task stops.

  4. The interlock proxy.1 is then rescheduled with the new NGINX configuration that contains the update for the new app.1 task.

  5. After proxy.1 is complete, proxy.2 redeploys with the updated NGINX configuration for the app.1 task.

  6. In this scenario, the amount of time that the service is unavailable is less than 30 seconds.

Optimizing Interlock for applications
Application update order

Swarm provides control over the order in which old tasks are removed while new ones are created. This is controlled on the service-level with --update-order.

  • stop-first (default) - Configures the currently updating task to stop before the new task is scheduled.

  • start-first - Configures the current task to stop only after the new task has been scheduled and started. This guarantees that the new task is running before the old task has shut down.

Use start-first if …

  • You have a single application replica and you cannot have service interruption. Both the old and new tasks run simultaneously during the update, but this ensures that there is no gap in service during the update.

Use stop-first if …

  • Old and new tasks of your service cannot serve clients simultaneously.

  • You do not have enough cluster resources to run old and new replicas simultaneously.

In most cases, start-first is the best choice because it optimizes for high availability during updates.
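
For example, a minimal sketch of setting the update order on an existing service, here assumed to be named demo:

docker service update --update-order start-first demo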

Application update delay

Swarm services use update-delay to control the speed at which a service is updated. This adds a timed delay between application tasks as they are updated. The delay controls the time between when the first task of a service transitions to a healthy state and when the second task begins its update. The default is 0 seconds, which means that a replica task begins updating as soon as the previously updated task transitions into a healthy state.
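
For example, a minimal sketch of adding a 30-second delay between task updates on a service assumed to be named demo:

docker service update --update-delay 30s demo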

Use update-delay if …

  • You are optimizing for the least number of dropped connections and a longer update cycle as an acceptable tradeoff.

  • Interlock update convergence takes a long time in your environment, which can occur when there is a large number of overlay networks.

Do not use update-delay if …

  • Service updates must occur rapidly.

  • Old and new tasks of your service cannot serve clients simultaneously.

Use application health checks

Swarm uses application health checks extensively to ensure that its updates do not cause service interruption. health-cmd can be configured in a Dockerfile or compose file to define a method for health checking an application. Without health checks, Swarm cannot determine when an application is truly ready to service traffic and will mark it as healthy as soon as the container process is running. This can potentially send traffic to an application before it is capable of serving clients, leading to dropped connections.
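
For example, a hedged sketch of adding a health check to a running service, assuming the service is named demo, the container image includes curl, and the application answers on /ping at port 8080 as with the demo image used elsewhere in this guide:

docker service update \
--health-cmd "curl -f http://localhost:8080/ping || exit 1" \
--health-interval 10s \
--health-retries 3 \
--health-start-period 15s \
demo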

Application stop grace period

Use stop-grace-period to configure the maximum amount of time that a task can continue to run after its shutdown cycle has been initiated, before it is force-killed (default: 10 seconds). In short, under the default setting a task can continue to run for no more than 10 seconds once its shutdown cycle has begun. Extending this period benefits applications that require long periods to process requests, allowing connections to terminate normally.
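
For example, a minimal sketch of extending the grace period to 60 seconds on a service assumed to be named demo:

docker service update --stop-grace-period 60s demo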

Interlock optimizations
Use service clusters for Interlock segmentation

Interlock service clusters allow Interlock to be segmented into multiple logical instances called “service clusters”, which have independently managed proxies. Application traffic only uses the proxies for a specific service cluster, allowing the full segmentation of traffic. Each service cluster only connects to the networks using that specific service cluster, which reduces the number of overlay networks to which proxies connect. Because service clusters also deploy separate proxies, this also reduces the amount of churn in LB configs when there are service updates.

Minimizing number of overlay networks

Interlock proxy containers connect to the overlay network of every Swarm service. Having many networks connected to Interlock adds incremental delay when Interlock updates its load balancer configuration. Each network connected to Interlock generally adds 1-2 seconds of update delay. With many networks, the Interlock update delay causes the LB config to be out of date for too long, which can cause traffic to be dropped.

Minimizing the number of overlay networks that Interlock connects to can be accomplished in two ways:

  • Reduce the number of networks. If the architecture permits it, applications can be grouped together to use the same networks.

  • Use Interlock service clusters. By segmenting Interlock, service clusters also segment which networks are connected to Interlock, reducing the number of networks to which each proxy is connected.

  • Use admin-defined networks and limit the number of networks per service cluster.

Use Interlock VIP Mode

VIP mode can be used to reduce the impact of application updates on the Interlock proxies. It utilizes the Swarm L4 load balancing VIPs instead of individual task IPs to load balance traffic to a more stable internal endpoint. This prevents the proxy LB configurations from changing for most kinds of application service updates, reducing churn for Interlock. The following features are not supported in VIP mode:

  • Sticky sessions

  • Websockets

  • Canary deployments

The following features are supported in VIP mode:

  • Host & context routing

  • Context root rewrites

  • Interlock TLS termination

  • TLS passthrough

  • Service clusters
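
Per-service VIP mode is selected with the com.docker.lb.backend_mode label, which also appears later in Create a proxy service. The following is a minimal sketch that reuses the demo image, network, and labels from the routing examples later in this guide:

docker service create \
--name demo \
--network demo \
--label com.docker.lb.hosts=demo.local \
--label com.docker.lb.port=8080 \
--label com.docker.lb.backend_mode=vip \
mirantiseng/docker-demo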

See also

NGINX

Deploy
Deploy a layer 7 routing solution

This topic describes how to route traffic to Swarm services by deploying a layer 7 routing solution into a Swarm-orchestrated cluster. It has the following prerequisites:


Enabling layer 7 routing causes the following to occur:

  1. MKE creates the ucp-interlock overlay network.

  2. MKE deploys the ucp-interlock service and attaches it both to the Docker socket and to the overlay network that was created. This allows the Interlock service to use the Docker API, which is why this service needs to run on a manager node.

  3. The ucp-interlock service starts the ucp-interlock-extension service and attaches it to the ucp-interlock network, allowing both services to communicate.

  4. The ucp-interlock-extension generates a configuration for the proxy service to use. By default the proxy service is NGINX, so this service generates a standard NGINX configuration. MKE creates the com.docker.ucp.interlock.conf-1 configuration file and uses it to configure all the internal components of this service.

  5. The ucp-interlock service takes the proxy configuration and uses it to start the ucp-interlock-proxy service.

Note

Layer 7 routing is disabled by default.


To enable layer 7 routing using the MKE web UI:

  1. Log in to the MKE web UI as an administrator.

  2. Navigate to <user-name> > Admin Settings.

  3. Click Ingress.

  4. Toggle the Swarm HTTP ingress slider to the right.

  5. Optional. By default, the routing mesh service listens on port 8080 for HTTP and 8443 for HTTPS. Change these ports if you already have services using them.

The three primary Interlock services are the core service, the extensions, and the proxy. The following is the default MKE configuration, which is created automatically when you enable Interlock as described in this topic.

ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
AllowInsecure = false
PollInterval = "3s"

[Extensions]
  [Extensions.default]
    Image = "mirantis/ucp-interlock-extension:3.7.16"
    ServiceName = "ucp-interlock-extension"
    Args = []
    Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
    ProxyImage = "mirantis/ucp-interlock-proxy:3.7.16"
    ProxyServiceName = "ucp-interlock-proxy"
    ProxyConfigPath = "/etc/nginx/nginx.conf"
    ProxyReplicas = 2
    ProxyStopSignal = "SIGQUIT"
    ProxyStopGracePeriod = "5s"
    ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
    PublishMode = "ingress"
    PublishedPort = 8080
    TargetPort = 80
    PublishedSSLPort = 8443
    TargetSSLPort = 443
    [Extensions.default.Labels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.ContainerLabels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.ProxyLabels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.ProxyContainerLabels]
      "com.docker.ucp.InstanceID" = "fewho8k85kyc6iqypvvdh3ntm"
    [Extensions.default.Config]
      Version = ""
      User = "nginx"
      PidPath = "/var/run/proxy.pid"
      MaxConnections = 1024
      ConnectTimeout = 5
      SendTimeout = 600
      ReadTimeout = 600
      IPHash = false
      AdminUser = ""
      AdminPass = ""
      SSLOpts = ""
      SSLDefaultDHParam = 1024
      SSLDefaultDHParamPath = ""
      SSLVerify = "required"
      WorkerProcesses = 1
      RLimitNoFile = 65535
      SSLCiphers = "HIGH:!aNULL:!MD5"
      SSLProtocols = "TLSv1.2"
      AccessLogPath = "/dev/stdout"
      ErrorLogPath = "/dev/stdout"
      MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t    '$status $body_bytes_sent \"$http_referer\" '\n\t\t    '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
      TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t    '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t    '\"$http_x_forwarded_for\" $request_id $msec $request_time '\n\t\t    '$upstream_connect_time $upstream_header_time $upstream_response_time';"
      KeepaliveTimeout = "75s"
      ClientMaxBodySize = "32m"
      ClientBodyBufferSize = "8k"
      ClientHeaderBufferSize = "1k"
      LargeClientHeaderBuffers = "4 8k"
      ClientBodyTimeout = "60s"
      UnderscoresInHeaders = false
      HideInfoHeaders = false

Note

The value of LargeClientHeaderBuffers indicates the number of buffers to use to read a large client request header, as well as the size of those buffers.


To enable layer 7 routing from the command line:

Interlock uses a TOML file for the core service configuration. The following example uses Swarm deployment and recovery features by creating a Docker config object.

  1. Create a Docker config object:

    cat << EOF | docker config create service.interlock.conf -
    ListenAddr = ":8080"
    DockerURL = "unix:///var/run/docker.sock"
    PollInterval = "3s"
    
    [Extensions]
      [Extensions.default]
        Image = "mirantis/ucp-interlock-extension:3.7.16"
        Args = ["-D"]
        ProxyImage = "mirantis/ucp-interlock-proxy:3.7.16"
        ProxyArgs = []
        ProxyConfigPath = "/etc/nginx/nginx.conf"
        ProxyReplicas = 1
        ProxyStopGracePeriod = "3s"
        ServiceCluster = ""
        PublishMode = "ingress"
        PublishedPort = 8080
        TargetPort = 80
        PublishedSSLPort = 8443
        TargetSSLPort = 443
        [Extensions.default.Config]
          User = "nginx"
          PidPath = "/var/run/proxy.pid"
          WorkerProcesses = 1
          RlimitNoFile = 65535
          MaxConnections = 2048
    EOF
    oqkvv1asncf6p2axhx41vylgt
    
  2. Create a dedicated network for Interlock and the extensions:

    docker network create --driver overlay ucp-interlock
    
  3. Create the Interlock service:

    docker service create \
    --name ucp-interlock \
    --mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
    --network ucp-interlock \
    --constraint node.role==manager \
    --config src=service.interlock.conf,target=/config.toml \
    mirantis/ucp-interlock:3.7.16 -D run -c /config.toml
    

    Note

    The Interlock core service must have access to a Swarm manager (--constraint node.role==manager); however, Mirantis recommends running the extension and proxy services on worker nodes.

  4. Verify that the three services are created, one for the Interlock service, one for the extension service, and one for the proxy service:

    docker service ls
    ID                  NAME                     MODE                REPLICAS            IMAGE                                                                PORTS
    sjpgq7h621ex        ucp-interlock            replicated          1/1                 mirantis/ucp-interlock:3.7.16
    oxjvqc6gxf91        ucp-interlock-extension  replicated          1/1                 mirantis/ucp-interlock-extension:3.7.16
    lheajcskcbby        ucp-interlock-proxy      replicated          1/1                 mirantis/ucp-interlock-proxy:3.7.16        *:80->80/tcp *:443->443/tcp
    
Configure layer 7 routing for production

This topic describes how to configure Interlock for a production environment and builds on the instructions in the previous topic, Deploy a layer 7 routing solution. It does not describe infrastructure deployment, and it assumes you are using a typical Swarm cluster created using docker swarm init and docker swarm join from the nodes.

The layer 7 solution that ships with MKE is highly available, fault tolerant, and designed to work independently of how many nodes you manage with MKE.

The following procedures require that you dedicate two worker nodes for running the ucp-interlock-proxy service. This tuning ensures the following:

  • The proxy services have dedicated resources to handle user requests. You can configure these nodes with higher performance network interfaces.

  • No application traffic can be routed to a manager node, thus making your deployment more secure.

  • If one of the two dedicated nodes fails, layer 7 routing continues working.


To dedicate two nodes to running the proxy service:

  1. Select two nodes that you will dedicate to running the proxy service.

  2. Log in to one of the Swarm manager nodes.

  3. Add labels to the two dedicated proxy service nodes, configuring them as load balancer worker nodes, for example, lb-00 and lb-01:

    docker node update --label-add nodetype=loadbalancer lb-00
    lb-00
    docker node update --label-add nodetype=loadbalancer lb-01
    lb-01
    
  4. Verify that the labels were added successfully:

    docker node inspect -f '{{ .Spec.Labels  }}' lb-00
    map[nodetype:loadbalancer]
    docker node inspect -f '{{ .Spec.Labels  }}' lb-01
    map[nodetype:loadbalancer]
    

To update the proxy service:

You must update the ucp-interlock-proxy service configuration to deploy the proxy service properly constrained to the dedicated worker nodes.

  1. From a manager node, add a constraint to the ucp-interlock-proxy service to update the running service:

    docker service update --replicas=2 \
    --constraint-add node.labels.nodetype==loadbalancer \
    --stop-signal SIGQUIT \
    --stop-grace-period=5s \
    $(docker service ls -f 'label=type=com.docker.interlock.core.proxy' -q)
    

    This updates the proxy service to have two replicas, ensures that they are constrained to the workers with the label nodetype==loadbalancer, and configures the stop signal for the tasks to be a SIGQUIT with a grace period of five seconds. This ensures that NGINX does not exit before the client request is finished.

  2. Inspect the service to verify that the replicas have started on the selected nodes:

    docker service ps $(docker service ls -f \
    'label=type=com.docker.interlock.core.proxy' -q)
    

    Example of system response:

    ID            NAME                    IMAGE          NODE     DESIRED STATE   CURRENT STATE                   ERROR   PORTS
    o21esdruwu30  interlock-proxy.1       nginx:alpine   lb-01    Running         Preparing 3 seconds ago
    n8yed2gp36o6   \_ interlock-proxy.1   nginx:alpine   mgr-01   Shutdown        Shutdown less than a second ago
    aubpjc4cnw79  interlock-proxy.2       nginx:alpine   lb-00    Running         Preparing 3 seconds ago
    
  3. Add the constraint to the ProxyConstraints array in the interlock-proxy service configuration in case Interlock is restored from backup:

    [Extensions]
      [Extensions.default]
        ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.nodetype==loadbalancer"]
    
  4. Optional. By default, the config service is global, scheduling one task on every node in the cluster. To modify constraint scheduling, update the ProxyConstraints variable in the Interlock configuration file. Refer to Configure layer 7 routing service for more information.

  5. Verify that the proxy service is running on the dedicated nodes:

    docker service ps ucp-interlock-proxy
    
  6. Update the settings in the upstream load balancer, such as ELB or F5, with the addresses of the dedicated ingress workers, thus directing all traffic to these two worker nodes.

See also

NGINX

Offline installation considerations

To install Interlock on your cluster without an Internet connection, you must have the required Docker images loaded on your computer. This topic describes how to export the required images from a local instance of MCR and then load them to your Swarm-orchestrated cluster.

To export Docker images from a local instance:

  1. Using a local instance of MCR, save the required images:

    docker save mirantis/ucp-interlock:3.7.16 > interlock.tar
    docker save mirantis/ucp-interlock-extension:3.7.16 > interlock-extension-nginx.tar
    docker save mirantis/ucp-interlock-proxy:3.7.16 > interlock-proxy-nginx.tar
    

    This saves the following three files:

    • interlock.tar - the core Interlock application.

    • interlock-extension-nginx.tar - the Interlock extension for NGINX.

    • interlock-proxy-nginx.tar - the official NGINX image based on Alpine.

    Note

    Replace mirantis/ucp-interlock-extension:3.7.16 and mirantis/ucp-interlock-proxy:3.7.16 with the corresponding extension and proxy image if you are not using NGINX.

  2. Copy the three files you just saved to each node in the cluster and load each image:

    docker load < interlock.tar
    docker load < interlock-extension-nginx.tar
    docker load < interlock-proxy-nginx.tar
    

Refer to Deploy a layer 7 routing solution to continue the installation.

See also

NGINX

Configure
Configure layer 7 routing service

This section describes how to customize layer 7 routing by updating the ucp-interlock service with a new Docker configuration, including configuration options and the procedure for creating a proxy service.

Configure the Interlock service

This topic describes how to update the ucp-interlock service with a new Docker configuration.

  1. Obtain the current configuration for the ucp-interlock service and save it as a TOML file named config.toml:

    CURRENT_CONFIG_NAME=$(docker service inspect --format \
    '{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
    ucp-interlock) && docker config inspect --format \
    '{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > config.toml
    
  2. Configure config.toml as required. Refer to Configuration file options for layer 7 routing for layer 7 routing customization options.

  3. Create a new Docker configuration object from the config.toml file:

    NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$\
    (( $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
    docker config create $NEW_CONFIG_NAME config.toml
    
  4. Verify that the configuration was successfully created:

    docker config ls --filter name=com.docker.ucp.interlock
    

    Example output:

    ID                          NAME                              CREATED          UPDATED
    vsnakyzr12z3zgh6tlo9mqekx   com.docker.ucp.interlock.conf-1   6 hours ago      6 hours ago
    64wp5yggeu2c262z6flhaos37   com.docker.ucp.interlock.conf-2   54 seconds ago   54 seconds ago
    
  5. Optional. If you provide an invalid configuration, the ucp-interlock service is configured to roll back to a previous stable configuration, by default. Configure the service to pause instead of rolling back:

    docker service update \
    --update-failure-action pause \
    ucp-interlock
    
  6. Update the ucp-interlock service to begin using the new configuration:

    docker service update \
    --config-rm $CURRENT_CONFIG_NAME \
    --config-add source=$NEW_CONFIG_NAME,target=/config.toml \
    ucp-interlock
    

Enable Interlock proxy NGINX debugging mode

Because Interlock proxy NGINX debugging mode generates copious log files and can produce core dumps, you can only enable it manually.

Caution

Mirantis strongly recommends that you use debugging mode only for as long as is necessary, and that you do not use it in production environments.

  1. Obtain the current configuration for the ucp-interlock service and save it as a TOML file named config.toml:

    CURRENT_CONFIG_NAME=$(docker service inspect --format \
    '{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
    ucp-interlock) && docker config inspect --format \
    '{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > config.toml
    
  2. Add the ProxyArgs attribute to the config.toml file, if it is not already present, and assign to it the following value:

    ProxyArgs = ["/entrypoint.sh","nginx-debug","-g","daemon off;"]
    

  3. Create a new Docker configuration object from the config.toml file:

    NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$\
    (( $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
    docker config create $NEW_CONFIG_NAME config.toml
    
  4. Update the ucp-interlock service to begin using the new configuration:

    docker service update \
    --config-rm $CURRENT_CONFIG_NAME \
    --config-add source=$NEW_CONFIG_NAME,target=/config.toml \
    ucp-interlock
    
Configuration file options for layer 7 routing

This topic describes the configuration options for the primary Interlock services.

For configuration instructions, see Configure layer 7 routing service.

Core configuration

The following core configuration options are available for the ucp-interlock service:

Option

Type

Description

ListenAddr

string

Address to serve the Interlock GRPC API. The default is :8080.

DockerURL

string

Path to the socket or TCP address to the Docker API. The default is unix:///var/run/docker.sock.

TLSCACert

string

Path to the CA certificate for connecting securely to the Docker API.

TLSCert

string

Path to the certificate for connecting securely to the Docker API.

TLSKey

string

Path to the key for connecting securely to the Docker API.

AllowInsecure

bool

A value of true skips TLS verification when connecting to the Docker API via TLS.

PollInterval

string

Interval to poll the Docker API for changes. The default is 3s.

EndpointOverride

string

Override the default GRPC API endpoint for extensions. Swarm detects the default.

Extensions

[]extension

Refer to Extension configuration for the array of extensions.

Extension configuration

The following options are available to configure the extensions. Interlock must contain at least one extension to service traffic.

Option

Type

Description

Image

string

Name of the Docker image to use for the extension.

Args

[]string

Arguments to pass to the extension service.

Labels

map[string]string

Labels to add to the extension service.

Networks

[]string

Allows the administrator to cherry-pick a list of networks that Interlock can connect to. If this option is not specified, the proxy service can connect to all networks.

ContainerLabels

map[string]string

Labels for the extension service tasks.

Constraints

[]string

One or more constraints to use when scheduling the extension service.

PlacementPreferences

[]string

One or more placement preferences.

ServiceName

string

Name of the extension service.

ProxyImage

string

Name of the Docker image to use for the proxy service.

ProxyArgs

[]string

Arguments to pass to the proxy service.

ProxyLabels

map[string]string

Labels to add to the proxy service.

ProxyContainerLabels

map[string]string

Labels to add to the proxy service tasks.

ProxyServiceName

string

Name of the proxy service.

ProxyConfigPath

string

Path in the service for the generated proxy configuration.

ProxyReplicas

uint

Number of proxy service replicas.

ProxyStopSignal

string

Stop signal for the proxy service. For example, SIGQUIT.

ProxyStopGracePeriod

string

Stop grace period for the proxy service in seconds. For example, 5s.

ProxyConstraints

[]string

One or more constraints to use when scheduling the proxy service. Set the variable to false, as it is currently set to true by default.

ProxyPlacementPreferences

[]string

One or more placement preferences to use when scheduling the proxy service.

ProxyUpdateDelay

string

Delay between rolling proxy container updates.

ServiceCluster

string

Name of the cluster that this extension serves.

PublishMode

string (ingress or host)

Publish mode that the proxy service uses.

PublishedPort

int

Port on which the proxy service serves non-SSL traffic.

PublishedSSLPort

int

Port on which the proxy service serves SSL traffic.

Template

int

Docker configuration object that is used as the extension template.

Config

config

Proxy configuration used by the extensions as described in this section.

HitlessServiceUpdate

bool

When set to true, services can be updated without restarting the proxy container.

ConfigImage

config

Name for the config service used by hitless service updates. For example, mirantis/ucp-interlock-config:3.2.1.

ConfigServiceName

config

Name of the config service. This name is equivalent to ProxyServiceName. For example, ucp-interlock-config.

Proxy configuration

Options are made available to the extensions, and each extension uses the options it needs for proxy service configuration. These options provide overrides to the extension configuration.

Because Interlock passes the extension configuration directly to the extension, each extension has different configuration options available.

The default proxy service used by MKE to provide layer 7 routing is NGINX. If users try to access a route that has not been configured, they will see the default NGINX 404 page.

You can customize this by labeling a service with com.docker.lb.default_backend=true. If users try to access a route that is not configured, they will be redirected to the custom service.

For details, see Create a proxy service.

See also

NGINX

Create a proxy service

If you want to customize the default NGINX proxy service used by MKE to provide layer 7 routing, follow the steps below to create an example proxy service where users will be redirected if they try to access a route that is not configured.

To create an example proxy service:

  1. Create a docker-compose.yml file:

    version: "3.2"
    
    services:
      demo:
        image: httpd
        deploy:
          replicas: 1
          labels:
            com.docker.lb.default_backend: "true"
            com.docker.lb.port: 80
        networks:
          - demo-network
    
    networks:
      demo-network:
        driver: overlay
    
  2. Download and configure the client bundle and deploy the service:

    docker stack deploy --compose-file docker-compose.yml demo
    

    If users try to access a route that is not configured, they are directed to this demo service.

  3. Optional. To minimize forwarding interruption to the updating service while updating a single replicated service, add the following line to the labels section of the docker-compose.yml file:

    com.docker.lb.backend_mode: "vip"
    

    And then update the existing service:

    docker stack deploy --compose-file docker-compose.yml demo
    

Refer to Use service labels for information on how to set Interlock labels on services.

Configure host mode networking

Layer 7 routing components communicate with one another by default using overlay networks, but Interlock also supports host mode networking in a variety of ways, including proxy only, Interlock only, application only, and hybrid.

When using host mode networking, you cannot use DNS service discovery, since that functionality requires overlay networking. For services to communicate, each service needs to know the IP address of the node where the other service is running.

Note

Use an alternative to DNS service discovery such as Registrator if you require this functionality.

The following is a high-level overview of how to use host mode instead of overlay networking:

  1. Update the ucp-interlock configuration.

  2. Deploy your Swarm services.

  3. Configure proxy services.

If you have not already done so, configure the layer 7 routing solution for production with the ucp-interlock-proxy service replicas running on their own dedicated nodes.

Update the ucp-interlock configuration
  1. Update the PublishMode key in the ucp-interlock service configuration so that it uses host mode networking:

    PublishMode = "host"
    
  2. Update the ucp-interlock service to use the new Docker configuration so that it starts publishing its port on the host:

    docker service update \
    --config-rm $CURRENT_CONFIG_NAME \
    --config-add source=$NEW_CONFIG_NAME,target=/config.toml \
    --publish-add mode=host,target=8080 \
    ucp-interlock
    

    The ucp-interlock and ucp-interlock-extension services are now communicating using host mode networking.

Deploy Swarm services

This section describes how to deploy an example Swarm service on an eight-node cluster using host mode networking to route traffic without using overlay networks. The cluster has three manager nodes and five worker nodes, with two workers configured as dedicated ingress cluster load balancer nodes that will receive all application traffic.

This example does not cover the actual infrastructure deployment, and assumes you have a typical Swarm cluster created using docker swarm init and docker swarm join from the nodes.

  1. Download and configure the client bundle.

  2. Deploy an example Swarm demo service that uses host mode networking:

    docker service create \
    --name demo \
    --detach=false \
    --label com.docker.lb.hosts=app.example.org \
    --label com.docker.lb.port=8080 \
    --publish mode=host,target=8080 \
    --env METADATA="demo" \
    mirantiseng/docker-demo
    

    This example allocates a high random port on the host where the service can be reached.

  3. Test that the service works:

    curl --header "Host: app.example.org" \
    http://<proxy-address>:<routing-http-port>/ping
    
    • <proxy-address> is the domain name or IP address of a node where the proxy service is running.

    • <routing-http-port> is the port used to route HTTP traffic.

    A properly working service produces a result similar to the following:

    {"instance":"63b855978452", "version":"0.1", "request_id":"d641430be9496937f2669ce6963b67d6"}
    
  4. Log in to one of the manager nodes and configure the load balancer worker nodes with node labels in order to pin the Interlock Proxy service:

    docker node update --label-add nodetype=loadbalancer lb-00
    lb-00
    docker node update --label-add nodetype=loadbalancer lb-01
    lb-01
    
  5. Verify that the labels were successfully added to each node:

    docker node inspect -f '{{ .Spec.Labels  }}' lb-00
    map[nodetype:loadbalancer]
    docker node inspect -f '{{ .Spec.Labels  }}' lb-01
    map[nodetype:loadbalancer]
    
  6. Create a configuration object for Interlock that specifies host mode networking:

    cat << EOF | docker config create service.interlock.conf -
    ListenAddr = ":8080"
    DockerURL = "unix:///var/run/docker.sock"
    PollInterval = "3s"
    
    [Extensions]
      [Extensions.default]
        Image = "mirantis/ucp-interlock-extension:3.7.16"
        Args = []
        ServiceName = "interlock-ext"
        ProxyImage = "mirantis/ucp-interlock-proxy:3.7.16"
        ProxyArgs = []
        ProxyServiceName = "interlock-proxy"
        ProxyConfigPath = "/etc/nginx/nginx.conf"
        ProxyReplicas = 1
        PublishMode = "host"
        PublishedPort = 80
        TargetPort = 80
        PublishedSSLPort = 443
        TargetSSLPort = 443
        [Extensions.default.Config]
          User = "nginx"
          PidPath = "/var/run/proxy.pid"
          WorkerProcesses = 1
          RlimitNoFile = 65535
          MaxConnections = 2048
    EOF
    oqkvv1asncf6p2axhx41vylgt
    
  7. Create the Interlock service using host mode networking:

    docker service create \
    --name interlock \
    --mount src=/var/run/docker.sock,dst=/var/run/docker.sock,type=bind \
    --constraint node.role==manager \
    --publish mode=host,target=8080 \
    --config src=service.interlock.conf,target=/config.toml \
    mirantis/ucp-interlock:3.7.16 -D run -c /config.toml
    sjpgq7h621exno6svdnsvpv9z
    
Configure proxy services

You can use node labels to reconfigure the Interlock Proxy services to be constrained to the workers.

  1. From a manager node, pin the proxy services to the load balancer worker nodes:

    docker service update \
    --constraint-add node.labels.nodetype==loadbalancer \
    interlock-proxy
    
  2. Deploy the application:

    docker service create \
    --name demo \
    --detach=false \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --publish mode=host,target=8080 \
    --env METADATA="demo" \
    mirantiseng/docker-demo
    

    This runs the service using host mode networking. Each task for the service has a high port, such as 32768, and uses the node IP address to connect.

  3. Inspect the headers from the request to verify that each task uses the node IP address to connect:

    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    

    Example of system response:

    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
    > GET /ping HTTP/1.1
    > Host: demo.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.6
    < Date: Fri, 10 Nov 2017 15:38:40 GMT
    < Content-Type: text/plain; charset=utf-8
    < Content-Length: 110
    < Connection: keep-alive
    < Set-Cookie: session=1510328320174129112; Path=/; Expires=Sat, 11 Nov 2017 15:38:40 GMT; Max-Age=86400
    < x-request-id: e4180a8fc6ee15f8d46f11df67c24a7d
    < x-proxy-id: d07b29c99f18
    < x-server-info: interlock/2.0.0-preview (17476782) linux/amd64
    < x-upstream-addr: 172.20.0.4:32768
    < x-upstream-response-time: 1510328320.172
    <
    {"instance":"897d3c7b9e9c","version":"0.1","metadata":"demo","request_id":"e4180a8fc6ee15f8d46f11df67c24a7d"}
    
Configure NGINX

By default, NGINX is used as a proxy. The following configuration options are available for the NGINX extension.

Note

The ServerNamesHashBucketSize option, which allowed the user to manually set the bucket size for the server names hash table, was removed in MKE 3.4.2 because MKE now adaptively calculates the setting and overrides any manual input.

Option

Type

Description

Defaults

User

string

User name for the proxy

nginx

PidPath

string

Path to the PID file for the proxy service

/var/run/proxy.pid

MaxConnections

int

Maximum number of connections for the proxy service

1024

ConnectTimeout

int

Timeout in seconds for clients to connect

600

SendTimeout

int

Timeout in seconds for the service to send a request to the proxied upstream

600

ReadTimeout

int

Timeout in seconds for the service to read a response from the proxied upstream

600

SSLOpts

int

Options to be passed when configuring SSL

N/A

SSLDefaultDHParam

int

Size of DH parameters

1024

SSLDefaultDHParamPath

string

Path to DH parameters file

N/A

SSLVerify

string

SSL client verification

required

WorkerProcesses

string

Number of worker processes for the proxy service

1

RLimitNoFile

int

Maximum number of open files for the proxy service

65535

SSLCiphers

string

SSL ciphers to use for the proxy service

HIGH:!aNULL:!MD5

SSLProtocols

string

Enable the specified TLS protocols

TLSv1.2

HideInfoHeaders

bool

Hide proxy-related response headers

N/A

KeepaliveTimeout

string

Connection keep-alive timeout

75s

ClientMaxBodySize

string

Maximum allowed client request body size

1m

ClientBodyBufferSize

string

Buffer size for reading client request body

8k

ClientHeaderBufferSize

string

Buffer size for reading the client request header

1k

LargeClientHeaderBuffers

string

Maximum number and size of buffers used for reading large client request header

4 8k

ClientBodyTimeout

string

Timeout for reading client request body

60s

UnderscoresInHeaders

bool

Enables or disables the use of underscores in client request header fields

false

UpstreamZoneSize

int

Size of the shared memory zone (in KB)

64

GlobalOptions

[]string

List of options that are included in the global configuration

N/A

HTTPOptions

[]string

List of options that are included in the HTTP configuration

N/A

TCPOptions

[]string

List of options that are included in the stream (TCP) configuration

N/A

AccessLogPath

string

Path to use for access logs

/dev/stdout

ErrorLogPath

string

Path to use for error logs

/dev/stdout

MainLogFormat

string

Format to use for main logger

N/A

TraceLogFormat

string

Format to use for trace logger

N/A
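
For example, a hedged sketch of overriding a few of these options in the [Extensions.default.Config] section of the Interlock configuration, with illustrative values only:

[Extensions]
  [Extensions.default]
    [Extensions.default.Config]
      ClientMaxBodySize = "64m"
      KeepaliveTimeout = "75s"
      HideInfoHeaders = true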

See also

NGINX

Tune the proxy service

This topic describes how to tune various components of the proxy service.

  • Constrain the proxy service to multiple dedicated worker nodes, for example, nodes labeled with nodetype=loadbalancer as described in Configure layer 7 routing for production:

    docker service update --replicas=2 \
    --constraint-add node.labels.nodetype==loadbalancer \
    interlock-proxy
    
    
  • Adjust the stop signal and grace period, for example, to SIGTERM for the stop signal and ten seconds for the grace period:

    docker service update --stop-signal=SIGTERM \
    --stop-grace-period=10s interlock-proxy
    
  • Change the action that Swarm takes when an update fails using update-failure-action (the default is pause), for example, to roll back to the previous configuration:

    docker service update --update-failure-action=rollback \
    interlock-proxy
    
  • Change the amount of time between proxy updates using update-delay (the default is 0s), for example, setting the delay to thirty seconds:

    docker service update --update-delay=30s interlock-proxy
    
Update Interlock services

This topic describes how to update Interlock services by first updating the Interlock configuration to specify the new extension or proxy image versions and then updating the Interlock services to use the new configuration and image.

To update Interlock services:

  1. Create the new Interlock configuration:

    docker config create service.interlock.conf.v2 <path-to-new-config>
    
  2. Remove the old configuration and specify the new configuration:

    docker service update --config-rm \
    service.interlock.conf ucp-interlock
    docker service update --config-add \
    source=service.interlock.conf.v2,target=/config.toml \
    ucp-interlock
    
  3. Update the Interlock service to use the new image, for example, to pull the latest version of MKE:

    docker pull mirantis/ucp:latest
    

    Example output:

    latest: Pulling from mirantis/ucp
    cd784148e348: Already exists
    3871e7d70c20: Already exists
    cad04e4a4815: Pull complete
    Digest: sha256:63ca6d3a6c7e94aca60e604b98fccd1295bffd1f69f3d6210031b72fc2467444
    Status: Downloaded newer image for mirantis/ucp:latest
    docker.io/mirantis/ucp:latest
    
  4. List all of the latest MKE images:

    docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp images --list
    

    Example output

    mirantis/ucp-agent:3.7.16
    mirantis/ucp-auth-store:3.7.16
    mirantis/ucp-auth:3.7.16
    mirantis/ucp-azure-ip-allocator:3.7.16
    mirantis/ucp-calico-cni:3.7.16
    mirantis/ucp-calico-kube-controllers:3.7.16
    mirantis/ucp-calico-node:3.7.16
    mirantis/ucp-cfssl:3.7.16
    mirantis/ucp-compose:3.7.16
    mirantis/ucp-controller:3.7.16
    mirantis/ucp-dsinfo:3.7.16
    mirantis/ucp-etcd:3.7.16
    mirantis/ucp-hyperkube:3.7.16
    mirantis/ucp-interlock-extension:3.7.16
    mirantis/ucp-interlock-proxy:3.7.16
    mirantis/ucp-interlock:3.7.16
    mirantis/ucp-kube-compose-api:3.7.16
    mirantis/ucp-kube-compose:3.7.16
    mirantis/ucp-kube-dns-dnsmasq-nanny:3.7.16
    mirantis/ucp-kube-dns-sidecar:3.7.16
    mirantis/ucp-kube-dns:3.7.16
    mirantis/ucp-metrics:3.7.16
    mirantis/ucp-pause:3.7.16
    mirantis/ucp-swarm:3.7.16
    mirantis/ucp:3.7.16
    
  5. Start Interlock to verify the configuration object, which has the new extension version, and deploy a rolling update on all extensions:

    docker service update \
    --image mirantis/ucp-interlock:3.7.16 \
    ucp-interlock
    
Routing traffic to services
Route traffic to a Swarm service

After Interlock is deployed, you can launch and publish services and applications. This topic describes how to configure services to publish themselves to the load balancer by using service labels.

Caution

The following procedures assume a DNS entry exists for each of the applications (or local hosts entry for local testing).


To publish a demo service with four replicas to the host (demo.local):

  1. Create a Docker Service using the following two labels:

    • com.docker.lb.hosts for Interlock to determine where the service is available.

    • com.docker.lb.port for the proxy service to determine which port to use to access the upstreams.

  2. Create an overlay network so that service traffic is isolated and secure:

    docker network create -d overlay demo
    1se1glh749q1i4pw0kf26mfx5
    
  3. Deploy the application:

    docker service create \
    --name demo \
    --network demo \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    mirantiseng/docker-demo
    6r0wiglf5f3bdpcy6zesh1pzx
    

    Interlock detects when the service is available and publishes it.

  4. After tasks are running and the proxy service is updated, the application is available through http://demo.local:

    curl -s -H "Host: demo.local" http://127.0.0.1/ping
    {"instance":"c2f1afe673d4","version":"0.1",request_id":"7bcec438af14f8875ffc3deab9215bc5"}
    
  5. To increase service capacity, use the docker service scale command:

    docker service scale demo=4
    demo scaled to 4
    

The load balancer balances traffic across all four service replicas configured in this example.


To publish a service with a web interface:

This procedure deploys a simple service that includes the following:

  • A JSON endpoint that returns the ID of the task serving the request.

  • A web interface available at http://app.example.org that shows how many tasks the service is running.


  1. Create a docker-compose.yml file that includes the following:

    version: "3.2"
    
    services:
      demo:
        image: mirantiseng/docker-demo
        deploy:
          replicas: 1
          labels:
            com.docker.lb.hosts: app.example.org
            com.docker.lb.network: demo_demo-network
            com.docker.lb.port: 8080
        networks:
          - demo-network
    
    networks:
      demo-network:
        driver: overlay
    

    Label

    Description

    com.docker.lb.hosts

    Defines the hostname for the service. When the layer 7 routing solution gets a request containing app.example.org in the host header, that request is forwarded to the demo service.

    com.docker.lb.network

    Defines which network the ucp-interlock-proxy should attach to in order to communicate with the demo service. To use layer 7 routing, you must attach your services to at least one network. If your service is attached to a single network, you do not need to add a label to specify which network to use for routing. When using a common stack file for multiple deployments leveraging MKE Interlock and layer 7 routing, prefix com.docker.lb.network with the stack name to ensure traffic is directed to the correct overlay network. In combination with com.docker.lb.ssl_passthrough, the label is mandatory even if your service is only attached to a single network.

    com.docker.lb.port

    Specifies which port the ucp-interlock-proxy service should use to communicate with this demo service. Your service does not need to expose a port in the Swarm routing mesh. All communications are done using the network that you have specified.

    The ucp-interlock service detects that your service is using these labels and automatically reconfigures the ucp-interlock-proxy service.

  2. Download and configure the client bundle and deploy the service:

    docker stack deploy --compose-file docker-compose.yml demo
    

To test your services using the CLI:

Verify that requests are routed to the demo service:

curl --header "Host: app.example.org" \
http://<mke-address>:<routing-http-port>/ping
  • <mke-address> is the domain name or IP address of an MKE node.

  • <routing-http-port> is the port used to route HTTP traffic.

Example of a successful response:

{"instance":"63b855978452", "version":"0.1", "request_id":"d641430be9496937f2669ce6963b67d6"}

To test your services using a browser:

Because the demo service exposes an HTTP endpoint, you can also use your browser to validate that it works.

  1. Verify that the /etc/hosts file in your system has an entry mapping app.example.org to the IP address of an MKE node.

  2. Navigate to http://app.example.org in your browser.

Publish a service as a canary instance

This topic describes how to publish an initial or an updated service as a canary instance.


To publish a service as a canary instance:

  1. Create an overlay network to isolate and secure service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create the initial service:

    docker service create \
    --name demo-v1 \
    --network demo \
    --detach=false \
    --replicas=4 \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --env METADATA="demo-version-1" \
    mirantiseng/docker-demo
    

    Interlock detects when the service is available and publishes it.

  3. After tasks are running and the proxy service is updated, the application is available at http://demo.local:

    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    

    Example output:

    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to demo.local (127.0.0.1) port 80 (#0)
    > GET /ping HTTP/1.1
    > Host: demo.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.6
    < Date: Wed, 08 Nov 2017 20:28:26 GMT
    < Content-Type: text/plain; charset=utf-8
    < Content-Length: 120
    < Connection: keep-alive
    < Set-Cookie: session=1510172906715624280; Path=/; Expires=Thu, 09 Nov 2017 20:28:26 GMT; Max-Age=86400
    < x-request-id: f884cf37e8331612b8e7630ad0ee4e0d
    < x-proxy-id: 5ad7c31f9f00
    < x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
    < x-upstream-addr: 10.0.2.4:8080
    < x-upstream-response-time: 1510172906.714
    <
    {"instance":"df20f55fc943","version":"0.1","metadata":"demo-version-1","request_id":"f884cf37e8331612b8e7630ad0ee4e0d"}
    

    The value of metadata is demo-version-1.


To deploy an updated service as a canary instance:

  1. Deploy an updated service as a canary instance:

    docker service create \
    --name demo-v2 \
    --network demo \
    --detach=false \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --env METADATA="demo-version-2" \
    --env VERSION="0.2" \
    mirantiseng/docker-demo
    

    Because this canary service has one replica and the initial version has four, demo-version-2 receives 20% of the application traffic:

    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    {"instance":"23d9a5ec47ef","version":"0.1","metadata":"demo-version-1","request_id":"060c609a3ab4b7d9462233488826791c"}
    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    {"instance":"f42f7f0a30f9","version":"0.1","metadata":"demo-version-1","request_id":"c848e978e10d4785ac8584347952b963"}
    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    {"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1","request_id":"724c21d0fb9d7e265821b3c95ed08b61"}
    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    {"instance":"1b0d55ed3d2f","version":"0.2","metadata":"demo-version-2","request_id":"b86ff1476842e801bf20a1b5f96cf94e"}
    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    {"instance":"c2a686ae5694","version":"0.1","metadata":"demo-version-1","request_id":"724c21d0fb9d7e265821b3c95ed08b61"}
    
  2. Optional. Increase traffic to the new version by adding more replicas. For example:

    docker service scale demo-v2=4
    

    Example output:

    demo-v2
    
  3. Complete the upgrade by scaling the demo-v1 service to zero replicas:

    docker service scale demo-v1=0
    

    Example output:

    demo-v1
    

    This routes all application traffic to the new version. If you need to roll back your service, scale the v1 service back up and the v2 service back down.
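
    For example, a rollback can be performed in a single command by scaling both services at once (the replica counts shown are illustrative):

    docker service scale demo-v1=4 demo-v2=0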

Use context or path-based routing

This topic describes how to publish a service using context or path-based routing.


  1. Create an overlay network to isolate and secure service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create the initial service:

    docker service create \
    --name demo \
    --network demo \
    --detach=false \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --label com.docker.lb.context_root=/app \
    --label com.docker.lb.context_root_rewrite=true \
    --env METADATA="demo-context-root" \
    mirantiseng/docker-demo
    

    Interlock detects when the service is available and publishes it.

    Note

    Interlock supports only one path per host for each service cluster. Once a specific com.docker.lb.hosts value is in use, it cannot be applied to another service in the same service cluster.

  3. After the tasks are running and the proxy service is updated, the application is available at http://demo.local:

    curl -vs -H "Host: demo.local" http://127.0.0.1/app/
    

    Example output:

    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
    > GET /app/ HTTP/1.1
    > Host: demo.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.6
    < Date: Fri, 17 Nov 2017 14:25:17 GMT
    < Content-Type: text/html; charset=utf-8
    < Transfer-Encoding: chunked
    < Connection: keep-alive
    < x-request-id: 077d18b67831519defca158e6f009f82
    < x-proxy-id: 77c0c37d2c46
    < x-server-info: interlock/2.0.0-dev (732c77e7) linux/amd64
    < x-upstream-addr: 10.0.1.3:8080
    < x-upstream-response-time: 1510928717.306
    
Configure a routing mode

This topic describes how to publish services using the task and VIP backend routing modes.

Routing modes

The two backend routing modes are task mode, which is the default, and VIP mode:

Task mode (default)

    Traffic routing: Interlock uses backend task IPs to route traffic from the proxy to each container. Traffic to the front-end route is layer 7 load balanced directly to service tasks. This allows for routing functionality such as sticky sessions for each container. Task routing mode applies layer 7 routing and then sends packets directly to a container.

    Canary deployments: In task mode, a canary service with one task next to an existing service with four tasks represents one out of five total tasks, so the canary will receive 20% of incoming requests.

VIP mode

    Traffic routing: Interlock uses the Swarm service VIP as the backend IP instead of using container IPs. Traffic to the front-end route is layer 7 load balanced to the Swarm service VIP, which layer 4 load balances to backend tasks. VIP mode is useful for reducing the amount of churn in Interlock proxy service configurations, which can be an advantage in highly dynamic environments.

    VIP mode optimizes for fewer proxy updates with the tradeoff of a reduced feature set. Most application updates do not require configuring backends in VIP mode. In VIP routing mode, Interlock uses the service VIP, which is a persistent endpoint that exists from service creation to service deletion, as the proxy backend. VIP routing mode applies layer 7 routing and then sends packets to the Swarm layer 4 load balancer, which routes traffic to service containers.

    Canary deployments: Because VIP mode routes by service IP rather than by task IP, it affects the behavior of canary deployments. In VIP mode, a canary service with one task next to an existing service with four tasks will receive 50% of incoming requests, as it represents one out of two total services.

Specify a routing mode

You can set each service to use either the task or the VIP backend routing mode. Task mode is the default and is used if a label is not specified or if it is set to task.

Set the routing mode to VIP
  1. Apply the following label to the service to set the routing mode to VIP (a docker service update sketch follows this procedure):

    com.docker.lb.backend_mode=vip
    
  2. Be aware that the following two types of service update still trigger a proxy reconfiguration, because they create or remove a service VIP:

    • Adding or removing a network on a service

    • Deploying or deleting a service

    Note

    The following is a non-exhaustive list of application events that do not require proxy reconfiguration in VIP mode:

    • Increasing or decreasing a service replica

    • Deploying a new image

    • Updating a configuration or secret

    • Adding or removing a label

    • Adding or removing an environment variable

    • Rescheduling a failed application task
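
For an existing service, you can apply the label with docker service update. The following is a minimal sketch that assumes a service named demo:

    docker service update \
    --detach=false \
    --label-add com.docker.lb.backend_mode=vip \
    demo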

Publish a default host service

The following example publishes a service to be a default host. The service responds whenever a request is made to an unconfigured host.

  1. Create an overlay network to isolate and secure the service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create the initial service:

    docker service create \
    --name demo-default \
    --network demo \
    --detach=false \
    --replicas=1 \
    --label com.docker.lb.default_backend=true \
    --label com.docker.lb.port=8080 \
    ehazlett/interlock-default-app
    

    Interlock detects when the service is available and publishes it. After tasks are running and the proxy service is updated, the application is available at any URL that is not configured.
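
    To spot-check the default backend, you can, for example, send a request with a host header that no other service is configured to serve, using the same placeholders as in the earlier examples. The demo-default service should answer the request:

    curl --header "Host: not-configured.example.com" \
    http://<mke-address>:<routing-http-port>/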

Publish a service using the VIP backend mode
  1. Create an overlay network to isolate and secure the service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create the initial service:

    docker service create \
    --name demo \
    --network demo \
    --detach=false \
    --replicas=4 \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --label com.docker.lb.backend_mode=vip \
    --env METADATA="demo-vip-1" \
    mirantiseng/docker-demo
    

    Interlock detects when the service is available and publishes it.

  3. After tasks are running and the proxy service is updated, the application is available at http://demo.local:

    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    

    Example output:

    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to demo.local (127.0.0.1) port 80 (#0)
    > GET /ping HTTP/1.1
    > Host: demo.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.6
    < Date: Wed, 08 Nov 2017 20:28:26 GMT
    < Content-Type: text/plain; charset=utf-8
    < Content-Length: 120
    < Connection: keep-alive
    < Set-Cookie: session=1510172906715624280; Path=/; Expires=Thu, 09 Nov 2017 20:28:26 GMT; Max-Age=86400
    < x-request-id: f884cf37e8331612b8e7630ad0ee4e0d
    < x-proxy-id: 5ad7c31f9f00
    < x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
    < x-upstream-addr: 10.0.2.9:8080
    < x-upstream-response-time: 1510172906.714
    <
    {"instance":"df20f55fc943","version":"0.1","metadata":"demo","request_id":"f884cf37e8331612b8e7630ad0ee4e0d"}
    

    Using VIP mode causes Interlock to use the virtual IPs of the service for load balancing rather than using each task IP.

  4. Inspect the service to see the VIPs, as in the following example:

    "Endpoint": {
        "Spec": {
                    "Mode": "vip"
    
        },
        "VirtualIPs": [
            {
                    "NetworkID": "jed11c1x685a1r8acirk2ylol",
                    "Addr": "10.0.2.9/24"
            }
        ]
    }
    

    In this example, Interlock configures a single upstream for the host using IP 10.0.2.9. Interlock skips further proxy updates as long as there is at least one replica for the service, as the only upstream is the VIP.
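
    To retrieve the endpoint information shown above, you can, for example, inspect the service with a Go template. This sketch assumes the service is named demo:

    docker service inspect \
    --format '{{ json .Endpoint.VirtualIPs }}' demo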

Use service labels

Interlock uses service labels to configure how applications are published, to define the host names that are routed to the service, to define the applicable ports, and to define other routing configurations.

The following occurs when you deploy or update a Swarm service with service labels:

  1. The ucp-interlock service monitors the Docker API for events and publishes the events to the ucp-interlock-extension service.

  2. The ucp-interlock-extension service generates a new configuration for the proxy service based on the labels you have added to your services.

  3. The ucp-interlock service takes the new configuration and reconfigures ucp-interlock-proxy to start using the new configuration.

This process occurs in milliseconds and does not interrupt services.
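
For example, you can observe a reconfiguration as it happens by following the ucp-interlock service logs or by listing the tasks of the proxy service while you deploy or update a labeled service. A sketch using standard Docker CLI commands:

    docker service logs --follow --tail 10 ucp-interlock
    docker service ps ucp-interlock-proxy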


The service labels that Interlock supports are as follows:

com.docker.lb.hosts

    Comma-separated list of the hosts for the service to serve.

    Example: example.com, test.com

com.docker.lb.port

    Port to use for internal upstream communication.

    Example: 8080

com.docker.lb.network

    Name of the network for the proxy service to attach to for upstream connectivity.

    Example: app-network-a

com.docker.lb.context_root

    Context or path to use for the application.

    Example: /app

com.docker.lb.context_root_rewrite

    Changes the path from the value of the com.docker.lb.context_root label to / when set to true.

    Example: true

com.docker.lb.ssl_cert

    Docker secret to use for the SSL certificate.

    Example: example.com.cert

com.docker.lb.ssl_key

    Docker secret to use for the SSL key.

    Example: example.com.key

com.docker.lb.websocket_endpoints

    Comma-separated list of endpoints to be upgraded for websockets.

    Example: /ws,/foo

com.docker.lb.service_cluster

    Name of the service cluster to use for the application.

    Example: us-east

com.docker.lb.sticky_session_cookie

    Cookie to use for sticky sessions.

    Example: app_session

com.docker.lb.redirects

    Semicolon-separated list of redirects to add in the format of <source>, <target>.

    Example: http://old.example.com, http://new.example.com

com.docker.lb.ssl_passthrough

    Enables SSL passthrough when set to true.

    Example: false

com.docker.lb.backend_mode

    Selects the backend mode that the proxy should use to access the upstreams. The default is task.

    Example: vip

Configure redirects

This topic describes how to publish a service with a redirect from old.local to new.local.

Note

Redirects do not work if a service is configured for TLS passthrough in the Interlock proxy.


  1. Create an overlay network to isolate and secure service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create the service with the redirect:

    docker service create \
    --name demo \
    --network demo \
    --detach=false \
    --label com.docker.lb.hosts=old.local,new.local \
    --label com.docker.lb.port=8080 \
    --label com.docker.lb.redirects=http://old.local,http://new.local \
    --env METADATA="demo-new" \
    mirantiseng/docker-demo
    

    Interlock detects when the service is available and publishes it.

  3. After tasks are running and the proxy service is updated, the application is available through http://new.local with a redirect configured that sends http://old.local to http://new.local:

    curl -vs -H "Host: old.local" http://127.0.0.1
    

    Example output:

    * Rebuilt URL to: http://127.0.0.1/
    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
    > GET / HTTP/1.1
    > Host: old.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 302 Moved Temporarily
    < Server: nginx/1.13.6
    < Date: Wed, 08 Nov 2017 19:06:27 GMT
    < Content-Type: text/html
    < Content-Length: 161
    < Connection: keep-alive
    < Location: http://new.local/
    < x-request-id: c4128318413b589cafb6d9ff8b2aef17
    < x-proxy-id: 48854cd435a4
    < x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
    <
    <html>
    <head><title>302 Found</title></head>
    <body bgcolor="white">
    <center><h1>302 Found</h1></center>
    <hr><center>nginx/1.13.6</center>
    </body>
    </html>
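
    To verify the redirect target directly, you can, for example, send a request with the new.local host header; the demo application should respond without a redirect:

    curl -s -H "Host: new.local" http://127.0.0.1/ping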
    
Service clusters

Reconfiguring the single proxy service that Interlock manages by default can take one to two seconds for each overlay network that the proxy manages. You can scale up to a larger number of Interlock-routed networks and services by implementing a service cluster. Service clusters use Interlock to manage multiple proxy services, each responsible for routing to a separate set of services and their corresponding networks, thereby minimizing proxy reconfiguration time.

Configure service clusters

Note

These instructions presume that the following prerequisites have been met:

  • You have an operational MKE cluster with at least two worker nodes (mke-node-0 and mke-node-1), to use as dedicated proxy servers for two independent Interlock service clusters.

  • You have enabled Interlock with 80 as an HTTP port and 8443 as an HTTPS port.


  1. From a manager node, apply node labels to the MKE workers that you have chosen to use as your proxy servers:

    docker node update --label-add nodetype=loadbalancer --label-add region=east mke-node-0
    docker node update --label-add nodetype=loadbalancer --label-add region=west mke-node-1
    

    In this example, mke-node-0 serves as the proxy for the east region and mke-node-1 serves as the proxy for the west region.

  2. Create a dedicated overlay network for each region proxy to manage traffic:

    docker network create --driver overlay eastnet
    docker network create --driver overlay westnet
    
  3. Modify the Interlock configuration to create two service clusters:

    CURRENT_CONFIG_NAME=$(docker service inspect --format '{{ \
    (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
    ucp-interlock)
    docker config inspect --format '{{ printf "%s" .Spec.Data }}' \
    $CURRENT_CONFIG_NAME > old_config.toml
    
  4. Create the following config.toml file that declares two service clusters, east and west:

    ListenAddr = ":8080"
    DockerURL = "unix:///var/run/docker.sock"
    AllowInsecure = false
    PollInterval = "3s"
    
    [Extensions]
      [Extensions.east]
        Image = "mirantis/ucp-interlock-extension:3.2.3"
        ServiceName = "ucp-interlock-extension-east"
        Args = []
        Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
        ConfigImage = "mirantis/ucp-interlock-config:3.2.3"
        ConfigServiceName = "ucp-interlock-config-east"
        ProxyImage = "mirantis/ucp-interlock-proxy:3.2.3"
        ProxyServiceName = "ucp-interlock-proxy-east"
        ServiceCluster="east"
        Networks=["eastnet"]
        ProxyConfigPath = "/etc/nginx/nginx.conf"
        ProxyReplicas = 1
        ProxyStopSignal = "SIGQUIT"
        ProxyStopGracePeriod = "5s"
        ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.region==east"]
        PublishMode = "host"
        PublishedPort = 80
        TargetPort = 80
        PublishedSSLPort = 8443
        TargetSSLPort = 443
        [Extensions.east.Labels]
          "ext_region" = "east"
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.east.ContainerLabels]
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.east.ProxyLabels]
          "proxy_region" = "east"
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.east.ProxyContainerLabels]
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.east.Config]
          Version = ""
          HTTPVersion = "1.1"
          User = "nginx"
          PidPath = "/var/run/proxy.pid"
          MaxConnections = 1024
          ConnectTimeout = 5
          SendTimeout = 600
          ReadTimeout = 600
          IPHash = false
          AdminUser = ""
          AdminPass = ""
          SSLOpts = ""
          SSLDefaultDHParam = 1024
          SSLDefaultDHParamPath = ""
          SSLVerify = "required"
          WorkerProcesses = 1
          RLimitNoFile = 65535
          SSLCiphers = "HIGH:!aNULL:!MD5"
          SSLProtocols = "TLSv1.2"
          AccessLogPath = "/dev/stdout"
          ErrorLogPath = "/dev/stdout"
          MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t    '$status $body_bytes_sent \"$http_referer\" '\n\t\t    '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
          TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t    '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t    '\"$http_x_forwarded_for\" $reqid $msec $request_time '\n\t\t    '$upstream_connect_time $upstream_header_time $upstream_response_time';"
          KeepaliveTimeout = "75s"
          ClientMaxBodySize = "32m"
          ClientBodyBufferSize = "8k"
          ClientHeaderBufferSize = "1k"
          LargeClientHeaderBuffers = "4 8k"
          ClientBodyTimeout = "60s"
          UnderscoresInHeaders = false
          UpstreamZoneSize = 64
          ServerNamesHashBucketSize = 128
          GlobalOptions = []
          HTTPOptions = []
          TCPOptions = []
          HideInfoHeaders = false
    
      [Extensions.west]
        Image = "mirantis/ucp-interlock-extension:3.2.3"
        ServiceName = "ucp-interlock-extension-west"
        Args = []
        Constraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux"]
        ConfigImage = "mirantis/ucp-interlock-config:3.2.3"
        ConfigServiceName = "ucp-interlock-config-west"
        ProxyImage = "mirantis/ucp-interlock-proxy:3.2.3"
        ProxyServiceName = "ucp-interlock-proxy-west"
        ServiceCluster="west"
        Networks=["westnet"]
        ProxyConfigPath = "/etc/nginx/nginx.conf"
        ProxyReplicas = 1
        ProxyStopSignal = "SIGQUIT"
        ProxyStopGracePeriod = "5s"
        ProxyConstraints = ["node.labels.com.docker.ucp.orchestrator.swarm==true", "node.platform.os==linux", "node.labels.region==west"]
        PublishMode = "host"
        PublishedPort = 80
        TargetPort = 80
        PublishedSSLPort = 8443
        TargetSSLPort = 443
        [Extensions.west.Labels]
          "ext_region" = "west"
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.west.ContainerLabels]
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.west.ProxyLabels]
          "proxy_region" = "west"
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.west.ProxyContainerLabels]
          "com.docker.ucp.InstanceID" = "vl5umu06ryluu66uzjcv5h1bo"
        [Extensions.west.Config]
          Version = ""
          HTTPVersion = "1.1"
          User = "nginx"
          PidPath = "/var/run/proxy.pid"
          MaxConnections = 1024
          ConnectTimeout = 5
          SendTimeout = 600
          ReadTimeout = 600
          IPHash = false
          AdminUser = ""
          AdminPass = ""
          SSLOpts = ""
          SSLDefaultDHParam = 1024
          SSLDefaultDHParamPath = ""
          SSLVerify = "required"
          WorkerProcesses = 1
          RLimitNoFile = 65535
          SSLCiphers = "HIGH:!aNULL:!MD5"
          SSLProtocols = "TLSv1.2"
          AccessLogPath = "/dev/stdout"
          ErrorLogPath = "/dev/stdout"
          MainLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" '\n\t\t    '$status $body_bytes_sent \"$http_referer\" '\n\t\t    '\"$http_user_agent\" \"$http_x_forwarded_for\"';"
          TraceLogFormat = "'$remote_addr - $remote_user [$time_local] \"$request\" $status '\n\t\t    '$body_bytes_sent \"$http_referer\" \"$http_user_agent\" '\n\t\t    '\"$http_x_forwarded_for\" $reqid $msec $request_time '\n\t\t    '$upstream_connect_time $upstream_header_time $upstream_response_time';"
          KeepaliveTimeout = "75s"
          ClientMaxBodySize = "32m"
          ClientBodyBufferSize = "8k"
          ClientHeaderBufferSize = "1k"
          LargeClientHeaderBuffers = "4 8k"
          ClientBodyTimeout = "60s"
          UnderscoresInHeaders = false
          UpstreamZoneSize = 64
          ServerNamesHashBucketSize = 128
          GlobalOptions = []
          HTTPOptions = []
          TCPOptions = []
          HideInfoHeaders = false
    

    Note

    Change all instances of the MKE version and *.ucp.InstanceID in the above to match your deployment.

  5. Optional. Modify the configuration file that Interlock creates by default:

    1. Replace [Extensions.default] with [Extensions.east].

    2. Change ServiceName to "ucp-interlock-extension-east".

    3. Change ConfigServiceName to "ucp-interlock-config-east".

    4. Change ProxyServiceName to "ucp-interlock-proxy-east".

    5. Add the "node.labels.region==east" constraint to the ProxyConstraints list.

    6. Add the ServiceCluster="east" key immediately below and inline with ProxyServiceName.

    7. Add the Networks=["eastnet"] key immediately below and inline with ServiceCluster. This list can contain as many overlay networks as you require. Interlock only connects to the specified networks and connects to them all at startup.

    8. Change PublishMode="ingress" to PublishMode="host".

    9. Change the [Extensions.default.Labels] section title to [Extensions.east.Labels].

    10. Add the "ext_region" = "east" key under the [Extensions.east.Labels] section.

    11. Change the [Extensions.default.ContainerLabels] section title to [Extensions.east.ContainerLabels].

    12. Change the [Extensions.default.ProxyLabels] section title to [Extensions.east.ProxyLabels].

    13. Add the "proxy_region" = "east" key under the [Extensions.east.ProxyLabels] section.

    14. Change the [Extensions.default.ProxyContainerLabels] section title to [Extensions.east.ProxyContainerLabels].

    15. Change the [Extensions.default.Config] section title to [Extensions.east.Config].

    16. Optional. Change ProxyReplicas=2 to ProxyReplicas=1. This is only necessary if there is a single node labeled as a proxy for each service cluster.

    17. Configure your west service cluster by duplicating the entire [Extensions.east] block and changing all instances of east to west.

  6. Create a new docker config object from the config.toml file:

    NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( \
    $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
    docker config create $NEW_CONFIG_NAME config.toml
    
  7. Update the ucp-interlock service to start using the new configuration:

    docker service update \
    --config-rm $CURRENT_CONFIG_NAME \
    --config-add source=$NEW_CONFIG_NAME,target=/config.toml \
    ucp-interlock
    
  8. View your service clusters:

    docker service ls
    

    The following two proxy services will display: ucp-interlock-proxy-east and ucp-interlock-proxy-west.

    Note

    If only one proxy service displays, delete it using docker service rm and rerun docker service ls to display the two new proxy services.
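
    A sketch of this cleanup, assuming the stale proxy service still uses the default name ucp-interlock-proxy:

    docker service rm ucp-interlock-proxy
    docker service ls --filter name=ucp-interlock-proxy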

Deploy services in separate service clusters

Note

These instructions presume that the following prerequisites have been met:

  • You have an operational MKE cluster with at least two worker nodes (mke-node-0 and mke-node-1), to use as dedicated proxy servers for two independent Interlock service clusters.

  • You have enabled Interlock with 80 as an HTTP port and 8443 as an HTTPS port.

With your service clusters configured, you can now deploy services, routing to them with your new proxy services using the service_cluster label.

  1. Create two example services:

    docker service create --name demoeast \
    --network eastnet \
    --label com.docker.lb.hosts=demo.A \
    --label com.docker.lb.port=8000 \
    --label com.docker.lb.service_cluster=east \
    training/whoami:latest
    
    docker service create --name demowest \
    --network westnet \
    --label com.docker.lb.hosts=demo.B \
    --label com.docker.lb.port=8000 \
    --label com.docker.lb.service_cluster=west \
    training/whoami:latest
    
  2. Ping your whoami service on the mke-node-0 proxy server:

    curl -H "Host: demo.A" http://<mke-node-0 public IP>
    

    The response contains the container ID of the whoami container declared by the demoeast service.

    The same curl command on mke-node-1 fails because that Interlock proxy only routes traffic to services with the service_cluster=west label, which are connected to the westnet Docker network that you listed in the configuration for that service cluster.

  3. Ping your whoami service on the mke-node-1 proxy server:

    curl -H "Host: demo.B" http://<mke-node-1 public IP>
    

    The service routed by Host: demo.B is only reachable through the Interlock proxy mapped to port 80 on mke-node-1.

Remove a service cluster

In removing a service cluster, Interlock removes all of the services that are used internally to manage the service cluster, while leaving all of the user services intact. For continued function, however, you may need to update, modify, or remove the user services that remain. For instance:

  • Any remaining user service that depends on functionality provided by the removed service cluster will need to be provisioned and managed by different means.

  • All load balancing that is managed by the service cluster will no longer be available following its removal, and thus must be reconfigured.

Following the removal of the service cluster, all ports that were previously managed by the service cluster will once again be available. Also, any manually created networks will remain in place.


To remove a service cluster:

  1. Obtain the current Interlock configuration file:

    CURRENT_CONFIG_NAME=$(docker service inspect --format '{{ \
    (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' \
    ucp-interlock)
    docker config inspect --format '{{ printf "%s" .Spec.Data }}' \
    $CURRENT_CONFIG_NAME > old_config.toml
    
  2. Open the old_config.toml file.

  3. Remove the subsection from [Extensions] that corresponds with the service cluster that you want to remove, but leave the [Extensions] section header itself in place. For example, remove the entire [Extensions.east] subsection from the config.toml file generated in Configure service clusters.

  4. Create a new docker config object from the old_config.toml file:

    NEW_CONFIG_NAME="com.docker.ucp.interlock.conf-$(( \
    $(cut -d '-' -f 2 <<< "$CURRENT_CONFIG_NAME") + 1 ))"
    docker config create $NEW_CONFIG_NAME old_config.toml
    
  5. Update the ucp-interlock service to use the new configuration:

    docker service update \
    --config-rm $CURRENT_CONFIG_NAME \
    --config-add source=$NEW_CONFIG_NAME,target=/config.toml \
    ucp-interlock
    
  6. Wait for two minutes, and then verify that Interlock has removed the services that were previously associated with the service cluster:

    docker service ls
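
    For example, if you removed the east service cluster, filtering on the Interlock services should no longer list the east proxy or extension services (a sketch; the service names follow the configuration above):

    docker service ls --filter name=ucp-interlock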
    
Use persistent sessions

This topic describes how to publish a service with a proxy that is configured for persistent sessions using either cookies or IP hashing. Persistent sessions are also known as sticky sessions.

Configure persistent sessions using cookies
  1. Create an overlay network to isolate and secure service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create a service with the persistent session cookie:

    docker service create \
    --name demo \
    --network demo \
    --detach=false \
    --replicas=5 \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.sticky_session_cookie=session \
    --label com.docker.lb.port=8080 \
    --env METADATA="demo-sticky" \
    mirantiseng/docker-demo
    

    Interlock detects when the service is available and publishes it.

  3. After tasks are running and the proxy service is updated, the application is configured to use persistent sessions and is available at http://demo.local:

    curl -vs -c cookie.txt -b cookie.txt -H "Host: demo.local" http://127.0.0.1/ping
    

    Example output:

    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
    > GET /ping HTTP/1.1
    > Host: demo.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    > Cookie: session=1510171444496686286
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.6
    < Date: Wed, 08 Nov 2017 20:04:36 GMT
    < Content-Type: text/plain; charset=utf-8
    < Content-Length: 117
    < Connection: keep-alive
    * Replaced cookie session="1510171444496686286" for domain demo.local, path /, expire 0
    < Set-Cookie: session=1510171444496686286
    < x-request-id: 3014728b429320f786728401a83246b8
    < x-proxy-id: eae36bf0a3dc
    < x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
    < x-upstream-addr: 10.0.2.5:8080
    < x-upstream-response-time: 1510171476.948
    <
    {"instance":"9c67a943ffce","version":"0.1","metadata":"demo-sticky","request_id":"3014728b429320f786728401a83246b8"}
    

    The curl command stores Set-Cookie from the application and sends it with subsequent requests, which are pinned to the same instance. If you make multiple requests, the same x-upstream-addr is present in each.
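
    For example, you can repeat the request several times with the same cookie jar and confirm that the instance field in each response stays the same (a simple shell loop):

    for i in 1 2 3; do
      curl -s -c cookie.txt -b cookie.txt \
      -H "Host: demo.local" http://127.0.0.1/ping
      echo
    done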

Configure persistent sessions using IP hashing

Using client IP hashing to configure persistent sessions is not as flexible or consistent as using cookies, but it enables workarounds for applications that cannot use cookies. To use IP hashing, you must reconfigure the Interlock proxy to use host mode networking, because the default ingress networking mode uses SNAT, which obscures client IP addresses.

  1. Create an overlay network to isolate and secure service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create a service using IP hashing:

    docker service create \
    --name demo \
    --network demo \
    --detach=false \
    --replicas=5 \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --label com.docker.lb.ip_hash=true \
    --env METADATA="demo-sticky" \
    mirantiseng/docker-demo
    

    Interlock detects when the service is available and publishes it.

  3. After tasks are running and the proxy service is updated, the application is configured to use persistent sessions and is available at http://demo.local:

    curl -vs -H "Host: demo.local" http://127.0.0.1/ping
    

    Example output:

    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
    > GET /ping HTTP/1.1
    > Host: demo.local
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.6
    < Date: Wed, 08 Nov 2017 20:04:36 GMT
    < Content-Type: text/plain; charset=utf-8
    < Content-Length: 117
    < Connection: keep-alive
    < x-request-id: 3014728b429320f786728401a83246b8
    < x-proxy-id: eae36bf0a3dc
    < x-server-info: interlock/2.0.0-development (147ff2b1) linux/amd64
    < x-upstream-addr: 10.0.2.5:8080
    < x-upstream-response-time: 1510171476.948
    <
    {"instance":"9c67a943ffce","version":"0.1","metadata":"demo-sticky","request_id":"3014728b429320f786728401a83246b8"}
    
  4. Optional. Add additional replicas:

    docker service scale demo=10
    

    Note

    When you scale the replicas, IP hashing creates a new set of upstream addresses, because the proxy uses the new set of replicas to determine where to pin the requests. Once the upstreams are determined, a new "sticky" backend is selected as the dedicated upstream.

Secure services with TLS

MKE offers you two different methods for securing your services with Transport Layer Security (TLS): proxy-managed TLS and service-managed TLS.

Proxy-managed TLS

    All traffic between users and the proxy is encrypted, but the traffic between the proxy and your Swarm service is not secure.

Service-managed TLS

    The end-to-end traffic is encrypted and the proxy service allows TLS traffic to pass through unchanged.

Proxy-managed TLS

This topic describes how to deploy a Swarm service in which the proxy manages the TLS connection. With proxy-managed TLS, the traffic between the proxy and the Swarm service is not encrypted, so use this option only if you trust that no one can monitor the traffic inside your datacenter.

To deploy a Swarm service with proxy-managed TLS:

  1. Obtain a private key and certificate for the TLS connection. The Common Name (CN) in the certificate must match the name where your service will be available. Generate a self-signed certificate for app.example.org:

    openssl req \
    -new \
    -newkey rsa:4096 \
    -days 3650 \
    -nodes \
    -x509 \
    -subj "/C=US/ST=CA/L=SF/O=Docker-demo/CN=app.example.org" \
    -keyout app.example.org.key \
    -out app.example.org.cert
    
  2. Create the following docker-compose.yml file:

    version: "3.2"
    
    services:
      demo:
        image: mirantiseng/docker-demo
        deploy:
          replicas: 1
          labels:
            com.docker.lb.hosts: app.example.org
            com.docker.lb.network: demo-network
            com.docker.lb.port: 8080
            com.docker.lb.ssl_cert: demo_app.example.org.cert
            com.docker.lb.ssl_key: demo_app.example.org.key
        environment:
          METADATA: proxy-handles-tls
        networks:
          - demo-network
    
    networks:
      demo-network:
        driver: overlay
    secrets:
      app.example.org.cert:
        file: ./app.example.org.cert
      app.example.org.key:
        file: ./app.example.org.key
    

    The demo service has labels specifying that the proxy service routes app.example.org traffic to this service. All traffic between the service and proxy occurs using the demo-network network. The service has labels that specify the Docker secrets used on the proxy service for terminating the TLS connection.

    The private key and certificate are stored as Docker secrets, and thus you can readily scale the number of replicas used for running the proxy service, with MKE distributing the secrets to the replicas.

  3. Download and configure the client bundle and deploy the service:

    docker stack deploy --compose-file docker-compose.yml demo
    
  4. Test that everything works correctly by updating your /etc/hosts file to map app.example.org to the IP address of an MKE node.

  5. Optional. In a production deployment, create a DNS entry so that users can access the service using the domain name of your choice. After creating the DNS entry, access your service at https://<hostname>:<https-port>.

    • hostname is the name you specified with the com.docker.lb.hosts label.

    • https-port is the port you configured in the MKE settings.

    Because this example uses self-signed certificates, client tools such as browsers display a warning that the connection is insecure.

  6. Optional. Test that everything works using the CLI:

    curl --insecure \
    --resolve <hostname>:<https-port>:<mke-ip-address> \
    https://<hostname>:<https-port>/ping
    

    Example output:

    {"instance":"f537436efb04","version":"0.1","request_id":"5a6a0488b20a73801aa89940b6f8c5d2"}
    

    The proxy uses SNI to determine where to route traffic, and thus you must verify that you are using a version of curl that includes the SNI header with insecure requests. Otherwise, curl displays the following error:

    Server aborted the SSL handshake
    

Note

There is no way to update expired certificates using the proxy-managed TLS method. You must create a new secret and then update the corresponding service.
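
To rotate a certificate, you can, for example, create new secrets from the renewed files and point the service labels at them, after which Interlock reconfigures the proxy. This is a minimal sketch that assumes the stack from this topic was deployed as demo; all secret and service names are illustrative:

    # Create new secrets from the renewed certificate and key:
    docker secret create app.example.org.cert.v2 app.example.org.cert
    docker secret create app.example.org.key.v2 app.example.org.key

    # Update the labels so that the proxy starts using the new secrets:
    docker service update \
    --label-add com.docker.lb.ssl_cert=app.example.org.cert.v2 \
    --label-add com.docker.lb.ssl_key=app.example.org.key.v2 \
    demo_demo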

Service-managed TLS

This topic describes how to deploy a Swarm service wherein the service manages the TLS connection by encrypting traffic from users to your Swarm service.

Deploy your Swarm service using the following example docker-compose.yml file:

version: "3.2"

services:
  demo:
    image: mirantiseng/docker-demo
    command: --tls-cert=/run/secrets/cert.pem --tls-key=/run/secrets/key.pem
    deploy:
      replicas: 1
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: demo-network
        com.docker.lb.port: 8080
        com.docker.lb.ssl_passthrough: "true"
    environment:
      METADATA: end-to-end-TLS
    networks:
      - demo-network
    secrets:
      - source: app.example.org.cert
        target: /run/secrets/cert.pem
      - source: app.example.org.key
        target: /run/secrets/key.pem

networks:
  demo-network:
    driver: overlay
secrets:
  app.example.org.cert:
    file: ./app.example.org.cert
  app.example.org.key:
    file: ./app.example.org.key

This updates the service to start using the secrets with the private key and certificate and it labels the service with com.docker.lb.ssl_passthrough: true, thus configuring the proxy service such that TLS traffic for app.example.org is passed to the service.

Since the connection is fully encrypted from end-to-end, the proxy service cannot add metadata such as version information or the request ID to the response headers.
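
To spot-check the passthrough configuration, you can reuse the earlier curl test. Because the application terminates TLS with its own self-signed certificate, --insecure is still required, and proxy-added headers such as x-proxy-id do not appear in the response:

    curl --insecure \
    --resolve app.example.org:<https-port>:<mke-ip-address> \
    https://app.example.org:<https-port>/ping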

Deploy services with mTLS enabled

Mutual Transport Layer Security (mTLS) is a process of mutual authentication in which both parties verify the identity of the other party, using a signed certificate.

You must have the following items to deploy services with mTLS:

  • One or more CA certificates for signing the server and client certificates and keys.

  • A signed certificate and key for the server

  • A signed certificate and key for the client
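
If you do not already have a client CA and a signed client certificate, you can, for example, generate them with openssl, in the same manner as the self-signed server certificate in Proxy-managed TLS. The file names below are illustrative only:

    # Create a CA key and a self-signed CA certificate:
    openssl req -new -newkey rsa:4096 -nodes -x509 -days 3650 \
    -subj "/CN=demo-client-ca" \
    -keyout client-ca.key -out app.example.org.client-ca.cert

    # Create a client key and a certificate signing request:
    openssl req -new -newkey rsa:4096 -nodes \
    -subj "/CN=demo-client" \
    -keyout client_key.pem -out client.csr

    # Sign the client certificate with the CA:
    openssl x509 -req -days 365 -in client.csr \
    -CA app.example.org.client-ca.cert -CAkey client-ca.key \
    -CAcreateserial -out client_cert.pem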


To deploy a backend service with proxy-managed mTLS enabled:

  1. Create a secret for the CA certificate that signed the client certificates. The proxy uses this CA certificate to verify clients.

  2. Modify the docker-compose.yml file produced in Proxy-managed TLS:

    1. Add the following label to the docker-compose.yml file:

      com.docker.lb.client_ca_cert: demo_app.example.org.client-ca.cert
      
    2. Add the CA certificate to the secrets: in the docker-compose.yml file:

      app.example.org.client-ca.cert:
        file: ./app.example.org.client-ca.cert
      

    The resulting docker-compose.yml file is as follows:

    version: "3.2"
    
    services:
      demo:
        image: mirantiseng/docker-demo
        deploy:
          replicas: 1
          labels:
            com.docker.lb.hosts: app.example.org
            com.docker.lb.network: demo-network
            com.docker.lb.port: 8080
            com.docker.lb.ssl_cert: demo_app.example.org.cert
            com.docker.lb.ssl_key: demo_app.example.org.key
            com.docker.lb.client_ca_cert: demo_app.example.org.client-ca.cert
        environment:
          METADATA: proxy-handles-tls
        networks:
          - demo-network
    
    networks:
      demo-network:
        driver: overlay
    secrets:
      app.example.org.cert:
        file: ./app.example.org.cert
      app.example.org.key:
        file: ./app.example.org.key
      app.example.org.client-ca.cert:
        file: ./app.example.org.client-ca.cert
    
  3. Deploy the service:

    docker stack deploy --compose-file docker-compose.yml demo
    
  4. Test the mTLS-enabled service:

    curl --insecure \
    --resolve app.example.org:<mke-https-port>:<mke-ip-address> \
    --cacert client_ca_cert.pem \
    --cert client_cert.pem \
    --key client_key.pem \
    https://app.example.org:<mke-https-port>/ping
    

    A successful deployment returns a JSON payload in plain text.

    Note

    Omitting --cacert, --cert, or --key from the cURL command returns an error message, as all three parameters are required.

Use websockets

This topic describes how to use websockets with Interlock.

  1. Create an overlay network to isolate and secure service traffic:

    docker network create -d overlay demo
    

    Example output:

    1se1glh749q1i4pw0kf26mfx5
    
  2. Create the service with websocket endpoints:

    docker service create \
    --name demo \
    --network demo \
    --detach=false \
    --label com.docker.lb.hosts=demo.local \
    --label com.docker.lb.port=8080 \
    --label com.docker.lb.websocket_endpoints=/ws \
    ehazlett/websocket-chat
    

    Interlock detects when the service is available and publishes it.

    Note

    You must have an entry for demo.local in your /etc/hosts file or use a routable domain.

  3. Once tasks are running and the proxy service is updated, the application will be available at http://demo.local. Navigate to this URL in two different browser windows and notice that the text you enter in one window displays automatically in the other.

Deploy applications with Kubernetes

Use Kubernetes on Windows Server nodes

Complete the following prerequisites before using Kubernetes on Windows Server nodes.

  1. Install MKE.

  2. Create a single-node, Linux-only cluster.

Note

Running Kubernetes on Windows Server nodes is only supported on MKE 3.3.0 and later. If you want to run Kubernetes on Windows Server nodes on a cluster that is currently running an earlier version of MKE than 3.3.0, you must perform a fresh install of MKE 3.3.0 or later.

Add Windows Server nodes
  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes and click Add Node.

  3. Under NODE TYPE, select Windows. Windows Server nodes can only be workers.

  4. Optional. Specify custom listen and advertise addresses by using the relevant slider.

  5. Copy the command generated at the bottom of the Add Node page, which includes the join-token.

    Example command:

    docker swarm join \
    --token SWMTKN-1-2is7c14ff43tq1g61ubc5egvisgilh6m8qxm6dndjzgov9qjme-4388n8bpyqivzudz4fidqm7ey \
    172.31.2.154:2377
    
  6. Add your Windows Server node to the MKE cluster by running the docker swarm join command copied in the previous step.

Validate your cluster

To validate your cluster using the MKE web UI:

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Nodes. A green circle indicates a healthy node state. All nodes should be green.

  3. Change each node orchestrator to Kubernetes:

    1. Click on the node.

    2. In the upper-right corner, click the slider icon.

    3. In the Role section of the Details tab, select Kubernetes under ORCHESTRATOR TYPE.

    4. Click Save.

    5. Repeat the above steps for each node.


To validate your cluster using the command line:

  1. View the status of all the nodes in your cluster:

    kubectl get nodes
    

    Your nodes should all have a status value of Ready, as in the following example:

    NAME                   STATUS   ROLES    AGE     VERSION
    user-135716-win-0      Ready    <none>   2m16s   v1.17.2
    user-7d985f-ubuntu-0   Ready    master   4m55s   v1.17.2-docker-d-2
    user-135716-win-1      Ready    <none>   1m12s   v1.17.2
    
  2. Change each node orchestrator to Kubernetes:

    docker node update <node name> --label-add com.docker.ucp.orchestrator.kubernetes=true
    
  3. Repeat the last step for each node.

  4. Deploy a workload on your cluster to verify that everything works as expected.
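
To confirm that the orchestrator label from the preceding steps was applied, you can, for example, inspect a node with a Go template; substitute your node name:

    docker node inspect <node name> \
    --format '{{ index .Spec.Labels "com.docker.ucp.orchestrator.kubernetes" }}'

A value of true indicates that the node is set to use the Kubernetes orchestrator.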

Troubleshoot

If you cannot join your Windows Server node to the cluster, confirm that the correct processes are running on the node.

  1. Verify that the calico-node process is operational:

    PS C:\> Get-Process calico-node
    

    Example output:

    Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
    -------  ------    -----      -----     ------     --  -- -----------
        276      17    33284      40948      39.89   8132   0 calico-node
    
  2. Verify that the kubelet process is operational:

    PS C:\> Get-Process kubelet
    

    Example output:

    Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
    -------  ------    -----      -----     ------     --  -- -----------
        524      23    47332      73380     828.50   6520   0 kubelet
    
  3. Verify that the kube-proxy process is operational:

    PS C:\> Get-Process kube-proxy
    

    Example output:

    Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
    -------  ------    -----      -----     ------     --  -- -----------
        322      19    25464      33488      21.00   7852   0 kube-proxy
    
  4. If any of the process verifications indicate a problem, review the container logs that bootstrap the Kubernetes components on the Windows node:

    docker container logs (docker container ls --filter name=ucp-kubelet-win -q)
    docker container logs (docker container ls --filter name=ucp-kube-proxy -q)
    docker container logs (docker container ls --filter name=ucp-tigera-node-win -q)
    docker container logs (docker container ls --filter name=ucp-tigera-felix-win -q)
    
Deploy a workload on Windows Server

The following procedure deploys a sample Windows web server as a Kubernetes Deployment and exposes it with a Kubernetes Service. The procedure includes the following tasks:

  • Namespace creation

  • Pod and deployment scheduling

  • Kubernetes Service provisioning

  • Application workload deployment

  • Pod, Node, and Service configuration

  1. Download and configure the client bundle.

  2. Create the following namespace file:

    demo-namespace.yaml
    apiVersion: v1
    kind: Namespace
    metadata:
      name: demo
    
  3. Create a namespace:

    kubectl create -f demo-namespace.yaml
    
  4. Create the following Windows web server file:

    win-webserver.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: win-webserver
      labels:
        app: win-webserver
      namespace: demo
    spec:
      ports:
      - port: 80
        targetPort: 80
      selector:
        app: win-webserver
      type: NodePort
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: win-webserver
      labels:
        app: win-webserver
      namespace: demo
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: win-webserver
      template:
        metadata:
          labels:
            app: win-webserver
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - win-webserver
                topologyKey: "kubernetes.io/hostname"
          containers:
          - name: windowswebserver
            image: mcr.microsoft.com/windows/servercore:ltsc2019
            command:
            - powershell.exe
            - -command
            - "<#code used from https://gist.github.com/wagnerandrade/5424431#> ; $$listener = New-Object System.Net.HttpListener ; $$listener.Prefixes.Add('http://*:80/') ; $$listener.Start() ; $$callerCounts = @{} ; Write-Host('Listening at http://*:80/') ; while ($$listener.IsListening) { ;$$context = $$listener.GetContext() ;$$requestUrl = $$context.Request.Url ;$$clientIP = $$context.Request.RemoteEndPoint.Address ;$$response = $$context.Response ;Write-Host '' ;Write-Host('> {0}' -f $$requestUrl) ;  ;$$count = 1 ;$$k=$$callerCounts.Get_Item($$clientIP) ;if ($$k -ne $$null) { $$count += $$k } ;$$callerCounts.Set_Item($$clientIP, $$count) ;$$ip=(Get-NetAdapter | Get-NetIpAddress); $$header='<html><body><H1>Windows Container Web Server</H1>' ;$$callerCountsString='' ;$$callerCounts.Keys | % { $$callerCountsString+='<p>IP {0} callerCount {1} ' -f $$ip[1].IPAddress,$$callerCounts.Item($$_) } ;$$footer='</body></html>' ;$$content='{0}{1}{2}' -f $$header,$$callerCountsString,$$footer ;Write-Output $$content ;$$buffer = [System.Text.Encoding]::UTF8.GetBytes($$content) ;$$response.ContentLength64 = $$buffer.Length ;$$response.OutputStream.Write($$buffer, 0, $$buffer.Length) ;$$response.Close() ;$$responseStatus = $$response.StatusCode ;Write-Host('< {0}' -f $$responseStatus)  } ; "
          nodeSelector:
            kubernetes.io/os: windows
    

    Note

    If the Windows nodes in your MKE cluster are Windows Server 2022, edit the image tag in the win-webserver.yaml file from ltsc2019 to ltsc2022.

  5. Create the web service:

    kubectl create -f win-webserver.yaml
    

    Expected output:

    service/win-webserver created
    deployment.apps/win-webserver created
    
  6. Verify creation of the Kubernetes Service:

    kubectl get service --namespace demo
    

    Expected output:

    NAME            TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
    win-webserver   NodePort   10.96.29.12   <none>        80:35048/TCP   12m
    
  7. Review the pods deployed on your Windows Server worker nodes. The deployment uses inter-pod anti-affinity to spread the replicas across different nodes.

    Note

    After creating the web service, it may take several minutes for the pods to enter a ready state.

    kubectl get pod --namespace demo
    

    Expected output:

    NAME                            READY   STATUS    RESTARTS   AGE
    win-webserver-8c5678c68-qggzh   1/1     Running   0          6m21s
    win-webserver-8c5678c68-v8p84   1/1     Running   0          6m21s
    
  8. Review the detailed status of pods deployed:

    kubectl describe pod win-webserver-8c5678c68-qggzh --namespace demo
    
  9. From a kubectl client, access the web service using node-to-pod communication across the network:

    kubectl get pods --namespace demo -o wide
    

    Example output:

    NAME                            READY   STATUS    RESTARTS   AGE   IP              NODE              NOMINATED NODE   READINESS GATES
    win-webserver-8c5678c68-qggzh   1/1     Running   0          16m   192.168.77.68   user-135716-win-1 <none>           <none>
    win-webserver-8c5678c68-v8p84   1/1     Running   0          16m   192.168.4.206   user-135716-win-0 <none>           <none>
    
  10. SSH into the master node:

    ssh -o ServerAliveInterval=15 root@<master-node>
    
  11. Use curl to access the web service by way of the CLUSTER-IP listed for the win-webserver service.

    curl 10.96.29.12
    

    Example output:

    <html><body><H1>Windows Container Web Server</H1><p>IP 192.168.77.68 callerCount 1 </body></html>
    
  12. Run the curl command a second time. You can see the second request load-balanced to a different pod:

    curl 10.96.29.12
    

    Example output:

    <html><body><H1>Windows Container Web Server</H1><p>IP 192.168.4.206 callerCount 1 </body></html>
    
  13. From a kubectl client, access the web service using pod-to-pod communication across the network:

    kubectl get service --namespace demo
    

    Example output:

    NAME            TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
    win-webserver   NodePort   10.96.29.12   <none>        80:35048/TCP   12m
    
  14. Review the pod status:

    kubectl get pods --namespace demo -o wide
    

    Example output:

    NAME                            READY   STATUS    RESTARTS   AGE   IP              NODE              NOMINATED NODE   READINESS GATES
    win-webserver-8c5678c68-qggzh   1/1     Running   0          16m   192.168.77.68   user-135716-win-1 <none>           <none>
    win-webserver-8c5678c68-v8p84   1/1     Running   0          16m   192.168.4.206   user-135716-win-0 <none>           <none>
    
  15. Exec into the web service:

    kubectl exec -it win-webserver-8c5678c68-qggzh --namespace demo cmd
    

    Example output:

    Microsoft Windows [Version 10.0.17763.1098]
    (c) 2018 Microsoft Corporation. All rights reserved.
    
  16. Use curl to access the web service:

    C:\>curl 10.96.29.12
    

    Example output:

    <html><body><H1>Windows Container Web Server</H1><p>IP 192.168.77.68
    callerCount 1 <p>IP 192.168.77.68 callerCount 1 </body></html>
    

Access Kubernetes resources

Using the MKE web UI left-side navigation panel, under Kubernetes, you can access the following Kubernetes resources:

  • Namespaces: Namespaces

  • Service Accounts: Service accounts

  • Controllers: Deployments, ReplicaSet, DaemonSet, StatefulSet, Job, Cronjobs

  • Load Balancers

  • Pods

  • Configurations: Secrets, ResourceQuota, NetworkSecurityPolicy, ConfigMap, LimitRange

  • Storage: PersistentVolumes, PersistentVolumeClaims, StorageClasses

Deploy a workload to a Kubernetes cluster

MKE supports using both the web UI and the CLI to deploy your Kubernetes YAML files.

Deploy a workload using the MKE web UI

This example defines a Kubernetes deployment object for an NGINX server.

Deploy an NGINX server
  1. Log in to the MKE web UI.

  2. In the left-side navigation menu, navigate to Kubernetes and click Create.

  3. In the Namespace drop-down, select default.

  4. Paste the following configuration details in the Object YAML editor:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
    

    This YAML file specifies an earlier version of NGINX, which you will update in a later section of this topic.

  5. Click Create.

  6. Navigate to Kubernetes > Namespaces, hover over the default namespace, and select Set Context.

Inspect the deployment

You can review the status of your deployment in the Kubernetes section of the left-side navigation panel.

  1. In the left-side navigation panel, navigate to Kubernetes > Controllers to review the resource controllers created for the NGINX server.

  2. Click the nginx-deployment controller.

  3. To review the values used to create the deployment, click the slider icon in the upper right corner.

  4. In the left-side navigation panel, navigate to Kubernetes > Pods to review the Pods that are provisioned for the NGINX server.

  5. Click one of the Pods.

  6. In the Overview tab, review the Pod phase, IP address, and other properties.

Expose the server

The NGINX server is operational, but it is not accessible from outside of the cluster. Create a YAML file to add a NodePort service, which exposes the server on a specified port.

  1. In the left-side navigation menu, navigate to Kubernetes and click Create.

  2. In the Namespace drop-down, select default.

  3. Paste the following configuration details in the Object YAML editor:

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      type: NodePort
      ports:
        - port: 80
          nodePort: 32768
      selector:
        app: nginx
    

    The service exposes the cluster-internal port 80 externally on node port 32768.

  4. Click Create, and the Services page opens.

  5. Select the nginx service and in the Overview tab, scroll to the Ports section.

  6. To review the default NGINX page, navigate to <node-ip>:<nodeport> in your browser.

    Note

    To display the NGINX page, you may need to add a rule in your cloud provider firewall settings to allow inbound traffic on the port specified in the YAML file.

The YAML definition connects the service to the NGINX server using the app label nginx and a corresponding label selector.
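
If you want to confirm this wiring from the command line (assuming you have downloaded and configured the client bundle), you can list the Pods that match the selector and the endpoints that the service has registered:

kubectl get pods -l app=nginx
kubectl get endpoints nginx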

Update the deployment

MKE supports updating an existing deployment by applying an updated YAML file. In this example, you will scale the server up to four replicas and update NGINX to a later version.

  1. In the left-side navigation panel, navigate to Kubernetes > Controllers and select nginx-deployment.

  2. To edit the deployment, click the gear icon in the upper right corner.

  3. Update the number of replicas from 2 to 4.

  4. Update the value of image from nginx:1.7.9 to nginx:1.8.

  5. Click Save to update the deployment with the new configuration settings.

  6. To review the newly-created replicas, in the left-side navigation panel, navigate to Kubernetes > Pods.

The content of the updated YAML file is as follows:

...
spec:
  progressDeadlineSeconds: 600
  replicas: 4
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.8
...

Deploy a workload using the CLI

MKE supports deploying your Kubernetes objects on the command line using kubectl.

Deploy an NGINX server
  1. Download and configure the client bundle.

  2. Create a file called deployment.yaml that contains the following content:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
          nodeSelector:
            kubernetes.io/os: linux
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      type: NodePort
      ports:
        - port: 80
          nodePort: 32768
      selector:
        app: nginx
    
  3. Deploy the NGINX server:

    kubectl apply -f deployment.yaml
    
  4. Use the describe deployment option to review the deployment:

    kubectl describe deployment nginx-deployment
    
Update the deployment

Update an existing deployment either by running imperative kubectl commands, as in the following steps, or by editing and re-applying its YAML file.

  1. Increase the number of replicas to 4:

    kubectl scale --replicas=4 deployment/nginx-deployment
    
  2. Update the NGINX version to 1.8:

    kubectl set image deployment/nginx-deployment nginx=nginx:1.8
    
  3. Alternatively, edit deployment.yaml so that it specifies replicas: 4 and image: nginx:1.8, and then re-apply the file:

    kubectl apply -f deployment.yaml
    
  4. Verify that the deployment was scaled up successfully by listing the deployments in the cluster:

    kubectl get deployments
    

    Expected output:

    NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment       4         4         4            4           2d
    
  5. Verify that the pods are running the updated image:

    kubectl describe deployment nginx-deployment | grep -i image
    

    Expected output:

    Image:        nginx:1.8
    

Deploy OPA Gatekeeper for policy enforcement

Mirantis currently supports the use of OPA Gatekeeper for policy enforcement.

Open Policy Agent (OPA) is an open source policy engine that facilitates policy-based control for cloud native environments. OPA introduces a high-level declarative language called Rego that decouples policy decisions from enforcement.

The OPA Constraint Framework introduces two primary resources:

  • Constraint templates: OPA policy definitions, written in Rego

  • Constraints: The application of a constraint template to a given set of objects

Gatekeeper uses the Kubernetes API to integrate OPA into Kubernetes. Policies are defined in the form of Kubernetes CustomResourceDefinitions (CRDs) and are enforced with custom admission controller webhooks. These CRDs define constraint templates and constraints on the API server. Any time a request to create, delete, or update a resource is sent to the Kubernetes cluster API server, Gatekeeper validates that resource against the predefined policies. Gatekeeper also audits preexisting resource constraint violations against newly defined policies.

Using OPA Gatekeeper, you can enforce a wide range of policies against your Kubernetes cluster. Policy examples include:

  • Container images can only be pulled from a set of whitelisted repositories.

  • New resources must be appropriately labeled.

  • Deployments must specify a minimum number of replicas.

Note

By design, when the OPA Gatekeeper is disabled using the configuration file, the policies are not cleaned up. Thus, when the OPA Gatekeeper is re-enabled, the cluster can immediately adopt the existing policies.

The retention of the policies poses no risk, as they are merely data on the API server and have no value outside of an OPA Gatekeeper deployment.

The following topics offer installation instructions and an example use case.

Install OPA Gatekeeper

You install OPA Gatekeeper by updating the MKE configuration file.

  1. Obtain the current MKE configuration file for your cluster.

  2. Set the cluster_config.policy_enforcement.gatekeeper.enabled configuration parameter to true, as illustrated in the configuration sketch at the end of this procedure. For more information on Gatekeeper configuration options, refer to cluster_config.policy_enforcement.gatekeeper.

  3. Optional. Exclude resources that are contained in a specified set of namespaces by assigning a comma-separated list of namespaces to the cluster_config.policy_enforcement.gatekeeper.excluded_namespaces configuration parameter.

    Caution

    Avoid adding namespaces to the excluded_namespaces list that do not yet exist in the cluster.

  4. Upload the newly modified MKE configuration file. Be aware that the upload requires a wait time of approximately five minutes.

  5. Verify the successful installation of Gatekeeper by running the following commands in sequence:

    1. Verify that the gatekeeper-system namespace was created:

      kubectl get ns gatekeeper-system
      

      Expected output:

      NAME                STATUS   AGE
      gatekeeper-system   Active   1m
      
    2. Verify the contents of the gatekeeper-system deployment:

      kubectl get deployment -n gatekeeper-system
      

      Expected output:

      NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
      gatekeeper-audit                1/1     1            1           1m
      gatekeeper-controller-manager   3/3     3            3           1m
      
    3. Verify that gatekeeper-webhook-service was created:

      kubectl get service -n gatekeeper-system
      

      Expected output:

      NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
      gatekeeper-webhook-service   ClusterIP   10.96.143.125   <none>        443/TCP   1m
      
    4. Verify that the correct CustomResourceDefinitions were created:

      kubectl get crd
      

      Expected output:

      NAME                                                 CREATED AT
      assign.mutations.gatekeeper.sh                       2022-08-01T06:25:12Z
      assignmetadata.mutations.gatekeeper.sh               2022-08-01T06:25:12Z
      configs.config.gatekeeper.sh                         2022-08-01T06:25:12Z
      constraintpodstatuses.status.gatekeeper.sh           2022-08-01T06:25:12Z
      constrainttemplatepodstatuses.status.gatekeeper.sh   2022-08-01T06:25:12Z
      constrainttemplates.templates.gatekeeper.sh          2022-08-01T06:25:12Z
      modifyset.mutations.gatekeeper.sh                    2022-08-01T06:25:12Z
      mutatorpodstatuses.status.gatekeeper.sh              2022-08-01T06:25:12Z
      providers.externaldata.gatekeeper.sh                 2022-08-01T06:25:12Z
      
    5. Verify exempted namespaces, if applicable:

      kubectl describe ns kube-system gatekeeper-system
      

      Expected output:

      Name:         kube-system
      Labels:       admission.gatekeeper.sh/ignore=exempted-by-mke
           kubernetes.io/metadata.name=kube-system
      Annotations:  <none>
      Status:       Active
      
      No resource quota.
      
      No LimitRange resource.
      
      
      Name:         gatekeeper-system
      Labels:       admission.gatekeeper.sh/ignore=no-self-managing
                    control-plane=controller-manager
                    gatekeeper.sh/system=yes
                    kubernetes.io/metadata.name=gatekeeper-system
      Annotations:  <none>
      Status:       Active
      
      Resource Quotas
        Name:     gatekeeper-critical-pods
        Resource  Used  Hard
        --------  ---   ---
        pods      4     100
      
      No LimitRange resource.
      
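For reference, the following is a minimal sketch of the corresponding MKE configuration file (TOML) entry, assuming the nested table layout implied by the parameter name; adjust it to the layout of your own configuration file:

[cluster_config.policy_enforcement.gatekeeper]
  enabled = true
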
Use OPA Gatekeeper

To guide you in creating OPA Gatekeeper policies, this topic uses an example that illustrates how to generate a policy for restricting escalation to root privileges.

Note

Gatekeeper provides a library of commonly used policies, including replacements for familiar PodSecurityPolicies.

Important

For users who are new to Gatekeeper, Mirantis recommends performing a dry run on potential policies prior to production deployment. Such an approach, by only auditing violations, will prevent potential cluster disruption. To perform a dry run, set spec.enforcementAction to dryrun in the constraint.yaml detailed herein.
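
For illustration, the dry-run variant of the constraint.yaml created later in this topic differs only in the added enforcementAction field:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPAllowPrivilegeEscalationContainer
metadata:
  name: psp-allow-privilege-escalation-container
spec:
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]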

  1. Create a YAML file called template.yaml and place the following code in that file:

    apiVersion: templates.gatekeeper.sh/v1
    kind: ConstraintTemplate
    metadata:
      name: k8spspallowprivilegeescalationcontainer
      annotations:
        description: >-
          Controls restricting escalation to root privileges. Corresponds to the
          `allowPrivilegeEscalation` field in a PodSecurityPolicy. For more
          information, see
          https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation
    spec:
      crd:
        spec:
          names:
            kind: K8sPSPAllowPrivilegeEscalationContainer
          validation:
            openAPIV3Schema:
              type: object
              description: >-
                Controls restricting escalation to root privileges. Corresponds to the
                `allowPrivilegeEscalation` field in a PodSecurityPolicy. For more
                information, see
                https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation
              properties:
                exemptImages:
                  description: >-
                    Any container that uses an image that matches an entry in this list will be excluded
                    from enforcement. Prefix-matching can be signified with `*`. For example: `my-image-*`.
    
                    It is recommended that users use the fully-qualified Docker image name (e.g. start with a domain name)
                    in order to avoid unexpectedly exempting images from an untrusted repository.
                  type: array
                  items:
                    type: string
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package k8spspallowprivilegeescalationcontainer
    
            import data.lib.exempt_container.is_exempt
    
            violation[{"msg": msg, "details": {}}] {
                c := input_containers[_]
                not is_exempt(c)
                input_allow_privilege_escalation(c)
                msg := sprintf("Privilege escalation container is not allowed: %v", [c.name])
            }
    
            input_allow_privilege_escalation(c) {
                not has_field(c, "securityContext")
            }
            input_allow_privilege_escalation(c) {
                not c.securityContext.allowPrivilegeEscalation == false
            }
            input_containers[c] {
                c := input.review.object.spec.containers[_]
            }
            input_containers[c] {
                c := input.review.object.spec.initContainers[_]
            }
            input_containers[c] {
                c := input.review.object.spec.ephemeralContainers[_]
            }
            # has_field returns whether an object has a field
            has_field(object, field) = true {
                object[field]
            }
          libs:
            - |
              package lib.exempt_container
    
              is_exempt(container) {
                  exempt_images := object.get(object.get(input, "parameters", {}), "exemptImages", [])
                  img := container.image
                  exemption := exempt_images[_]
                  _matches_exemption(img, exemption)
              }
    
              _matches_exemption(img, exemption) {
                  not endswith(exemption, "*")
                  exemption == img
              }
    
              _matches_exemption(img, exemption) {
                  endswith(exemption, "*")
                  prefix := trim_suffix(exemption, "*")
                  startswith(img, prefix)
              }
    
  2. Create the constraint template:

    kubectl create -f template.yaml
    

    Expected output:

    constrainttemplate.templates.gatekeeper.sh/k8spspallowprivilegeescalationcontainer created
    
  3. Create a YAML file called constraint.yaml and place the following code in that file:

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sPSPAllowPrivilegeEscalationContainer
    metadata:
      name: psp-allow-privilege-escalation-container
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
    
  4. Create the constraint:

    kubectl create -f constraint.yaml
    

    Expected output:

    k8spspallowprivilegeescalationcontainer.constraints.gatekeeper.sh/psp-allow-privilege-escalation-container created
    
  5. Create a YAML file called disallowed-pod.yaml and place the following code in that file:

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-privilege-escalation-disallowed
      labels:
        app: nginx-privilege-escalation
    spec:
      containers:
      - name: nginx
        image: nginx
        securityContext:
          allowPrivilegeEscalation: true
    
  6. Create the Pod:

    kubectl create -f disallowed-pod.yaml
    

    Expected output:

    Error from server (Forbidden): error when creating "disallowed-pod.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [psp-allow-privilege-escalation-container] Privilege escalation container is not allowed: nginx
    
  7. Create a YAML file called allowed-pod.yaml and place the following code in that file:

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-privilege-escalation-allowed
      labels:
        app: nginx-privilege-escalation
    spec:
      containers:
      - name: nginx
        image: nginx
        securityContext:
          allowPrivilegeEscalation: false
    
  8. Create the Pod:

    kubectl create -f allowed-pod.yaml
    

    Expected output:

    pod/nginx-privilege-escalation-allowed created
    

Use admission controllers for access

MKE supports using a selective grant to allow a set of user and service accounts to use privileged attributes on Kubernetes Pods. This enables non-administrator accounts to run workloads that would ordinarily require administrator or cluster-admin privileges. Such selective grants can be used to temporarily bypass restrictions on non-administrator accounts, as the changes can be reverted at any time.

The privileged attributes associated with user and service accounts are specified separately. It is only possible to specify one list of privileged attributes for user accounts and one list for service accounts.

The user accounts specified for access must be non-administrator users and the service accounts specified for access must not be bound to the cluster-admin role.

The following privileged attributes can be assigned using a selective grant:

  • hostIPC: Allows the Pod containers to share the host IPC namespace

  • hostNetwork: Allows the Pod to use the network namespace and network resources of the host node

  • hostPID: Allows the Pod containers to share the host process ID namespace

  • hostBindMounts: Allows the Pod containers to use directories and volumes mounted on the container host

  • privileged: Allows one or more Pod containers to run privileged, escalate privileges, or both

  • kernelCapabilities: Allows you to add or drop kernel capabilities on one or more Pod containers

The following Pod manifest demonstrates the use of several of the privileged attributes in a Pod:

Example Pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: ubuntu
    command:
      - sleep
      - "36000"
    imagePullPolicy: IfNotPresent
    name: busybox
    securityContext:
      capabilities:
        add:
          - NET_ADMIN
        drop:
          - CHOWN
      privileged: false
      allowPrivilegeEscalation: true

  restartPolicy: Always

To configure privileged attributes for user and service account access:

  1. Obtain the current MKE configuration file for your cluster.

  2. In the [cluster_config] section on the MKE configuration file, specify the required privileged attributes for user accounts using the priv_attributes_allowed_for_user_accounts parameter.

  3. Specify the associated user accounts with the priv_attributes_user_accounts parameter.

  4. Specify the required privileged attributes for service accounts using the priv_attributes_allowed_for_service_accounts parameter.

  5. Specify the associated service accounts with the priv_attributes_service_accounts parameter.

  6. Upload the new MKE configuration file.

Example privileged attribute specification in the MKE configuration file:

priv_attributes_allowed_for_user_accounts = ["privileged"]
priv_attributes_user_accounts = ["Abby"]
priv_attributes_allowed_for_service_accounts = ["hostBindMounts", "hostIPC"]
priv_attributes_service_accounts = ["default:sa1"]

Create a service account for a Kubernetes app

Kubernetes uses service accounts to enable workload access control. A service account is an identity for processes that run in a Pod. When a process is authenticated through a service account, it can contact the API server and access cluster resources. The default service account is default.

You provide a service account with access to cluster resources by creating a role binding, just as you do for users and teams.

This example illustrates how to create a service account and role binding used with an NGINX server.


To create a Kubernetes namespace:

Unlike user accounts, service accounts are scoped to a particular namespace, so you must create a namespace for use with your service account.

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Kubernetes > Namespaces and click Create.

  3. Leave the Namespace drop-down blank.

  4. Paste the following in the Object YAML editor:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: nginx
    
  5. Click Create.

  6. Navigate to the nginx namespace.

  7. Click the vertical ellipsis in the upper-right corner and click Set Context.


To create a service account:

  1. In the left-side navigation panel, navigate to Kubernetes > Service Accounts and click Create.

  2. In the Namespace drop-down, select nginx.

  3. Paste the following in the Object YAML editor:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: nginx-service-account
    
  4. Click Create.

There are now two service accounts associated with the nginx namespace: default and nginx-service-account.


To create a role binding:

To give the service account access to cluster resources, create a role binding with view permissions.

  1. From the left-side navigation panel, navigate to Access Control > Grants.

    Note

    If Hide Swarm Navigation is selected on the <username> > Admin Settings > Tuning page, Grants will display as Role Bindings under the Access Control menu item.

  2. In the Grants pane, select the Kubernetes tab and click Create Role Binding.

  3. In the Subject pane, under SELECT SUBJECT TYPE, select Service Account.

  4. In the Namespace drop-down, select nginx.

  5. In the Service Account drop-down, select nginx-service-account and then click Next.

  6. In the Resource Set pane, select the nginx namespace.

  7. In the Role pane, under ROLE TYPE, select Cluster Role and then select view.

  8. Click Create.

The NGINX service account can now access all cluster resources in the nginx namespace.
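
To spot-check the binding from the command line (assuming a configured client bundle and an account that is permitted to impersonate service accounts), you can query the Kubernetes authorization API with kubectl auth can-i; a view-only binding is expected to answer yes for reads and no for writes:

kubectl auth can-i list pods --namespace nginx --as system:serviceaccount:nginx:nginx-service-account
kubectl auth can-i create pods --namespace nginx --as system:serviceaccount:nginx:nginx-service-account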

Install an unmanaged CNI plugin

Calico provides MKE with secure networking for container-to-container communication within Kubernetes. MKE manages the Calico lifecycle, packaging it at both installation and upgrade time, and fully supports its use.

MKE also supports the use of alternative, unmanaged CNI plugins available on Docker Hub. Mirantis can provide limited instruction on basic configuration, but for detailed guidance on third-party CNI components, you must refer to the external product documentation or support.

Consider the following limitations before implementing an unmanaged CNI plugin:

  • MKE only supports implementation of an unmanaged CNI plugin at install time.

  • MKE does not manage the version or configuration of alternative CNI plugins.

  • MKE does not upgrade or reconfigure alternative CNI plugins. To switch from the managed CNI to an unmanaged CNI plugin, or vice versa, you must uninstall and then reinstall MKE.

Install an unmanaged CNI plugin on MKE
  1. Verify that your system meets all MKE requirements and third-party CNI plugin requirements.

  2. Install MKE with the --unmanaged-cni flag:

    docker container run --rm -it --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.7.16 install \
      --host-address <node-ip-address> \
      --unmanaged-cni \
      --interactive
    

    MKE components that require Kubernetes networking will remain in the ContainerCreating state in Kubernetes until a CNI plugin is installed. Once the installation is complete, you can access MKE from a web browser. Note that the manager node will be unhealthy, as the kubelet will report NetworkPluginNotReady, and the metrics in the MKE dashboard will be unavailable, as the metrics service runs in a Kubernetes Pod.

  3. Download and configure the client bundle.

  4. Review the status of the MKE components that run on Kubernetes:

    kubectl get nodes
    

    Example output:

    NAME         STATUS     ROLES     AGE       VERSION
    manager-01   NotReady   master    10m       v1.11.9-docker-1
    
    kubectl get pods -n kube-system -o wide
    

    Example output:

    NAME                           READY     STATUS              RESTARTS   AGE       IP        NODE         NOMINATED NODE
    compose-565f7cf9ff-gq2gv       0/1       Pending             0          10m       <none>    <none>       <none>
    compose-api-574d64f46f-r4c5g   0/1       Pending             0          10m       <none>    <none>       <none>
    kube-dns-6d96c4d9c6-8jzv7      0/3       Pending             0          10m       <none>    <none>       <none>
    ucp-metrics-nwt2z              0/3       ContainerCreating   0          10m       <none>    manager-01   <none>
    
  5. Install the unmanaged CNI plugin. Follow the CNI plugin documentation for specific installation instructions. The unmanaged CNI plugin install steps typically include:

    1. Download the relevant upstream CNI binaries.

    2. Place the CNI binaries in /opt/cni/bin.

    3. Download the relevant CNI plugin Kubernetes Manifest YAML file.

    4. Run kubectl apply -f <your-custom-cni-plugin>.yaml.

    Caution

    You must install the unmanaged CNI immediately after installing MKE and before joining any manager or worker nodes to the cluster.

    Note

    While troubleshooting a custom CNI plugin, you may want to access logs within the kubelet. Connect to an MKE manager node and run docker logs ucp-kubelet.

Verify the MKE installation

Upon successful installation of the CNI plugin, the relevant MKE components will have a Running status once the pods have become available.

To review the status of the Kubernetes components:

kubectl get pods -n kube-system -o wide

Example output:

NAME                           READY     STATUS    RESTARTS   AGE       IP            NODE         NOMINATED NODE
compose-565f7cf9ff-gq2gv       1/1       Running   0          21m       10.32.0.2     manager-01   <none>
compose-api-574d64f46f-r4c5g   1/1       Running   0          21m       10.32.0.3     manager-01   <none>
kube-dns-6d96c4d9c6-8jzv7      3/3       Running   0          22m       10.32.0.5     manager-01   <none>
ucp-metrics-nwt2z              3/3       Running   0          22m       10.32.0.4     manager-01   <none>
weave-net-wgvcd                2/2       Running   0          8m        172.31.6.95   manager-01   <none>

Note

Weave Net serves as the CNI plugin for the above example. If you are using an alternative CNI plugin, verify its status in the output.

Enable an unmanaged CNI for Windows Server nodes

When MKE is installed with --unmanaged-cni, the ucp-kube-proxy-win container on Windows nodes will not fully start, but will instead log the following suggestion in a loop:

example : [System.Environment]::SetEnvironmentVariable("CNINetworkName", "ElangoNet", [System.EnvironmentVariableTarget]::Machine)
example : [System.Environment]::SetEnvironmentVariable("CNISourceVip", "192.32.31.1", [System.EnvironmentVariableTarget]::Machine)

This occurs because kube-proxy requires more information to program routes for Kubernetes services.


To enable an unmanaged CNI for Windows Server nodes:

There are two options for supplying kube-proxy with the required information.

  • Deploy your own kube-proxy along with the CNI, as implemented by the kube-proxy manifest and documented in the Kubernetes 1.21 Windows Install Guide.

  • If using a VXLAN-based CNI, define the following variables:

    • CNINetworkName must match the name of the Windows Kubernetes HNS network, which you can find either in the installation documentation for the third party CNI or by using hnsdiag list networks.

    • CNISourceVip must use the value of the source VIP for this node, which should be available in the installation documentation for the third party CNI. Because the source VIP will be different for each node and can change across host reboots, Mirantis recommends setting this variable using a utility script.

    The following is an example of how to define these variables using PowerShell:

    [System.Environment]::SetEnvironmentVariable("CNINetworkName", "vxlan0", [System.EnvironmentVariableTarget]::Machine)
    
    [System.Environment]::SetEnvironmentVariable("CNISourceVip", "192.32.31.1", [System.EnvironmentVariableTarget]::Machine)
    

Kubernetes network encryption

MKE provides data-plane level IPSec network encryption to securely encrypt application traffic in a Kubernetes cluster. This secures application traffic within a cluster when running in untrusted infrastructure or environments. It is an optional feature of MKE that is enabled by deploying the SecureOverlay components on Kubernetes when using the default Calico driver for networking with the default IPIP tunneling configuration.

Kubernetes network encryption is enabled by two components in MKE:

  • SecureOverlay Agent

  • SecureOverlay Master

The SecureOverlay Agent is deployed as a per-node service that manages the encryption state of the data plane. The Agent controls the IPSec encryption on Calico IPIP tunnel traffic between different nodes in the Kubernetes cluster. The Master is deployed on an MKE manager node and acts as the key management process that configures and periodically rotates the encryption keys.

Kubernetes network encryption uses AES Galois Counter Mode (AES-GCM) with 128-bit keys by default.

You must deploy the SecureOverlay Agent and Master on MKE to enable encryption, as it is not enabled by default. You can enable or disable encryption at any time during the cluster lifecycle. However, be aware that enabling or disabling encryption can cause temporary traffic outages between Pods, lasting up to a few minutes. When enabled, Kubernetes Pod traffic between hosts is encrypted at the IPIP tunnel interface in the MKE host.

Kubernetes network encryption is supported on the following platforms:

  • MKE 3.1 and later: Yes

  • Kubernetes 1.11 and later: Yes

  • On-premises: Yes

  • AWS: Yes

  • GCE: Yes

  • All MKE-supported Linux OSes: Yes

  • Azure: No

  • Unmanaged CNI plugins: No

Configure maximum transmission units

The maximum transmission unit (MTU) is the largest packet length that a network interface will transmit. Before deploying the SecureOverlay components, verify that Calico is configured so that the IPIP tunnel MTU leaves sufficient room for the encryption overhead. Encryption adds 26 bytes of overhead, but every IPSec packet size must be a multiple of 4 bytes, and IPIP tunnels require 20 bytes of encapsulation overhead. The IPIP tunnel interface MTU must therefore be no more than EXTMTU - 46 - ((EXTMTU - 46) modulo 4), where EXTMTU is the minimum MTU of the external interfaces. An IPIP MTU of 1452 is generally safe for most deployments.
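
For example, with a standard 1500-byte external MTU, EXTMTU - 46 = 1454 and 1454 modulo 4 = 2, so the IPIP tunnel MTU must be no more than 1452, which is where the suggested value of 1452 comes from.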

In the MKE configuration file, update the ipip_mtu parameter with the new MTU:

[cluster_config]
 ...
 ipip_mtu = "1452"
 ...
Configure SecureOverlay

Once the cluster node MTUs are properly configured, deploy the SecureOverlay components to MKE using either the MKE configuration file or the SecureOverlay YAML file.


To configure SecureOverlay using the MKE configuration file:

Set the value of secure_overlay in the MKE configuration file cluster_config table to true.
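
A minimal sketch of the change, following the cluster_config table layout shown in the MTU example above (the surrounding keys in your file will differ):

[cluster_config]
 ...
 secure_overlay = true
 ...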


To configure SecureOverlay using the SecureOverlay YAML file:

Run the following procedure at the time of cluster installation, prior to starting any workloads.

  1. Copy the contents of the SecureOverlay YAML file into a YAML file called ucp-secureoverlay.yaml.

  2. Download and configure the client bundle.

  3. Enable network encryption:

    kubectl apply -f ucp-secureoverlay.yaml
    

Note

To remove network encryption from the system, issue the following command:

kubectl delete -f ucp-secureoverlay.yaml

Persistent Kubernetes Storage

Use NFS Storage

You can provide persistent storage for MKE workloads by using NFS storage. When mounted into the running container, NFS shares provide state to the application, managing data external to the container lifecycle.

Note

The following subjects are out of the scope of this topic:

  • Provisioning an NFS server

  • Exporting an NFS share

  • Using external Kubernetes plugins to dynamically provision NFS shares

There are two different ways to mount existing NFS shares within Kubernetes Pods:

  • Define NFS shares within the Pod definitions. NFS shares are defined manually by each tenant when creating a workload.

  • Define NFS shares as a cluster object through PersistentVolumes, with the cluster object lifecycle handled separately from the workload. This is common for operators who want to define a range of NFS shares for tenants to request and consume.

Define NFS shares in the Pod definition

While defining workloads in Kubernetes manifest files, users can reference the NFS shares that they want to mount within the Pod specification for each Pod. This can be a standalone Pod or it can be wrapped in a higher-level object like a Deployment, DaemonSet, or StatefulSet.

The following example assumes a running MKE cluster and a downloaded client bundle with permission to schedule Pods in a namespace.

  1. Create nfs-in-a-pod.yaml with the following content:

    kind: Pod
    apiVersion: v1
    metadata:
      name: nfs-in-a-pod
    spec:
      containers:
        - name: app
          image: alpine
          volumeMounts:
            - name: nfs-volume
              mountPath: /var/nfs
          command: ["/bin/sh"]
          args: ["-c", "sleep 500000"]
      volumes:
        - name: nfs-volume
          nfs:
            server: nfs.example.com
            path: /share1
    
    • Change the value of mountPath to the location where you want the share to be mounted.

    • Change the value of server to your NFS server.

    • Change the value of path to the relevant share.

  2. Create the Pod specification:

    kubectl create -f nfs-in-a-pod.yaml
    
  3. Verify that the Pod is created successfully:

    kubectl get pods
    

    Example output:

    NAME                     READY     STATUS      RESTARTS   AGE
    nfs-in-a-pod             1/1       Running     0          6m
    
  4. Access a shell prompt within the container:

    kubectl exec -it nfs-in-a-pod sh
    
  5. Verify that everything is correctly mounted by searching for your mount:

    mount | grep nfs.example.com
    

Note

MKE and Kubernetes are unaware of the NFS share because it is defined as part of the Pod specification. As such, when you delete the Pod, the NFS share detaches from the cluster, though the data remains in the NFS share.

Expose NFS shares as a cluster object

This method uses the Kubernetes PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects to manage NFS share lifecycle and access.

You can define multiple shares for a tenant to use within the cluster. The PV is a cluster-wide object, so it can be pre-provisioned. A PVC is a claim by a tenant for using a PV within the tenant namespace.

To create PV objects at the cluster level, you will need a ClusterRoleBinding grant.

Note

The “NFS share lifecycle” refers to granting and removing the end user ability to consume NFS storage, rather than the lifecycle of the NFS server.


To define the PersistentVolume at the cluster level:

  1. Create pvwithnfs.yaml with the following content:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: my-nfs-share
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Recycle
      nfs:
        server: nfs.example.com
        path: /share1
    
    • The 5Gi storage size is used to match the volume to the tenant claim.

    • The valid accessModes values for an NFS PV are:

      • ReadOnlyMany: the volume can be mounted as read-only by many nodes.

      • ReadWriteOnce: the volume can be mounted as read-write by a single node.

      • ReadWriteMany: the volume can be mounted as read-write by many nodes.

      The access mode in the PV definition is used to match a PV to a Claim. When a PV is defined and created inside of Kubernetes, a volume is not mounted. Refer to Access Modes for more information, including any changes to the valid accessModes.

    • The valid persistentVolumeReclaimPolicy values are:

      • Retain

      • Recycle

      • Delete

      MKE uses the reclaim policy to define what the cluster does after a PV is released from a claim. Refer to Reclaiming in the official Kubernetes documentation for more information, including any changes to the valid persistentVolumeReclaimPolicy values.

    • Change the value of server to your NFS server.

    • Change the value of path to the relevant share.

  2. Create the volume:

    kubectl create -f pvwithnfs.yaml
    
  3. Verify that the volume is created successfully:

    kubectl get pv
    

    Example output:

    NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                       STORAGECLASS   REASON    AGE
    
    my-nfs-share   5Gi        RWO            Recycle          Available                               slow                     7s
    

To define a PersistentVolumeClaim:

A tenant can now “claim” a PV for use within their workloads by using a Kubernetes PVC. A PVC exists within a namespace and it attempts to match available PVs to the tenant request.

Create myapp-claim.yaml with the following content:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-nfs
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

To deploy this PVC, the tenant must have a RoleBinding that permits the creation of PVCs. If there is a PV that meets the tenant criteria, Kubernetes binds the PV to the claim. This does not, however, mount the share.
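
For illustration only, the following is a minimal sketch of such a RoleBinding, granting a hypothetical user (example-tenant) the built-in edit cluster role in the default namespace, which includes permission to create PVCs; in MKE you would typically create the equivalent grant through Access Control > Grants:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-tenant-pvc-access
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: User
  apiGroup: rbac.authorization.k8s.io
  name: example-tenant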

  1. Create the PVC:

    kubectl create -f myapp-claim.yaml
    

    Expected output:

    persistentvolumeclaim "myapp-nfs" created
    
  2. Verify that the claim is created successfully:

    kubectl get pvc
    

    Example output:

    NAME        STATUS    VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    myapp-nfs   Bound     my-nfs-share   5Gi        RWO            slow           2s
    
  3. Verify that the claim is associated with the PV:

    kubectl get pv
    

    Example output:

    NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM              STORAGECLASS   REASON    AGE
    my-nfs-share   5Gi        RWO            Recycle          Bound     default/myapp-nfs  slow                     4m
    

To define a workload:

The final task is to deploy a workload to consume the PVC. The PVC is defined within the Pod specification, which can be a standalone Pod or wrapped in a higher-level object such as a Deployment, DaemonSet, or StatefulSet.

Create myapp-pod.yaml with the following content:

kind: Pod
apiVersion: v1
metadata:
  name: pod-using-nfs
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
      - name: data
        mountPath: /var/nfs
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: myapp-nfs

Change the value of mountPath to the location where you want the share mounted.

  1. Deploy the Pod:

    kubectl create -f myapp-pod.yaml
    
  2. Verify that the Pod is created successfully:

    kubectl get pod
    

    Example output:

    NAME                     READY     STATUS      RESTARTS   AGE
    pod-using-nfs            1/1       Running     0          1m
    
  3. Access a shell prompt within the container:

    kubectl exec -it pod-using-nfs sh
    
  4. Verify that everything is correctly mounted by searching for your mount:

    mount | grep nfs.example.com
    

Use Azure Disk Storage

You can provide persistent storage for MKE workloads on Microsoft Azure by using Azure Disk Storage. You can either pre-provision Azure Disk Storage to be consumed by Kubernetes Pods, or you can use the Azure Kubernetes integration to dynamically provision Azure Disks as needed.

This guide assumes that you have already provisioned an MKE environment on Microsoft Azure and that you have provisioned a cluster after meeting all of the prerequisites listed in Install MKE on Azure.

To complete the steps in this topic, you must download and configure the client bundle.

Manually provision Azure Disks

You can use existing Azure Disks or manually provision new ones to provide persistent storage for Kubernetes Pods. You can manually provision Azure Disks in the Azure Portal, using ARM Templates, or using the Azure CLI. The following example uses the Azure CLI to manually provision an Azure Disk.

  1. Create an environment variable for myresourcegroup:

    RG=myresourcegroup
    
  2. Provision an Azure Disk:

    az disk create \
    --resource-group $RG \
    --name k8s_volume_1  \
    --size-gb 20 \
    --query id \
    --output tsv
    

    This command returns the Azure ID of the Azure Disk Object.

    Example output:

    /subscriptions/<subscriptionID>/resourceGroups/<resourcegroup>/providers/Microsoft.Compute/disks/<diskname>
    
  3. Make note of the Azure ID of the Azure Disk Object returned by the previous step.

You can now create Kubernetes Objects that refer to this Azure Disk. The following example uses a Kubernetes Pod, though the same Azure Disk syntax can be used for DaemonSets, Deployments, and StatefulSets. In the example, the Azure diskName and diskURI refer to the manually created Azure Disk:

$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: mypod-azuredisk
spec:
  containers:
  - image: nginx
    name: mypod
    volumeMounts:
      - name: mystorage
        mountPath: /data
  volumes:
      - name: mystorage
        azureDisk:
          kind: Managed
          diskName: k8s_volume_1
          diskURI: /subscriptions/<subscriptionID>/resourceGroups/<resourcegroup>/providers/Microsoft.Compute/disks/<diskname>
EOF
Dynamically provision Azure Disks

Kubernetes can dynamically provision Azure Disks using the Azure Kubernetes integration, configured at the time of your MKE installation. For Kubernetes to determine which APIs to use when provisioning storage, you must create Kubernetes StorageClass objects specific to each storage backend.

There are two different Azure Disk types that can be consumed by Kubernetes: Azure Disk Standard Volumes and Azure Disk Premium Volumes.

Depending on your use case, you can deploy one or both of the Azure Disk storage classes.


To define the Azure Disk storage class:

  1. Create the storage class:

    cat <<EOF | kubectl create -f -
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: standard
    provisioner: kubernetes.io/azure-disk
    parameters:
      storageaccounttype: <disk-type>
      kind: Managed
    EOF
    

    For storageaccounttype, enter Standard_LRS for the standard storage class or Premium_LRS for the premium storage class.

  2. Verify which storage classes have been provisioned:

    kubectl get storageclasses
    

    Example output:

    NAME       PROVISIONER                AGE
    premium    kubernetes.io/azure-disk   1m
    standard   kubernetes.io/azure-disk   1m
    

To create an Azure Disk with a PersistentVolumeClaim:

After you create a storage class, you can use Kubernetes Objects to dynamically provision Azure Disks. This is done using Kubernetes PersistentVolumeClaims.

The following example uses the standard storage class and creates a 5 GiB Azure Disk. Alter these values to fit your use case.

  1. Create a PersistentVolumeClaim:

    cat <<EOF | kubectl create -f -
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: azure-disk-pvc
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
    EOF
    
  2. Verify the creation of the PersistentVolumeClaim:

    kubectl get persistentvolumeclaim
    

    Example output:

    NAME              STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    azure-disk-pvc    Bound     pvc-587deeb6-6ad6-11e9-9509-0242ac11000b   5Gi        RWO            standard       1m
    
  3. Verify the creation of the PersistentVolume:

    kubectl get persistentvolume
    

    Expected output:

    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                     STORAGECLASS   REASON    AGE
    pvc-587deeb6-6ad6-11e9-9509-0242ac11000b   5Gi        RWO            Delete           Bound     default/azure-disk-pvc    standard                 3m
    
  4. Verify the creation of a new Azure Disk in the Azure Portal.


To attach the new Azure Disk to a Kubernetes Pod:

You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The disk can be consumed by any Kubernetes object type, including a Deployment, DaemonSet, or StatefulSet. However, the following example simply mounts the PersistentVolume into a standalone Pod.

Attach the new Azure Disk to a Kubernetes pod:

cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
  name: mypod-dynamic-azuredisk
spec:
  containers:
    - name: mypod
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: azure-disk-pvc
EOF
Data disk capacity of an Azure Virtual Machine

Azure limits the number of data disks that can be attached to each Virtual Machine. Refer to Azure Virtual Machine Sizes for this information. Kubernetes prevents Pods from deploying on Nodes that have reached their maximum Azure Disk Capacity. In such cases, Pods remain stuck in the ContainerCreating status, as demonstrated in the following example:

  1. Review Pods:

    kubectl get pods
    

    Example output:

    NAME                  READY     STATUS              RESTARTS   AGE
    mypod-azure-disk      0/1       ContainerCreating   0          4m
    
  2. Describe the Pod to display troubleshooting logs, which indicate the node has reached its capacity:

    kubectl describe pods mypod-azure-disk
    

    Example output:

    Warning  FailedAttachVolume  7s (x11 over 6m)  attachdetach-controller  \
    AttachVolume.Attach failed for volume "pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" : \
    Attach volume "kubernetes-dynamic-pvc-6b09dae3-6ad6-11e9-9509-0242ac11000b" to instance \
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachines/worker-03" \
    failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: \
    StatusCode=409 -- Original Error: failed request: autorest/azure: \
    Service returned an error. Status=<nil> Code="OperationNotAllowed" \
    Message="The maximum number of data disks allowed to be attached to a VM of this size is 4." \
    Target="dataDisks"
    

Use Azure Files Storage

You can provide persistent storage for MKE workloads on Microsoft Azure by using Azure Files. You can either pre-provision Azure Files shares to be consumed by Kubernetes Pods, or you can use the Azure Kubernetes integration to dynamically provision Azure Files shares as needed.

This guide assumes that you have already provisioned an MKE environment on Microsoft Azure and that you have provisioned a cluster after meeting all of the prerequisites listed in Install MKE on Azure.

To complete the steps in this topic, you must download and configure the client bundle.

Manually provision Azure Files shares

You can use existing Azure Files shares or manually provision new ones to provide persistent storage for Kubernetes Pods. You can manually provision Azure Files shares in the Azure Portal, using ARM Templates, or using the Azure CLI. The following example uses the Azure CLI to manually provision an Azure Files share.


To manually provision an Azure Files share:

Note

The Azure Kubernetes driver does not support Azure Storage accounts created using Azure Premium Storage.

  1. Create an Azure Storage account:

    1. Create the following environment variables, replacing <region> with the required region:

      REGION=<region>
      SA=mystorageaccount
      RG=myresourcegroup
      
    2. Create the Azure Storage account:

      az storage account create \
      --name $SA \
      --resource-group $RG \
      --location $REGION \
      --sku Standard_LRS
      
  2. Provision an Azure Files share:

    1. Create the following environment variables, adjusting the size of this share to satisfy the user requirements.

      FS=myfileshare
      SIZE=5
      
    2. Obtain the Azure connection string, which is also available from the Azure Portal:

      export AZURE_STORAGE_CONNECTION_STRING=`az storage account show-connection-string --name $SA --resource-group $RG -o tsv`
      
    3. Provision the Azure Files share:

      az storage share create \
      --name $FS \
      --quota $SIZE \
      --connection-string $AZURE_STORAGE_CONNECTION_STRING
      

To configure a Kubernetes Secret:

After creating an Azure Files share, you must load the Azure Storage account access key into MKE as a Kubernetes Secret. This provides access to the file share when Kubernetes attempts to mount the share into a Pod. You can find the access key either in the Azure Portal or by using the Azure CLI, as in the following example.

  1. Create the following environment variables, if you have not done so already:

    SA=mystorageaccount
    RG=myresourcegroup
    FS=myfileshare
    
  2. Obtain the Azure Storage account access key, which you can also obtain from the Azure Portal:

    STORAGE_KEY=$(az storage account keys list --resource-group $RG --account-name $SA --query "[0].value" -o tsv)
    
  3. Load the Azure Storage account access key into MKE as a Kubernetes Secret:

    kubectl create secret generic azure-secret \
    --from-literal=azurestorageaccountname=$SA \
    --from-literal=azurestorageaccountkey=$STORAGE_KEY
    

To mount the Azure Files share into a Kubernetes Pod:

The following example creates a standalone Kubernetes Pod, though you can use the same syntax to create DaemonSets, Deployments, and StatefulSets.

  1. Create the following environment variable:

    FS=myfileshare
    
  2. Mount the Azure Files share into a Kubernetes Pod:

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod-azurefile
    spec:
      containers:
      - image: nginx
        name: mypod
        volumeMounts:
          - name: mystorage
            mountPath: /data
      volumes:
      - name: mystorage
        azureFile:
          secretName: azure-secret
          shareName: $FS
          readOnly: false
    EOF
    
Dynamically provision Azure Files shares

Kubernetes can dynamically provision Azure Files shares using the Azure Kubernetes integration, configured at the time of your MKE installation. For Kubernetes to determine which APIs to use when provisioning storage, you must create Kubernetes StorageClass objects specific to each storage backend.

Note

The Azure Kubernetes plugin only supports using the Standard StorageClass. File shares that use the Premium StorageClass will fail to mount.

To define the Azure Files StorageClass:

  1. Create the storage class:

    cat <<EOF | kubectl create -f -
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: standard
    provisioner: kubernetes.io/azure-file
    mountOptions:
      - dir_mode=0777
      - file_mode=0777
      - uid=1000
      - gid=1000
    parameters:
      skuName: Standard_LRS
      storageAccount: <existingstorageaccount> # Optional
      location: <existingstorageaccountlocation> # Optional
    EOF
    
  2. Verify which storage classes have been provisioned:

    kubectl get storageclasses
    

    Example output:

    NAME       PROVISIONER                AGE
    azurefile  kubernetes.io/azure-file   1m
    

To create an Azure Files share using a PersistentVolumeClaim:

After you create a storage class, you can use Kubernetes Objects to dynamically provision Azure Files shares. This is done using Kubernetes PersistentVolumeClaims.

Kubernetes uses an existing Azure Storage account, if one exists inside of the Azure Resource Group. If an Azure Storage account does not exist, Kubernetes creates one.

The following example uses the standard storage class and creates a 5 GiB Azure Files share. Alter these values to fit your use case.

  1. Create a PersistentVolumeClaim:

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: azure-file-pvc
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: standard
      resources:
        requests:
          storage: 5Gi
    EOF
    
  2. Verify the creation of the PersistentVolumeClaim:

    kubectl get pvc
    

    Example output:

    NAME             STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    azure-file-pvc   Bound     pvc-f7ccebf0-70e0-11e9-8d0a-0242ac110007   5Gi        RWX            standard       22s
    
  3. Verify the creation of the PersistentVolume:

    kubectl get pv
    

    Example output:

    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                    STORAGECLASS   REASON    AGE
    pvc-f7ccebf0-70e0-11e9-8d0a-0242ac110007   5Gi        RWX            Delete           Bound     default/azure-file-pvc   standard                 2m
    

To attach the new Azure Files share to a Kubernetes Pod:

You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The file share can be consumed by any Kubernetes object type, including a Deployment, DaemonSet, or StatefulSet. However, the following example simply mounts the PersistentVolume into a standalone Pod.

Attach the new Azure Files share to a Kubernetes Pod:

cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: azure-file-pvc
EOF
Troubleshoot Azure Files shares

When creating a PersistentVolumeClaim, the volume can get stuck in a Pending state if the persistent-volume-binder service account does not have the relevant Kubernetes RBAC permissions.


To resolve this issue:

  1. Review the status of the PVC:

    kubectl get pvc
    

    Example output:

    NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    azure-file-pvc   Pending                                      standard       32s
    
  2. Describe the PVC:

    kubectl describe pvc azure-file-pvc
    

    The provisioner creates a Kubernetes Secret to store the Azure Files storage account key. If the persistent-volume-binder service account does not have the correct permissions, a warning such as the following will display:

    Warning    ProvisioningFailed  7s (x3 over 37s)  persistentvolume-controller
    Failed to provision volume with StorageClass "standard": Couldn't create secret
    secrets is forbidden: User "system:serviceaccount:kube-system:persistent-volume-binder"
    cannot create resource "secrets" in API group "" in the namespace "default": access denied
    
  3. Grant the persistent-volume-binder service account the relevant RBAC permissions by creating the following ClusterRoleBinding:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      labels:
        subjectName: kube-system-persistent-volume-binder
      name: kube-system-persistent-volume-binder:cluster-admin
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - kind: ServiceAccount
      name: persistent-volume-binder
      namespace: kube-system
    

Configure iSCSI

Internet Small Computer System Interface (iSCSI) is an IP-based standard that provides block-level access to storage devices. iSCSI receives requests from clients and fulfills them on remote SCSI devices. iSCSI support in MKE enables Kubernetes workloads to consume persistent storage from iSCSI targets.

Note

MKE does not support using iSCSI with Windows clusters.

Note

Challenge-Handshake Authentication Protocol (CHAP) secrets are supported for both iSCSI discovery and session management.

iSCSI components

The iSCSI initiator is any client that consumes storage and sends iSCSI commands. In an MKE cluster, the iSCSI initiator must be installed and running on any node where Pods can be scheduled. Configuration, target discovery, logging in, and logging out of a target are performed primarily by two software components: iscsid (service) and iscsiadm (CLI tool).

These two components are typically packaged as part of open-iscsi on Debian systems and iscsi-initiator-utils on RHEL, CentOS, and Fedora systems.

  • iscsid is the iSCSI initiator daemon and implements the control path of the iSCSI protocol. It communicates with iscsiadm and kernel modules.

  • iscsiadm is a CLI tool that allows discovery, login to iSCSI targets, session management, and access and management of the open-iscsi database.

The iSCSI target is any server that shares storage and receives iSCSI commands from an initiator.

Note

iSCSI kernel modules implement the data path. The most common modules used across Linux distributions are scsi_transport_iscsi.ko, libiscsi.ko, and iscsi_tcp.ko. These modules need to be loaded on the host for proper functioning of the iSCSI initiator.

Prerequisites
  • Complete hardware and software configuration of the iSCSI storage provider. There is no significant demand for RAM and disk when running external provisioners in MKE clusters. For setup information specific to a storage vendor, refer to the vendor documentation.

  • Configure kubectl on your clients.

  • Make sure that the iSCSI server is accessible to MKE worker nodes.

Configure an iSCSI target

An iSCSI target can run on dedicated, stand-alone hardware, or can be configured in a hyper-converged manner to run alongside container workloads on MKE nodes. To provide access to the storage device, configure each target with one or more logical unit numbers (LUNs).

iSCSI targets are specific to the storage vendor. Refer to the vendor documentation for setup instructions, including applicable RAM and disk space requirements, and then expose the targets to the MKE cluster.


To expose iSCSI targets to the MKE cluster:

  1. If necessary for access control, configure the target with client iSCSI qualified names (IQNs).

  2. If required, configure CHAP secrets for authentication.

  3. Make sure that each iSCSI LUN is accessible by all nodes in the cluster. Configure the iSCSI service to expose storage as an iSCSI LUN to all nodes in the cluster. You can do this by adding all MKE nodes, and thus their IQNs, to the target ACL list.
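
As a purely illustrative sketch, the following shows how these steps might look on a generic Linux LIO target managed with targetcli; the backing device, IQNs, and target name are assumptions, and vendor-specific tooling will differ:

# Create a block backstore from an example device
sudo targetcli /backstores/block create name=disk1 dev=/dev/sdb

# Create the iSCSI target and attach the backstore as a LUN
sudo targetcli /iscsi create iqn.2017-10.local.example.server:disk1
sudo targetcli /iscsi/iqn.2017-10.local.example.server:disk1/tpg1/luns create /backstores/block/disk1

# Add the initiator IQN of each MKE node to the target ACL
sudo targetcli /iscsi/iqn.2017-10.local.example.server:disk1/tpg1/acls create iqn.2019-01.com.example:node1
sudo targetcli saveconfig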

Configure a generic iSCSI initiator

Every Linux distribution packages the iSCSI initiator software in a particular way. Follow the instructions specific to the storage provider, using the following steps as a guideline.

  1. Prepare all MKE nodes by installing OS-specific iSCSI packages and loading the necessary iSCSI kernel modules. In the following example, scsi_transport_iscsi.ko and libiscsi.ko are pre-loaded by the Linux distribution. The iscsi_tcp kernel module must be loaded with a separate command.

    • For CentOS or Red Hat:

      sudo yum install -y iscsi-initiator-utils
      sudo modprobe iscsi_tcp
      
    • For Ubuntu:

      sudo apt install open-iscsi
      sudo modprobe iscsi_tcp
      
  2. Set up MKE nodes as iSCSI initiators. Configure initiator names for each node, using the format InitiatorName=iqn.<YYYY-MM.reverse.domain.name:OptionalIdentifier>:

    sudo sh -c 'echo "InitiatorName=iqn.<YYYY-MM.reverse.domain.name:OptionalIdentifier>" > /etc/iscsi/initiatorname.iscsi'
    sudo systemctl restart iscsid
    
Configure MKE

Update the MKE configuration file with the following options:

  1. Configure --storage-iscsi=true to enable iSCSI-based PersistentVolumes (PVs) in Kubernetes.

  2. Configure --iscsiadm-path=<path> to specify the absolute path of the iscsiadm binary on the host. The default value is /usr/sbin/iscsiadm.

  3. Configure --iscsidb-path=<path> to specify the path of the iSCSI database on the host. The default value is /etc/iscsi.
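
The flag form shown above suggests that these options can also be supplied when installing MKE. The following is a minimal sketch of such an invocation, assuming the standard interactive install command, with the image tag as a placeholder:

docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:<version> install \
  --storage-iscsi=true \
  --iscsiadm-path=/usr/sbin/iscsiadm \
  --iscsidb-path=/etc/iscsi \
  --interactive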

Configure in-tree iSCSI volumes

The Kubernetes in-tree iSCSI plugin only supports static provisioning, for which you must:

  • Verify that the desired iSCSI LUNs are pre-provisioned in the iSCSI targets.

  • Create iSCSI PV objects, which correspond to the pre-provisioned LUNs with the appropriate iSCSI configuration. As PersistentVolumeClaims (PVCs) are created to consume storage, the iSCSI PVs bind to the PVCs and satisfy the request for persistent storage.


To configure in-tree iSCSI volumes:

  1. Create a YAML file for the PersistentVolume object based on the following example:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: iscsi-pv
    spec:
      capacity:
        storage: 12Gi
      accessModes:
        - ReadWriteOnce
      iscsi:
        targetPortal: 192.0.2.100:3260
        iqn: iqn.2017-10.local.example.server:disk1
        lun: 0
        fsType: 'ext4'
        readOnly: false
    
  2. Make the following changes using information appropriate for your environment:

    • Replace 12Gi with the size of the storage available.

    • Replace 192.0.2.100:3260 with the IP address and port number of the iSCSI target in your environment. Refer to the storage provider documentation for port information.

    • Replace iqn.2017-10.local.example.server:disk1 with the IQN of the iSCSI target that provides the LUN. More than one IQN can be specified, but each must use the format iqn.YYYY-MM.reverse.domain.name:OptionalIdentifier. Do not confuse this value with the initiator IQNs configured on the MKE worker nodes, each of which must also be unique.

  3. Create the PersistentVolume:

    kubectl create -f pv-iscsi.yml
    

    Expected output:

    persistentvolume/iscsi-pv created
    
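
To consume the statically provisioned volume, create a PersistentVolumeClaim that binds to it. The following is a minimal sketch; the claim name is an assumption, and the empty storageClassName prevents dynamic provisioning from intercepting the claim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: iscsi-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 12Gi
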
External provisioner and Kubernetes objects

An external provisioner is a piece of software running out of process from Kubernetes that is responsible for creating and deleting PVs. External provisioners monitor the Kubernetes API server for PV claims and create PVs accordingly.

When using an external provisioner, you must perform the following additional steps:

  1. Configure external provisioning based on your storage provider. Refer to your storage provider documentation for deployment information.

  2. Define storage classes. Refer to your storage provider dynamic provisioning documentation for configuration information.

  3. Define a PVC and a Pod. When you define a PVC to use the storage class, a PV is created and bound.

  4. Start a Pod using the PVC that you defined.

Note

In some cases, on-premises storage providers use external provisioners to connect PV provisioning to the backend storage.

Troubleshooting

The following issues occur frequently in iSCSI integrations:

  • The host might not have iSCSI kernel modules loaded. To avoid this, always prepare your MKE worker nodes by installing the iSCSI packages and the iSCSI kernel modules prior to installing MKE. If worker nodes are not prepared correctly prior to an MKE installation:

    1. Prepare the nodes.

    2. Restart the ucp-kubelet container for the changes to take effect, as shown in the sketch after this list.

  • depmod confusion occurs on some hosts. On some Linux distributions, kernel modules cannot be loaded until the kernel sources are installed and depmod is run. If you experience problems loading kernel modules, verify that you run depmod after installing them.
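
A minimal recovery sketch for worker nodes that were not prepared before MKE installation, assuming Ubuntu package names and the ucp-kubelet container name used elsewhere in this section:

sudo apt install open-iscsi
sudo modprobe iscsi_tcp

# Restart the MKE kubelet container so that it picks up the newly loaded modules
docker restart ucp-kubelet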

Example
  1. Download and configure the client bundle.

  2. Create a YAML file with the following StorageClass object:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: iscsi-targetd-vg-targetd
    provisioner: iscsi-targetd
    parameters:
      targetPortal: 172.31.8.88
      iqn: iqn.2019-01.org.iscsi.docker:targetd
      iscsiInterface: default
      volumeGroup: vg-targetd
      initiators: iqn.2019-01.com.example:node1, iqn.2019-01.com.example:node2
      chapAuthDiscovery: "false"
      chapAuthSession: "false"
    
  3. Apply the StorageClass YAML file:

    kubectl apply -f iscsi-storageclass.yaml
    

    Expected output:

    storageclass "iscsi-targetd-vg-targetd" created
    
  4. Verify the successful creation of the StorageClass object:

    kubectl get sc
    

    Example output:

    NAME                       PROVISIONER     AGE
    iscsi-targetd-vg-targetd   iscsi-targetd   30s
    
  5. Create a YAML file with the following PersistentVolumeClaim object:

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: iscsi-claim
    spec:
      storageClassName: "iscsi-targetd-vg-targetd"
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 100Mi
    
    • The valid accessModes values for iSCSI are ReadWriteOnce and ReadOnlyMany.

    • Change the value of storage as required.

    Note

    The scheduler automatically ensures that Pods with the same PVC run on the same worker node.

  6. Apply the PersistentVolumeClaim YAML file:

    kubectl apply -f pvc-iscsi.yml
    

    Expected output:

    persistentvolumeclaim "iscsi-claim" created
    
  7. Verify the successful creation of the PersistentVolume and PersistentVolumeClaim and that the PersistentVolumeClaim is bound to the correct volume:

    kubectl get pv,pvc
    

    Example output:

    NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
    iscsi-claim   Bound    pvc-b9560992-24df-11e9-9f09-0242ac11000e   100Mi      RWO            iscsi-targetd-vg-targetd   1m

    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS               REASON   AGE
    pvc-b9560992-24df-11e9-9f09-0242ac11000e   100Mi      RWO            Delete           Bound    default/iscsi-claim   iscsi-targetd-vg-targetd            36s
    
  8. Configure Pods to use the PersistentVolumeClaim when binding to the PersistentVolume.

  9. Create a YAML file with the following ReplicationController object. The ReplicationController is used to set up two replica Pods running web servers that use the PersistentVolumeClaim to mount the PersistentVolume onto a mountpath containing shared resources.

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: rc-iscsi-test
    spec:
      replicas: 2
      selector:
        app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx
            ports:
            - name: nginx
              containerPort: 80
            volumeMounts:
            - name: iscsi
              mountPath: "/usr/share/nginx/html"
          volumes:
          - name: iscsi
            persistentVolumeClaim:
              claimName: iscsi-claim
    
  10. Create the ReplicationController object:

    kubectl create -f rc-iscsi.yml
    

    Expected output:

    replicationcontroller "rc-iscsi-test" created
    
  11. Verify successful creation of the Pods:

    kubectl get pods
    

    Example output:

    NAME                  READY     STATUS    RESTARTS   AGE
    rc-iscsi-test-05kdr   1/1       Running   0          9m
    rc-iscsi-test-wv4p5   1/1       Running   0          9m
    

See also

Refer to iSCSI-targetd provisioner for detailed information on an external provisioner implementation using a target-based external provisioner.

Use CSI drivers

The Container Storage Interface (CSI) is a specification for container orchestrators to manage block- and file-based volumes for storing data. Storage vendors can each create a single CSI driver that works with multiple container orchestrators. The Kubernetes community maintains sidecar containers that a containerized CSI driver can use to interface with Kubernetes controllers in charge of the following:

  • Managing persistent volumes

  • Attaching volumes to nodes, if applicable

  • Mounting volumes to Pods

  • Taking snapshots

These sidecar containers include a driver registrar, external attacher, external provisioner, and external snapshotter.

Mirantis supports version 1.0 and later of the CSI specification, and thus MKE can manage storage backends that ship with an associated CSI driver.

Note

Enterprise storage vendors provide CSI drivers, whereas Mirantis does not. Kubernetes does not enforce a specific procedure for how storage providers (SPs) should bundle and distribute CSI drivers.

Review the Kubernetes CSI Developer Documentation for CSI architecture, security, and deployment information.

Prerequisites
  1. Select a CSI driver to use with Kubernetes from the following MKE-certified CSI drivers:

    Partner name   Kubernetes on MKE
    Dell EMC       Certified (CSI)
    HPE            Certified (CSI)
    NetApp         Certified (Trident - CSI)

  2. Optional. Set the --storage-expt-enabled flag in the MKE install configuration to enable experimental Kubernetes storage features.

  3. Install the CSI plugin from your storage provider.

  4. Apply RBAC for sidecars and the CSI driver.

  5. Perform static or dynamic provisioning of PersistentVolumes (PVs) using the CSI plugin as the provisioner.

CSI driver deployment

The simplest way to deploy CSI drivers is for storage vendors to package them in containers. In the context of Kubernetes clusters, containerized CSI drivers typically deploy as StatefulSets for managing the cluster-wide logic and DaemonSets for managing node-specific logic.

Note the following considerations:

  • You can deploy multiple CSI drivers for different storage backends in the same cluster.

  • To avoid credential leak to user processes, Kubernetes recommends running CSI Controllers on master nodes and the CSI node plugin on worker nodes.

  • MKE allows running privileged Pods, which is required to run CSI drivers.

  • The Docker daemon on the hosts must be configured with shared mount propagation for CSI. This allows the sharing of volumes mounted by one container into other containers in the same Pod or to other Pods on the same node. By default, MKE enables bidirectional mount propagation in the Docker daemon.

Refer to Kubernetes CSI documentation for more information.

Role-based access control (RBAC)

Pods that contain CSI plugins must have the appropriate permissions to access and manipulate Kubernetes objects.

Using YAML files that the storage vendor provides, you can configure the cluster roles and bindings for service accounts associated with CSI driver Pods. MKE administrators must apply those YAML files to properly configure RBAC for the service accounts associated with CSI Pods.
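
In practice, applying a vendor-provided RBAC manifest amounts to the following sketch; the file name and namespace are hypothetical and depend on the driver bundle:

# File name and namespace are placeholders from the vendor bundle
kubectl apply -f csi-driver-rbac.yaml -n kube-system

# Confirm that the service accounts and bindings were created
kubectl get serviceaccounts,clusterrolebindings -n kube-system | grep csi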

Usage

The dynamic provisioning of persistent storage depends on the capabilities of the CSI driver and of the underlying storage backend. Review the CSI driver provider documentation for the available parameters. Refer to CSI HostPath Driver for a generic CSI plugin example.

You can access the following CSI deployment information in the MKE web UI:

Persistent storage objects

In the MKE web UI left-side navigation panel, navigate to Kubernetes > Storage for information on persistent storage objects such as StorageClass, PersistentVolumeClaim, and PersistentVolume.

Volumes

In the MKE web UI left-side navigation panel, navigate to Kubernetes > Pods, select a Pod, and scroll to Volumes to view the Pod volume information.

GPU support for Kubernetes workloads

MKE provides graphics processing unit (GPU) support for Kubernetes workloads that run on Linux worker nodes. This topic describes how to configure your system to use and deploy NVIDIA GPUs.

Install the GPU drivers

GPU support requires that you install GPU drivers, which you can do either prior to or after installing MKE. The procedure that follows installs the NVIDIA driver using a runfile on your Linux host.

Note

This procedure describes how to manually install the GPU drivers. However, Mirantis recommends that you use a pre-existing automation system to automate the installation and patching of the drivers, along with the kernel and other host software.

  1. Enable the NVIDIA GPU device plugin by setting nvidia_device_plugin to true in the MKE configuration file.

  2. Verify that your system supports NVIDIA GPU:

    lspci | grep -i nvidia
    
  3. Verify that your GPU is a supported NVIDIA GPU Product.

  4. Install all the dependencies listed in the NVIDIA Minimum Requirements.

  5. Verify that your system is up to date and that you are running the latest kernel version.

  6. Install the following packages:

    • Ubuntu:

      sudo apt-get install -y gcc make curl linux-headers-$(uname -r)
      
    • RHEL:

      sudo yum install -y kernel-devel-$(uname -r) \
      kernel-headers-$(uname -r) gcc make curl elfutils-libelf-devel
      
  7. Verify that the i2c_core and ipmi_msghandler kernel modules are loaded:

    sudo modprobe -a i2c_core ipmi_msghandler
    
  8. Persist the change across reboots:

    echo -e "i2c_core\nipmi_msghandler" | sudo tee /etc/modules-load.d/nvidia.conf
    
  9. Create the directory in which the NVIDIA libraries will be located on the host, and add it to the dynamic linker configuration:

    NVIDIA_OPENGL_PREFIX=/opt/kubernetes/nvidia
    sudo mkdir -p $NVIDIA_OPENGL_PREFIX/lib
    echo "${NVIDIA_OPENGL_PREFIX}/lib" | sudo tee /etc/ld.so.conf.d/nvidia.conf
    sudo ldconfig
    
  10. Install the NVIDIA GPU driver:

    NVIDIA_DRIVER_VERSION=<version-number>
    curl -LSf https://us.download.nvidia.com/XFree86/Linux-x86_64/${NVIDIA_DRIVER_VERSION}/NVIDIA-Linux-x86_64-${NVIDIA_DRIVER_VERSION}.run -o nvidia.run
    sudo sh nvidia.run --opengl-prefix="${NVIDIA_OPENGL_PREFIX}"
    

    Set <version-number> to the NVIDIA driver version of your choice.

  11. Load the NVIDIA Unified Memory kernel module and create device files for the module on startup:

    sudo tee /etc/systemd/system/nvidia-modprobe.service << END
    [Unit]
    Description=NVIDIA modprobe
    
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/usr/bin/nvidia-modprobe -c0 -u
    
    [Install]
    WantedBy=multi-user.target
    END
    
    sudo systemctl enable nvidia-modprobe
    sudo systemctl start nvidia-modprobe
    
  12. Enable the NVIDIA persistence daemon to initialize GPUs and keep them initialized:

    sudo tee /etc/systemd/system/nvidia-persistenced.service << END
    [Unit]
    Description=NVIDIA Persistence Daemon
    Wants=syslog.target
    
    [Service]
    Type=forking
    PIDFile=/var/run/nvidia-persistenced/nvidia-persistenced.pid
    Restart=always
    ExecStart=/usr/bin/nvidia-persistenced --verbose
    ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
    
    [Install]
    WantedBy=multi-user.target
    END
    
    sudo systemctl enable nvidia-persistenced
    sudo systemctl start nvidia-persistenced
    
  13. Test the device plugin and review its description:

    kubectl describe node <node-name>
    

    Example output:

    Capacity:
    cpu:                8
    ephemeral-storage:  40593612Ki
    hugepages-1Gi:      0
    hugepages-2Mi:      0
    memory:             62872884Ki
    nvidia.com/gpu:     1
    pods:               110
    Allocatable:
    cpu:                7750m
    ephemeral-storage:  36399308Ki
    hugepages-1Gi:      0
    hugepages-2Mi:      0
    memory:             60775732Ki
    nvidia.com/gpu:     1
    pods:               110
    ...
    Allocated resources:
    (Total limits may be over 100 percent, i.e., overcommitted.)
    Resource        Requests    Limits
    --------        --------    ------
    cpu             500m (6%)   200m (2%)
    memory          150Mi (0%)  440Mi (0%)
    nvidia.com/gpu  0           0
    
Schedule GPU workloads

The following example describes how to deploy a simple workload that reports detected NVIDIA CUDA devices.

  1. Create a practice Deployment that requests nvidia.com/gpu in the limits section. The Pod will be scheduled on any available GPUs in your system.

    kubectl apply -f- <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      creationTimestamp: null
      labels:
        run: gpu-test
      name: gpu-test
    spec:
      replicas: 1
      selector:
        matchLabels:
          run: gpu-test
      template:
        metadata:
          labels:
            run: gpu-test
        spec:
          containers:
          - command:
            - sh
            - -c
            - "deviceQuery && sleep infinity"
            image: kshatrix/gpu-example:cuda-10.2
            name: gpu-test
            resources:
              limits:
                nvidia.com/gpu: "1"
    EOF
    
  2. Verify that it is in the Running state:

    kubectl get pods | grep "gpu-test"
    

    Example output:

    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          14m
    
  3. Review the logs. The presence of Result = PASS indicates a successful deployment:

    kubectl logs <name of the pod>
    

    Example output:

    deviceQuery Starting...
    
    CUDA Device Query (Runtime API) version (CUDART static linking)
    
    Detected 1 CUDA Capable device(s)
    
    Device 0: "Tesla V100-SXM2-16GB"
    ...
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
    Result = PASS
    
  4. Determine the overall GPU capacity of your cluster by inspecting its nodes:

    echo $(kubectl get nodes -l com.docker.ucp.gpu.nvidia="true" \
    -o jsonpath="0{range .items[*]}+{.status.allocatable['nvidia\.com/gpu']}{end}") | bc
    
  5. Set the proper replica number to acquire all available GPUs:

    kubectl scale deployment/gpu-test --replicas N
    
  6. Verify that all of the replicas are scheduled:

    kubectl get pods | grep "gpu-test"
    

    Example output:

    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          12m
    gpu-test-747d746885-swrrx   1/1     Running   0          11m
    
  7. Remove the Deployment and corresponding Pods:

    kubectl delete deployment gpu-test
    
Troubleshooting

If you attempt to add an additional replica to the previous example Deployment, it will result in a FailedScheduling error with the Insufficient nvidia.com/gpu message.

  1. Add an additional replica:

    kubectl scale deployment/gpu-test --replicas N+1
    kubectl get pods | grep "gpu-test"
    

    Example output:

    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          14m
    gpu-test-747d746885-swrrx   1/1     Running   0          13m
    gpu-test-747d746885-zgwfh   0/1     Pending   0          3m26s
    
  2. Review the status of the failed Deployment:

    kubectl describe po gpu-test-747d746885-zgwfh
    

    Example output:

    Events:
    Type     Reason            Age        From               Message
    ----     ------            ----       ----               -------
    Warning  FailedScheduling  <unknown>  default-scheduler  0/2 nodes are available: 2 Insufficient nvidia.com/gpu.
    

NGINX Ingress Controller

NGINX Ingress Controller for Kubernetes manages traffic that originates outside your cluster (ingress traffic) using the Kubernetes Ingress rules. You can use either the host name, path, or both the host name and path to route incoming requests to the appropriate service.

Only administrators can enable and disable NGINX Ingress Controller. Both administrators and regular users with the appropriate roles and permissions can create Ingress resources.

Configuration of the NGINX Ingress Controller is managed by way of cluster_config.ingress_controller option parameters in the MKE configuration file.
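
For orientation, the following fragment of the MKE configuration file sketches the relevant section, reusing parameters that appear later in this guide; the port values are examples only:

[cluster_config.ingress_controller]
    enabled = true

    [cluster_config.ingress_controller.ingress_extra_args]
        http_port = 80
        https_port = 443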

Configure NGINX Ingress Controller

Use the MKE web UI to enable and configure the NGINX Ingress Controller.

  1. Log in to the MKE web UI as an administrator.

  2. Using the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. In the Kubernetes tab, toggle the HTTP Ingress Controller for Kubernetes slider to the right.

  4. Under Configure proxy, specify the NGINX Ingress Controller service node ports through which external traffic can enter the cluster.

  5. Verify that the specified node ports are open.

    Note

    On production applications, it is typical to expose services using the load balancer that your cloud provider offers.

  6. Optional. Create a layer 7 load balancer in front of multiple nodes by toggling the External IP slider to the right and adding a list of external IP addresses to the NGINX Ingress Controller service.

  7. Specify how to scale load balancing by setting the number of replicas.

  8. Specify placement rules and load balancer configurations.

  9. Specify any additional NGINX configuration options you require. Refer to the NGINX documentation for the complete list of configuration options.

  10. Click Save.

Note

The NGINX Ingress Controller implements all Kubernetes Ingress resources with the IngressClassName of nginx-default, regardless of which namespace they are created in.

Note

The Ingress Controller implements any new Kubernetes Ingress resource that is created without IngressClassName.

Create a Kubernetes Ingress

A Kubernetes Ingress specifies a set of rules that route requests that match a particular <domain>/{path} to a given application. Ingresses are scoped to a single namespace and thus can route requests only to the applications inside that namespace.

  1. Log in to the MKE web UI.

  2. Navigate to Kubernetes > Ingresses and click Create.

  3. In the Create Ingress Object page, enter an ingress name and the following rule details:

    • Host (optional)

    • Path

    • Path type

    • Service name

    • Port number

    • Port name

  4. Generate the configuration file by clicking Generate YAML.

  5. Select a namespace using the Namespace dropdown.

  6. Click Create.

Example Kubernetes Ingress configuration file
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
  name: echo
spec:
  ingressClassName: nginx-default
  rules:
    - host: example.com
      http:
        paths:
          - path: /echo
            pathType: Exact
            backend:
              service:
                name: echo-service
                port:
                  number: 80

Configure a canary deployment

Canary deployments release applications incrementally to a subset of users, which allows for the gradual deployment of new application versions without any downtime.

NGINX Ingress Controller supports traffic-splitting policies based on header, cookie, and weight. Whereas header- and cookie-based policies serve to provide a new service version to a subset of users, weight-based policies serve to divert a percentage of traffic to a new service version.

NGINX Ingress Controller uses the following annotations to enable canary deployments:

  • nginx.ingress.kubernetes.io/canary-by-header

  • nginx.ingress.kubernetes.io/canary-by-header-value

  • nginx.ingress.kubernetes.io/canary-by-header-pattern

  • nginx.ingress.kubernetes.io/canary-by-cookie

  • nginx.ingress.kubernetes.io/canary-weight

Canary rules are evaluated in the following order:

  1. canary-by-header

  2. canary-by-cookie

  3. canary-weight

Canary deployments require that you create two ingresses: one for regular traffic and one for alternative traffic. Be aware that you can apply only one canary ingress.

You enable a particular traffic-splitting policy by setting the associated canary annotation to true in the Kubernetes Ingress resource, as in the following example:

nginx.ingress.kubernetes.io/canary-by-header: "true"

Refer to Ingress Canary Annotations in the NGINX Ingress Controller documentation for more information.

Example canary setup
  1. Deploy two services, echo-v1 and echo-v2, using either the MKE web UI or kubectl.

    To deploy echo-v1:
    apiVersion: v1
    kind: Service
    metadata:
      name: echo-v1
    spec:
      type: ClusterIP
      ports:
        - port: 80
          protocol: TCP
          name: http
      selector:
        app: echo
        version: v1
    
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: echo-v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: echo
          version: v1
      template:
        metadata:
          labels:
            app: echo
            version: v1
        spec:
          containers:
            - name: echo
              image: "docker.io/hashicorp/http-echo"
              args:
                - -listen=:80
                - --text="echo-v1"
              ports:
                - name: http
                  protocol: TCP
                  containerPort: 80
    
    To deploy echo-v2:
    apiVersion: v1
    kind: Service
    metadata:
      name: echo-v2
    spec:
      type: ClusterIP
      ports:
        - port: 80
          protocol: TCP
          name: http
      selector:
        app: echo
        version: v2
    
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: echo-v2
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: echo
          version: v2
      template:
        metadata:
          labels:
            app: echo
            version: v2
        spec:
          containers:
            - name: echo
              image: "docker.io/hashicorp/http-echo"
              args:
                - -listen=:80
                - --text="echo-v2"
              ports:
                - name: http
                  protocol: TCP
                  containerPort: 80
    
  2. Create an Ingress to route the traffic for the regular service:

    Example Ingress
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        ingress.kubernetes.io/rewrite-target: /
      name: ingress-echo
    spec:
      ingressClassName: nginx-default
      rules:
        - host: canary.example.com
          http:
            paths:
              - path: /echo
                pathType: Exact
                backend:
                  service:
                    name: echo-v1
                    port:
                      number: 80
    
  3. Verify that traffic is successfully routed:

    curl -H "Host: canary.example.com" http://<IP_ADDRESS>:<NODE_PORT>/echo
    

    Expected output:

    echo-v1
    
Canary deployment use cases

To provide a subset of users with a new service version using a request header:

  1. Create a canary ingress that routes traffic to the echo-v2 service using the request header x-region: us-east:

    Header-based policy
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        ingress.kubernetes.io/rewrite-target: /
        nginx.ingress.kubernetes.io/canary: "true"
        nginx.ingress.kubernetes.io/canary-by-header: "x-region"
        nginx.ingress.kubernetes.io/canary-by-header-value: "us-east"
      name: ingress-echo-canary
    spec:
      ingressClassName: nginx-default
      rules:
        - host: canary.example.com
          http:
            paths:
              - path: /echo
                pathType: Exact
                backend:
                  service:
                    name: echo-v2
                    port:
                      number: 80
    
  2. Verify that traffic is properly routed:

    curl -H "Host: canary.example.com" -H "x-region: us-east" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo
    curl -H "Host: canary.example.com" -H "x-region: us-west" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo
    curl -H "Host: canary.example.com" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo
    

    Expected output:

    echo-v2
    echo-v1
    echo-v1
    

To provide a subset of users with a new service version using a cookie:

  1. Create a canary ingress that routes traffic to the echo-v2 service using a cookie:

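    A minimal sketch of such an ingress, modeled on the header-based example above and assuming the cookie name my_cookie that appears in the verification step:

    Cookie-based policy
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        ingress.kubernetes.io/rewrite-target: /
        nginx.ingress.kubernetes.io/canary: "true"
        nginx.ingress.kubernetes.io/canary-by-cookie: "my_cookie"
      name: ingress-echo-canary
    spec:
      ingressClassName: nginx-default
      rules:
        - host: canary.example.com
          http:
            paths:
              - path: /echo
                pathType: Exact
                backend:
                  service:
                    name: echo-v2
                    port:
                      number: 80
    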
  2. Verify that traffic is properly routed:

    curl -s -H "Host: canary.example.com" --cookie "my_cookie=always" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo
    curl -s -H "Host: canary.example.com" --cookie "other_cookie=always" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo
    curl -s -H "Host: canary.example.com" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo
    

    Expected output:

    echo-v2
    echo-v1
    echo-v1
    

To route a segment of traffic to a new service version:

  1. Create a canary ingress that routes 20% of traffic to the echo-v2 service using the nginx.ingress.kubernetes.io/canary-weight annotation:

    Weight-based policy
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        ingress.kubernetes.io/rewrite-target: /
        nginx.ingress.kubernetes.io/canary: "true"
        nginx.ingress.kubernetes.io/canary-weight: "20"
      name: ingress-echo-canary
    spec:
      ingressClassName: nginx-default
      rules:
        - host: canary.example.com
          http:
            paths:
              - path: /echo
                pathType: Exact
                backend:
                  service:
                    name: echo-v2
                    port:
                      number: 80
    
  2. Verify that traffic is properly routed:

    for i in {1..10}; do curl -H "Host: canary.example.com" \
    http://<IP_ADDRESS>:<NODE_PORT>/echo; done
    

    Example output:

    "echo-v1"
    "echo-v2"
    "echo-v2"
    "echo-v1"
    "echo-v1"
    "echo-v1"
    "echo-v1"
    "echo-v1"
    "echo-v1"
    "echo-v1"
    
Configure a sticky session

Sticky sessions enable users who participate in split testing to consistently see a particular feature. Adding sticky sessions to the initial request forces NGINX Ingress Controller to route follow-up requests to the same Pod.

  1. Enable the sticky session in the Kubernetes Ingress resource:

    nginx.ingress.kubernetes.io/affinity: "cookie"
    
  2. Specify the name of the required cookie (default: INGRESSCOOKIE):

    nginx.ingress.kubernetes.io/session-cookie-name: "<cookie-name>"
    
  3. Specify the time before the cookie expires (in seconds):

    nginx.ingress.kubernetes.io/session-cookie-max-age: "<cookie-duration>"
    

The following is an example of a Kubernetes Ingress configuration file with a sticky session enabled:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sticky-session-test
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"

spec:
  rules:
  - host: stickyingress.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: http-svc
            port:
              number: 80

Note

NGINX Ingress Controller only supports cookie-based sticky sessions.

See also

Sticky sessions in the NGINX Ingress Controller documentation.

TLS termination

By default, NGINX Ingress Controller generates default TLS certificates for TLS termination. You can, though, generate and configure your own TLS certificates for TLS termination purposes.

Note

Prior to setting up TLS termination, verify that you have correctly configured kubectl.

  1. Generate a self-signed certificate and a private key for the TLS connection.

    Note

    The Common Name (CN) in the certificate must match the host name of the server.

    mkdir -p example_certs
    
    openssl req \
    -new \
    -newkey rsa:2048 \
    -x509 \
    -sha256 \
    -days 365 \
    -nodes \
    -subj "/O=echo/CN=echo.example.com" \
    -keyout example_certs/echo.example.com.key \
    -out example_certs/echo.example.com.crt
    
  2. Create a Kubernetes Secret that contains the generated certificate:

    kubectl create secret tls echo-tls-secret --key example_certs/echo.example.com.key --cert example_certs/echo.example.com.crt
    
  3. Deploy a sample application:

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: echo-svc
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: echo-svc
      template:
        metadata:
          labels:
            app: echo-svc
        spec:
          containers:
          - name: echo-svc
            image: registry.k8s.io/e2e-test-images/echoserver:2.3
            ports:
            - containerPort: 8080
            env:
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
    
    ---
    
    apiVersion: v1
    kind: Service
    metadata:
      name: echo-svc
      labels:
        app: echo-svc
    spec:
      ports:
      - port: 80
        targetPort: 8080
        protocol: TCP
        name: http
      selector:
        app: echo-svc
    EOF
    
  4. Deploy the Ingress.

    Create an Ingress for the sample application. In the tls section, reference the Kubernetes Secret you created and specify the host for which the TLS connection will terminate:

    cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: echo-test
    spec:
      tls:
        - hosts:
          - echo.example.com
          secretName: echo-tls-secret
      ingressClassName: nginx-default
      rules:
        - host: echo.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: echo-svc
                    port:
                      number: 80
    EOF
    
  5. Test the TLS termination by connecting to the application using HTTPS.

    export MKE_HOST=<mke host>
    export NODE_PORT=33001
    
    curl -k -v --resolve "echo.example.com:$NODE_PORT:$MKE_HOST" \
    "https://echo.example.com:$NODE_PORT"
    
    Example output:
    
    * Added echo.example.com:33001:54.218.145.62 to DNS cache
    * Hostname echo.example.com was found in DNS cache
    *   Trying 54.218.145.62:33001...
    * Connected to echo.example.com (54.218.145.62) port 33001 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *  CAfile: /etc/ssl/cert.pem
    *  CApath: none
    * (304) (OUT), TLS handshake, Client hello (1):
    * (304) (IN), TLS handshake, Server hello (2):
    * (304) (IN), TLS handshake, Unknown (8):
    * (304) (IN), TLS handshake, Certificate (11):
    * (304) (IN), TLS handshake, CERT verify (15):
    * (304) (IN), TLS handshake, Finished (20):
    * (304) (OUT), TLS handshake, Finished (20):
    * SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
    * ALPN, server accepted to use h2
    * Server certificate:
    *  subject: O=echo; CN=echo.example.com
    *  start date: Nov 29 19:58:22 2022 GMT
    *  expire date: Nov 29 19:58:22 2023 GMT
    *  issuer: O=echo; CN=echo.example.com
    *  SSL certificate verify result: self signed certificate (18), continuing anyway.
    
    ...
    

    The output shows that the TLS connection is being negotiated using the provided certificate.

  6. Clean up Kubernetes resources that are no longer needed:

    kubectl delete ingress echo-test
    kubectl delete service echo-svc
    kubectl delete deployment echo-svc
    kubectl delete secret echo-tls-secret
    
TLS passthrough

Available since MKE 3.7.0

TLS passthrough is the action of passing data through a load balancer to a server without decrypting it. Usually, the decryption or TLS termination happens at the load balancer and data is passed along to a web server as plain HTTP. TLS passthrough, however, keeps the data encrypted as it travels through the load balancer, with the web server performing the decryption upon receipt. With TLS passthrough enabled in NGINX Ingress Controller, the request will be forwarded to the backend service without being decrypted.

Note

Prior to setting up TLS passthrough, verify that you have correctly configured kubectl.

  1. Enable TLS passthrough using either the MKE web UI or the MKE configuration file.

    Note

    You must have MKE admin access to enable TLS passthrough.

    • To enable TLS passthrough with the MKE web UI, navigate to <username> > Admin Settings > Ingress, scroll down to Advanced Settings and toggle the Enable TLS-Passthrough control on.

    • To enable TLS passthrough using the MKE configuration file, set the ingress_extra_args.enable_ssl_passthrough file parameter under the cluster_config.ingress_controller option to true.

      [cluster_config.ingress_controller.ingress_extra_args]
           http_port = 80
           https_port = 443
           enable_ssl_passthrough = true
           default_ssl_certificate = ""
      
  2. Generate a self-signed certificate and a private key for the TLS connection.

    Note

    The Common Name (CN) in the certificate must match the host name of the server.

    mkdir -p example_certs
    
    openssl req \
      -new \
      -newkey rsa:2048 \
      -x509 \
      -sha256 \
      -days 365 \
      -nodes \
      -subj "/O=Example Inc./CN=nginx.example.com" \
      -keyout example_certs/nginx.example.com.key \
      -out example_certs/nginx.example.com.crt
    
  3. Create Kubernetes Secrets for the generated certificate:

    kubectl create secret tls nginx-server-certs \
      --key example_certs/nginx.example.com.key \
      --cert example_certs/nginx.example.com.crt
    
  4. Deploy a web server.

    1. Create a Kubernetes ConfigMap to retain the configuration of the NGINX server:

      cat <<EOF | kubectl apply -f -
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: nginx-conf
      data:
        nginx.conf: |
          events {
          }
          http {
            server {
              listen 443 ssl;
      
              root /usr/share/nginx/html;
              index index.html;
      
              server_name nginx.example.com;
              ssl_certificate /etc/nginx-server-certs/tls.crt;
              ssl_certificate_key /etc/nginx-server-certs/tls.key;
            }
          }
      EOF
      
    2. Deploy the NGINX server:

      cat <<EOF | kubectl apply -f -
      apiVersion: v1
      kind: Service
      metadata:
        name: my-nginx
        labels:
          app: my-nginx
      spec:
        ports:
        - port: 443
          protocol: TCP
        selector:
          app: my-nginx
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-nginx
      spec:
        selector:
          matchLabels:
            app: my-nginx
        replicas: 1
        template:
          metadata:
            labels:
              app: my-nginx
          spec:
            containers:
            - name: my-nginx
              image: nginx
              ports:
              - containerPort: 443
              volumeMounts:
              - name: nginx-config
                mountPath: /etc/nginx
                readOnly: true
              - name: nginx-server-certs
                mountPath: /etc/nginx-server-certs
                readOnly: true
            volumes:
            - name: nginx-config
              configMap:
                name: nginx-conf
            - name: nginx-server-certs
              secret:
                secretName: nginx-server-certs
      EOF
      
  5. Configure Ingress.

    Create a Kubernetes Ingress with the annotation: nginx.ingress.kubernetes.io/ssl-passthrough: "true":

    cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        nginx.ingress.kubernetes.io/ssl-passthrough: "true"
      name: nginx-test
    spec:
      ingressClassName: nginx-default
      rules:
        - host: nginx.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: my-nginx
                    port:
                      number: 443
    EOF
    
  6. Test the TLS passthrough by connecting to the application using HTTPS.

    In the example below, the TLS connection is negotiated using the certificate provided for the host nginx.example.com, which confirms that the TLS connection is passed through to the deployed NGINX server.

    export MKE_HOST=<mke host>
    export NODE_PORT=33001
    
    curl -k -v \
    --resolve "nginx.example.com:$NODE_PORT:$MKE_HOST" \
    "https://nginx.example.com:$NODE_PORT"
    

    Example output:

    * Added nginx.example.com:33001:54.218.145.62 to DNS cache
    * Hostname nginx.example.com was found in DNS cache
    *   Trying 54.218.145.62:33001...
    * Connected to nginx.example.com (54.218.145.62) port 33001 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *  CAfile: /etc/ssl/cert.pem
    *  CApath: none
    * (304) (OUT), TLS handshake, Client hello (1):
    * (304) (IN), TLS handshake, Server hello (2):
    * TLSv1.2 (IN), TLS handshake, Certificate (11):
    * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
    * TLSv1.2 (IN), TLS handshake, Server finished (14):
    * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
    * TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
    * TLSv1.2 (OUT), TLS handshake, Finished (20):
    * TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
    * TLSv1.2 (IN), TLS handshake, Finished (20):
    * SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
    * ALPN, server accepted to use http/1.1
    * Server certificate:
    *  subject: O=Example Inc.; CN=nginx.example.com
    *  start date: Nov 29 18:52:39 2022 GMT
    *  expire date: Nov 26 18:52:39 2032 GMT
    *  issuer: O=Example Inc.; CN=nginx.example.com
    *  SSL certificate verify result: self signed certificate (18), continuing anyway.
    
    ...
    

    The output shows that the TLS connection is being negotiated with the certificate provided for host nginx.example.com, thus confirming that the TLS connection has passed to the deployed NGINX server.

  7. Clean up Kubernetes resources that are no longer needed:

    kubectl delete deploy my-nginx
    kubectl delete service my-nginx
    kubectl delete ingress nginx-test
    kubectl delete configmap nginx-conf
    kubectl delete secrets nginx-server-certs
    
Expose TCP and UDP services

Available since MKE 3.7.0

Note

  • You must have MKE admin access to configure TCP and UDP services.

  • Prior to setting up TCP and UDP services, verify that you have correctly configured kubectl.

Kubernetes Ingress only supports services over HTTP and HTTPS. Using NGINX Ingress Controller, though, you can work around this limitation when you need to expose TCP and UDP services.

The following example procedure exposes a TCP service on port 9000 and a UDP service on port 5005:

  1. Deploy a sample TCP service listening on port 9000, to echo back any text it receives with the prefix hello.

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Service
    metadata:
      name: tcp-echo
      labels:
        app: tcp-echo
        service: tcp-echo
    spec:
      selector:
        app: tcp-echo
      ports:
      - name: tcp
        port: 9000
    
    ---
    
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: tcp-echo
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: tcp-echo
      template:
        metadata:
          labels:
            app: tcp-echo
        spec:
          containers:
          - name: tcp-echo
            image: docker.io/istio/tcp-echo-server:1.2
            imagePullPolicy: IfNotPresent
            args: [ "9000", "hello" ]
            ports:
            - containerPort: 9000
    EOF
    
  2. Deploy a sample UDP service listening on port 5005.

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: udp-listener
    
    ---
    
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: udp-listener
      labels:
        app: udp-listener
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: udp-listener
      template:
        metadata:
          labels:
            app: udp-listener
        spec:
          containers:
          - name: udp-listener
            image: mendhak/udp-listener
            ports:
            - containerPort: 5005
              protocol: UDP
              name: udp
    
    ---
    
    apiVersion: v1
    kind: Service
    metadata:
      name: udp-listener
    spec:
      ports:
      - port: 5005
        targetPort: 5005
        protocol: UDP
        name: udp
      selector:
        app: udp-listener
    EOF
    
  3. Verify that the two services are running correctly.

    1. Run kubectl get deploy tcp-echo udp-listener to verify that the deployment was created.

      Example output:

      NAME           READY   UP-TO-DATE   AVAILABLE   AGE
      tcp-echo       1/1     1            1           39s
      udp-listener   1/1     1            1           31s
      
    2. Run kubectl get service tcp-echo udp-listener to list the services created.

      Example output:

      NAME           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
      tcp-echo       ClusterIP   10.96.172.90   <none>        9000/TCP   46s
      udp-listener   ClusterIP   10.96.19.229   <none>        5005/UDP   37s
      
  4. Configure Ingress Controller to expose the TCP and UDP services.

    1. Verify that the enabled parameter for the cluster_config.ingress_controller option in the MKE configuration file is set to true.

    2. Modify the MKE configuration file to expose the newly created TCP and UDP services, as shown below:

      [cluster_config.ingress_controller]
          enabled = true
      
      ...
      
      [[cluster_config.ingress_controller.ingress_exposed_ports]]
          name = "proxied-tcp-9000"
          port = 9000
          target_port = 9000
          node_port = 33011
          protocol = "TCP"
      
      [[cluster_config.ingress_controller.ingress_exposed_ports]]
          name = "proxied-udp-5005"
          port = 5005
          target_port = 5005
          node_port = 33012
          protocol = "UDP"
      
      ...
      
      [cluster_config.ingress_controller.ingress_tcp_services]
        9000 = "default/tcp-echo:9000"
      [cluster_config.ingress_controller.ingress_udp_services]
        5005 = "default/udp-listener:5005"
      
      ...
      
    3. Upload the modified MKE configuration file to complete the operation. For more information, refer to Use an MKE configuration file.

    Note

    To configure NGINX Ingress Controller using the MKE web UI, in the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  5. Test the TCP service.

    1. Send the text world. The service should respond with hello world:

      export MKE_HOST=<mke host>
      
      echo "world" | netcat $MKE_HOST 33011
      hello world
      
    2. Check the tcp-echo logs:

      kubectl get pods --selector=app=tcp-echo
      

      Example output:

      NAME                        READY   STATUS    RESTARTS   AGE
      tcp-echo-5cf4d68d76-p2c9b   1/1     Running   0          6m6s
      
      kubectl logs tcp-echo-5cf4d68d76-p2c9b
      

      Example output:

      listening on [::]:9000, prefix: hello
      request: world
      response: hello world
      
  6. Test the UDP service.

    1. Send the text UDP Datagram Message:

      echo "UDP Datagram Message" | netcat -v -u $MKE_HOST 33012
      
    2. Check the udp-listener logs to verify receipt:

      kubectl get pods --selector=app=udp-listener
      

      Example output:

      NAME                            READY   STATUS    RESTARTS   AGE
      udp-listener-59cdc755d7-qr2bl   1/1     Running   0          93m
      
      kubectl logs udp-listener-59cdc755d7-qr2bl
      

      Example output:

      Listening on UDP port 5005
      UDP Datagram Message
      
      UDP Datagram Message
      
  7. Remove the Kubernetes resources, as they are no longer needed.

    kubectl delete service tcp-echo
    kubectl delete deployment tcp-echo
    kubectl delete service udp-listener
    kubectl delete deployment udp-listener
    

Important

If the services are not reachable:

  • Verify that ports 33011 and 33012 are open on both the node and on the firewall of the cloud platform.

  • Review the logs of the sample applications for error messages:

    kubectl logs -l app=tcp-echo
    kubectl logs -l app=udp-listener
    
  • Review the NGINX Ingress Controller logs for error messages:

    kubectl logs -l app.kubernetes.io/name=ingress-nginx -n ingress-nginx
    

Deploy MetalLB

Available since MKE 3.7.0

MetalLB is a load balancer implementation for bare metal Kubernetes clusters. It monitors for the services with the type LoadBalancer and assigns them an IP address from IP address pools that are configured in the MKE system.

Note

For information on how to install and uninstall MetalLB, refer to MetalLB load-balancer for Kubernetes.

MetalLB uses two features to provide service: Address allocation and External announcement.

Address allocation

When a service of type LoadBalancer is created in a bare metal cluster, the external IP of the service remains in the <pending> state, because the Kubernetes implementation of network load balancers is only available on known IaaS platforms. MetalLB fills this gap by assigning the service an IP address from a custom address pool resource, which consists of a list of IP addresses. Administrators can add multiple pools to the cluster, which gives you control over the IP addresses that MetalLB can assign to load-balancer services.

External announcement

After assigning an IP address to a service, MetalLB must make the network aware of it so that entities external to the cluster can learn how to reach the IP. To achieve this, MetalLB uses standard protocols, depending on whether ARP or BGP mode is in use.

Note

When you install and configure MetalLB in MKE, support is restricted to Layer 2 (ARP) mode.

Create LoadBalancer services

Once MetalLB is installed, you can create services of type LoadBalancer, and MetalLB assigns each of them an IP address from the IP address pools configured in the system.

In the example that follows, you create an NGINX Deployment and a LoadBalancer service:

  1. Create a YAML file called nginx.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1
            ports:
            - name: http
              containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: nginx
      type: LoadBalancer
    
  2. Create the deployment and service:

    kubectl apply -f nginx.yaml
    

    Example output:

    deployment.apps/nginx created
    service/nginx created
    
  3. Verify the IP assigned to the NGINX service:

    kubectl get svc
    

    Example output:

    NAME         TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)        AGE
    kubernetes   ClusterIP      10.96.0.1      <none>         443/TCP        74m
    nginx        LoadBalancer   10.96.42.228   192.168.10.0   80:34072/TCP   20s
    
Request from a specific IP pool

When multiple IP address pools are configured in the system, you can request assignment from a specific pool.

Caution

Do not use the private IP of the master node to configure IP address pools, as the admin functions of the MKE web UI will become inaccessible as a result.

  1. Add the metallb.universe.tf/address-pool annotation to the service with the name of the IP address pool as the annotation value:

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      annotations:
        metallb.universe.tf/address-pool: <IP-address-pool-name>
    spec:
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: nginx
      type: LoadBalancer
    

    On creation, the service is assigned an external IP address from the indicated IP address pool.

  2. Verify the IP address assignment:

    kubectl get svc
    

    Example output:

    NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
    kubernetes   ClusterIP      10.96.0.1      <none>        443/TCP        76m
    nginx        LoadBalancer   10.96.183.55   52.205.10.0   80:35328/TCP   9s
    

See also

MetalLB usage

Add IP address pools

MKE allows you to add IP address pools once MetalLB has been deployed in the MKE cluster. To do this, you use an MKE configuration file.

  1. Verify that the enabled parameter setting for the cluster_config.metallb_config configuration option is set to true.

  2. Update the cluster_config.metallb_config configuration option with the details for the new IP address pools.

    [cluster_config.metallb_config]
      enabled = true

    [[cluster_config.metallb_config.metallb_ip_addr_pool]]
          name = "<IP-address-pool-name-1>"
          external_ip = ["192.168.10.0/24", "192.168.1.0/24"]

    [[cluster_config.metallb_config.metallb_ip_addr_pool]]
          name = "<IP-address-pool-name-2>"
          external_ip = ["52.205.10.1/24"]

    [[cluster_config.metallb_config.metallb_ip_addr_pool]]
          name = "<IP-address-pool-name-3>"
          external_ip = ["54.205.10.0/24"]
    

    Caution

    • Make sure to provide correct IP addresses in CIDR format.

    • MetalLB pool name values must adhere to the RFC 1123 subdomain naming format:

      A lowercase RFC 1123 subdomain must consist of lowercase alphanumeric characters, - or ., and must start and end with an alphanumeric character. For example, example.com. The regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'.

    Example IP address pool settings:

    [cluster_config.metallb_config]
        enabled = true
        [[cluster_config.metallb_config.metallb_ip_addr_pool]]
            name = "example1"
            external_ip = ["192.168.10.0/24", "192.168.1.0/24"]
    
        [[cluster_config.metallb_config.metallb_ip_addr_pool]]
            name = "example2"
            external_ip = ["52.205.10.1/24"]
    

    When multiple address pools are configured, MKE advertises all of the pools by default. To request assignment from a specific pool, add the metallb.universe.tf/address-pool annotation to the service, with the name of the address pool as the annotation value. If no such annotation is added, MetalLB assigns an IP from one of the configured pools.

    You can configure both public and private IP addresses, based on your environment. MKE places no limit on the number of address pools you can define and is agnostic as to the address type.

  3. Upload the modified MKE configuration file and allow at least 5 minutes for MKE to propagate the configuration changes throughout the cluster.
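
    A minimal sketch of one way to script this step, assuming that your MKE version exposes the api/ucp/config-toml endpoint (an assumption) and reusing the $AUTHTOKEN and $UCP_HOSTNAME conventions from the API examples later in this guide:

    # Download the current MKE configuration file (endpoint assumed).
    curl -sk -H "Authorization: Bearer $AUTHTOKEN" \
      https://$UCP_HOSTNAME/api/ucp/config-toml > mke-config.toml

    # Edit the [cluster_config.metallb_config] section, then upload the
    # modified configuration back to MKE.
    curl -sk -X PUT -H "Authorization: Bearer $AUTHTOKEN" \
      --upload-file mke-config.toml https://$UCP_HOSTNAME/api/ucp/config-toml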

Note

Following the addition of your IP address pools, Mirantis recommends that you verify your MetalLB deployment.

Modify IP address pools

You must use the kubectl Kubernetes command line tool to modify existing IP address pools in MKE.

  1. Obtain all of the IP address pools that are configured to MKE:

    kubectl get IPAddressPools -n metallb-system
    

    Example output:

    NAME                       AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
    <IP-address-pool-name-1>   true          false             ["192.168.10.0/24","192.168.1.0/24"]
    <IP-address-pool-name-2>   true          false             ["52.205.10.1/24"]
    <IP-address-pool-name-3>   true          false             ["54.205.10.1/24"]
    
  2. Modify the target IP address pool entries in the cluster_config.metallb_config configuration option of the MKE configuration file.

  3. Run the following kubectl command to delete the existing target IP address pools, so that they are recreated with the modified configuration:

    kubectl delete IPAddressPool <IP-address-pool-name-3> -n metallb-system
    
  4. Restart the MetalLB controller to ensure that any existing services that use the modified IP address pools receive new IPs:

    kubectl rollout restart deployment controller -n metallb-system
    

    Example output:

    deployment.apps/controller restarted
    
Delete IP address pools

You must use the kubectl Kubernetes command line tool to delete existing IP address pools from MKE.

  1. Obtain all of the IP address pools that are configured to MKE:

    kubectl get IPAddressPools -n metallb-system
    

    Example output:

    NAME                       AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
    <IP-address-pool-name-1>   true          false             ["192.168.10.0/24","192.168.1.0/24"]
    <IP-address-pool-name-2>   true          false             ["52.205.10.1/24"]
    <IP-address-pool-name-3>   true          false             ["54.205.10.1/24"]
    
  2. Remove the target IP address pool entries from the cluster_config.metallb_config configuration option of the MKE configuration file.

  3. Run the following kubectl command to delete the target IP address pools:

    kubectl delete IPAddressPool <IP-address-pool-name-3> -n metallb-system
    
  4. Restart the MetalLB controller to ensure that any existing services set to use the deleted IP address pools receive new IPs:

    kubectl rollout restart deployment controller -n metallb-system
    

    Example output:

    deployment.apps/controller restarted
    

    Note

    Any service that uses the metallb.universe.tf/address-pool annotation with a value equal to the name of a deleted pool will remain in <pending> state.

See also

MetalLB usage

Use Multus CNI to create multi-homed Pods

Available since MKE 3.7.0

By default in Kubernetes, a Pod is connected to only a single network interface, the default network. Using Multus CNI, however, you can create a multi-homed Pod that has multiple network interfaces.

The following example procedure attaches two network interfaces to a Pod, net1 and net2.

  1. Configure kubectl for your MKE cluster.

  2. Enable Multus CNI in the MKE cluster when you install MKE, using the --multus-cni flag with the MKE install CLI command.

  3. Install the additional CNI plugins required for secondary networks.

    Run the following command on all nodes in the cluster:

    CNI_PLUGIN_VERSION=v1.3.0
    CNI_ARCH=amd64
    curl -sL \
      https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGIN_VERSION}/cni-plugins-linux-${CNI_ARCH}-${CNI_PLUGIN_VERSION}.tgz \
      | sudo tar xvz -C /opt/cni/bin/
    
  4. Determine the primary network interface for the node. You will need this information to create the NetworkAttachmentDefinitions file.

    Note

    The name of the primary interface can vary with the underlying network adapter.

    Run the following command on the nodes to locate the default network interface. The Iface column in the line with destination default indicates which interface to use.

    route
    

    Note

    eth0 is the primary network interface in most Linux distributions.

    Example output:

    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref  Use Iface
    default         ip-172-31-32-1. 0.0.0.0         UG    100    0    0   ens3
    172.17.0.0      0.0.0.0         255.255.0.0     U     0      0    0   docker0
    172.18.0.0      0.0.0.0         255.255.0.0     U     0      0    0   docker_gwbridge
    172.19.0.0      0.0.0.0         255.255.0.0     U     0      0    0   br-05e6200d101b
    172.31.32.0     0.0.0.0         255.255.240.0   U     0      0    0   ens3
    ip-172-31-32-1. 0.0.0.0         255.255.255.255 UH    100    0    0   ens3
    192.168.219.0   0.0.0.0         255.255.255.192 U     0      0    0   *
    

    Important

    Run all ensuing commands on the machine upon which kubectl is configured.

  5. Create a NetworkAttachmentDefinition to specify the additional network, setting master to the primary network interface that you determined in the previous step:

    cat <<EOF | kubectl create -f -
    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: macvlan-conf
    spec:
      config: '{
            "cniVersion": "0.3.0",
            "type": "macvlan",
            "master": "eth0",
            "mode": "bridge",
            "ipam": {
              "type": "host-local",
              "subnet": "192.168.1.0/24",
              "rangeStart": "192.168.1.200",
              "rangeEnd": "192.168.1.216",
              "routes": [
                 { "dst": "0.0.0.0/0" }
              ],
              "gateway": "192.168.1.1"
            }
         }'
    EOF
    
  6. Create a multi-homed Pod:

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: samplepod
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-conf
    spec:
      containers:
      - name: samplepod
        command: ["/bin/ash", "-c", "trap : TERM INT; sleep infinity & wait"]
        image: alpine
    EOF
    
  7. Check the network interfaces of the Pod:

    kubectl exec -it samplepod -- ip a
    

    Example output:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
       link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
       inet 127.0.0.1/8 scope host lo
          valid_lft forever preferred_lft forever
    3: eth0@if490: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
       link/ether 9e:02:e2:cf:4f:7e brd ff:ff:ff:ff:ff:ff
       inet 192.168.58.70/32 scope global eth0
          valid_lft forever preferred_lft forever
    4: net1@if492: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
       link/ether 0e:b0:45:13:94:f7 brd ff:ff:ff:ff:ff:ff
       inet 192.168.1.200/24 brd 192.168.1.255 scope global net1
          valid_lft forever preferred_lft forever
    

    The interfaces are as follows:

      • lo - A loopback interface.

      • eth0 - The default network interface of the Pod.

      • net1 - The new interface created with the macvlan configuration.

  8. Add multiple additional interfaces:

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: samplepod-multi
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-conf,macvlan-conf
    spec:
      containers:
      - name: samplepod
        command: ["/bin/ash", "-c", "trap : TERM INT; sleep infinity & wait"]
        image: alpine
    EOF
    

    The operation results in the addition of the net1 and net2 network interfaces to the Pod.

Monitor an MKE cluster

You can monitor the health of your MKE cluster using the MKE web UI, the CLI, and the _ping endpoint. This topic describes how to monitor your cluster health, vulnerability counts, and disk usage.

For those running MSR in addition to MKE, MKE displays image vulnerability scanning count data obtained from MSR for containers, Swarm services, Pods, and images. This feature requires that you run MSR 2.6.x or later and enable MKE single sign-on.

The MKE web UI only displays the disk usage metrics, including space availability, for the /var/lib/docker part of the filesystem. Monitoring the total space available on each filesystem of an MKE worker or manager node requires that you deploy a third-party operating system-monitoring solution.
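
For example, you can check the space available to /var/lib/docker directly from a node's operating system with standard Linux tooling; this is a simple sketch rather than an MKE feature:

# Report total, used, and available space for the filesystem that holds /var/lib/docker.
df -h /var/lib/docker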

Monitor with the MKE web UI

  1. Log in to the MKE web UI.

  2. From the left-side navigation panel, navigate to the Dashboard page.

    Cluster health-related warnings that require your immediate attention display on the cluster dashboard. MKE administrators are likely to see more such warnings than regular users.

  3. Navigate to Shared Resources > Nodes to inspect the health of the nodes that MKE manages. To read the node health status, hover over the colored indicator.

  4. Click a particular node to learn more about its health.

  5. Click on the vertical ellipsis in the top right corner and select Tasks.

  6. From the left-side navigation panel, click Agent Logs to examine log entries.

Monitor with the CLI

  1. Download and configure the client bundle.

  2. Examine the health of the nodes in your cluster:

    docker node ls
    

    Status messages that begin with [Pending] indicate a transient state that is expected to resolve itself and return to a healthy state.

Automate the monitoring process

Automate the MKE cluster monitoring process by using the https://<mke-manager-url>/_ping endpoint to evaluate the health of a single manager node. The MKE manager evaluates whether its internal components are functioning properly, and returns one of the following HTTP codes:

  • 200 - all components are healthy

  • 500 - one or more components are not healthy

Using an administrator client certificate as a TLS client certificate for the _ping endpoint returns a detailed error message if any component is unhealthy.

Do not access the _ping endpoint through a load balancer, as this method does not allow you to determine which manager node is unhealthy. Instead, connect directly to the URL of a manager node. Use GET rather than HEAD to ping the endpoint, as HEAD returns a 404 error code.
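
For example, a monitoring script might poll the endpoint as follows; the manager URL is a placeholder, and the client-certificate variant reuses the $DOCKER_CERT_PATH client bundle convention used elsewhere in this guide:

# Print only the HTTP status code: 200 if all components are healthy, 500 otherwise.
curl -sk -o /dev/null -w "%{http_code}\n" https://<mke-manager-url>/_ping

# With an admin client certificate, an unhealthy component returns a detailed error message.
curl -sk \
  --cert ${DOCKER_CERT_PATH}/cert.pem \
  --key ${DOCKER_CERT_PATH}/key.pem \
  https://<mke-manager-url>/_ping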

Troubleshoot an MKE cluster

Troubleshooting is a necessary part of cluster maintenance. This section provides you with the tools you need to diagnose and resolve the problems you are likely to encounter in the course of operating your cluster.

Troubleshoot MKE node states

Nodes enter a variety of states in the course of their lifecycle, including transitional states such as when a node joins a cluster and when a node is promoted or demoted. MKE reports the steps of the transition process as they occur in both the ucp-controller logs and in the MKE web UI.


To view transitional node states in the MKE web UI:

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Shared Resources > Nodes. The transitional node state displays in the DETAILS column for each node.

  3. Optional. Click the required node. The transitional node state displays in the Overview tab under Cluster Message.

The following table includes all the node states as they are reported by MKE, along with their description and expected duration:

Message

Description

Expected duration

Completing node registration

The node is undergoing the registration process and does not yet appear in the KV node inventory. This is expected to occur when a node first joins the MKE swarm.

5 - 30 seconds

heartbeat failure

The node has not contacted any swarm managers in the last 10 seconds. Verify the swarm state using docker info on the node.

  • inactive indicates that the node has been removed from the swarm with docker swarm leave.

  • pending indicates dockerd has been attempting to contact a manager since dockerd started on the node. Confirm that the network security policy allows TCP port 2377 from the node to the managers.

  • error indicates an error prevented Swarm from starting on the node. Verify the docker daemon logs on the node.

Until resolved

Node is being reconfigured

The ucp-reconcile container is converging the current state of the node to the desired state. Depending on which state the node is currently in, this process can involve issuing certificates, pulling missing images, or starting containers.

1 - 60 seconds

Reconfiguration pending

The node is expected to be a manager but the ucp-reconcile container has not yet been started.

1 - 10 seconds

The ucp-agent task is in <state>

The ucp-agent task on the node is not yet in a running state. This message is expected when the configuration has been updated or when a node first joins the MKE cluster. This step may take longer than expected if the MKE images need to be pulled from Docker Hub on the affected node.

1 - 10 seconds

Unable to determine node state

The ucp-reconcile container on the target node has just begun running and its state is not yet evident.

1 - 10 seconds

Unhealthy MKE Controller: node is unreachable

Other manager nodes in the cluster have not received a heartbeat message from the affected node within a predetermined timeout period. This usually indicates that there is either a temporary or permanent interruption in the network link to that manager node. Ensure that the underlying networking infrastructure is operational, and contact support if the symptom persists.

Until resolved

Unhealthy MKE Controller: unable to reach controller

The controller that the node is currently communicating with is not reachable within a predetermined timeout. Refresh the node listing to determine whether the symptom persists. The symptom appearing intermittently can indicate latency spikes between manager nodes, which can lead to temporary loss in the availability of MKE. Ensure the underlying networking infrastructure is operational and contact support if the symptom persists.

Until resolved

Unhealthy MKE Controller: Docker Swarm Cluster: Local node <ip> has status Pending

The MCR Engine ID is not unique in the swarm. When a node first joins the cluster, it is added to the node inventory and discovered as Pending by Swarm. MCR is considered validated if a ucp-swarm-manager container can connect to MCR through TLS and its Engine ID is unique in the swarm. If you see this issue repeatedly, make sure that MCR does not have duplicate IDs. Use docker info to view the Engine ID. To refresh the ID, remove the /etc/docker/key.json file and restart the daemon.

Until resolved

Troubleshoot using logs

You can troubleshoot your MKE cluster by using the MKE web UI, the CLI, and the support bundle to review the logs of the individual MKE components. You must have administrator privileges to view information about MKE system containers.

Review logs using the MKE web UI
  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to Shared Resources > Containers. By default, the system containers are hidden.

  3. Click the slider icon and select Show system resources.

  4. Click the required container to view details, which include configurations and logs.

Review logs using the CLI
  1. Download and configure the client bundle.

    Using the Docker CLI requires that you authenticate using client certificates. Client certificate bundles generated for users without administrator privileges do not permit viewing MKE system container logs.

  2. Review the logs of MKE system containers. Use the -a flag to display system containers, as they are not displayed by default.

    docker ps -a
    

    Example output:

    CONTAINER ID        IMAGE                                     COMMAND                  CREATED             STATUS                     PORTS                                                                             NAMES
    8b77cfa87889        mirantis/ucp-agent:latest             "/bin/ucp-agent re..."   3 hours ago         Exited (0) 3 hours ago                                                                                       ucp-reconcile
    b844cf76a7a5        mirantis/ucp-agent:latest             "/bin/ucp-agent agent"   3 hours ago         Up 3 hours                 2376/tcp                                                                          ucp-agent.tahzo3m4xjwhtsn6l3n8oc2bf.xx2hf6dg4zrphgvy2eohtpns9
    de5b45871acb        mirantis/ucp-controller:latest        "/bin/controller s..."   3 hours ago         Up 3 hours (unhealthy)     0.0.0.0:443->8080/tcp                                                             ucp-controller
    ...
    
  3. Optional. Review the log of a particular MKE container by using the docker logs <mke container ID> command. For example, the following command produces the log for the ucp-controller container listed in the previous step:

    docker logs de5b45871acb
    

    Example output:

    {"level":"info","license_key":"PUagrRqOXhMH02UgxWYiKtg0kErLY8oLZf1GO4Pw8M6B","msg":"/v1.22/containers/ucp/ucp-controller/json",
    "remote_addr":"192.168.10.1:59546","tags":["api","v1.22","get"],"time":"2016-04-25T23:49:27Z","type":"api","username":"dave.lauper"}
    {"level":"info","license_key":"PUagrRqOXhMH02UgxWYiKtg0kErLY8oLZf1GO4Pw8M6B","msg":"/v1.22/containers/ucp/ucp-controller/logs",
    "remote_addr":"192.168.10.1:59546","tags":["api","v1.22","get"],"time":"2016-04-25T23:49:27Z","type":"api","username":"dave.lauper"}
    
Review logs using a support bundle

With the logs contained in a support bundle you can troubleshoot problems that existed before you changed your MKE configuration. Do not alter your MKE configuration until after you have performed the following steps.

  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to <username> > Admin Settings > Log & Audit Logs

  3. Select DEBUG and click Save.

    Increasing the MKE log level to DEBUG produces more descriptive logs, making it easier to understand the status of the MKE cluster.

    Note

    Changing the MKE log level restarts all MKE system components and introduces a small amount of downtime to MKE. Your applications will not be affected by this downtime.

  4. Generate a support dump (refer to the support-dump procedure).

Each of the following container types reports a different variety of problems in its logs:

  • Review the ucp-reconcile container logs for problems that occur after a node was added or removed.

    Note

    It is normal for the ucp-reconcile container to be stopped. This container starts only when the ucp-agent detects that a node needs to transition to a different state. The ucp-reconcile container is responsible for creating and removing containers, issuing certificates, and pulling missing images.

  • Review the ucp-controller container logs for problems that occur in the normal state of the system.

  • Review the ucp-auth-api and ucp-auth-store container logs for problems that occur when you are able to visit the MKE web UI but unable to log in.

Review logs using the API
  1. Store the IP address for use in the shell:

    IP=<ip-address>
    
  2. Obtain a temporary access token:

    curl -k -X POST -H 'Content-Type: application/json' https://$IP/auth/login --data-binary '
    {
      "username": "<username>",
      "password": "<password>"
    }
    '
    

    Example output:

    {"auth_token":"88d790ab-5cc0-4284-b3c6-986272af50b6"}
    
  3. Store the temporary access token for use in the shell:

    AUTHTOKEN="88d790ab-5cc0-4284-b3c6-986272af50b6"
    
  4. Determine which containers are present:

    curl -k -X GET "https://$IP/containers/json?all=true&size=false" -H  "accept: application/json" -H  "Authorization: Bearer ${AUTHTOKEN}"
    

    Truncated first line of example output:

    {"Id":"2cebeb898636ce519ec68fadbad4abe499f2fdebb057eb534bb64ad5bbf7925f", ...}
    
  5. Store the container ID for use in the shell:

    ID=2cebeb898636ce519ec68fadbad4abe499f2fdebb057eb534bb64ad5bbf7925f
    
  6. Obtain log files associated with the container ID:

    curl -k -X GET "https://$IP/containers/$ID/logs?follow=false&stdout=true&stderr=true&since=0&until=0&timestamps=false&tail=all" \
    -H "accept: application/json" -H  "Authorization: Bearer ${AUTHTOKEN}" --output output.txt
    
  7. View log content:

    cat output.txt
    

Troubleshoot cluster configurations

MKE regularly monitors its internal components, attempting to resolve issues as it discovers them.

In most cases where a single MKE component remains in a persistently failed state, removing and rejoining the unhealthy node restores the cluster to a healthy state.

MKE persists configuration data on an etcd key-value store and RethinkDB database that are replicated on all MKE manager nodes. These data stores are for internal use only and should not be used by other applications.

Troubleshoot the etcd key-value store with the HTTP API

This example uses curl to make requests to the key-value store REST API and jq to process the responses.

  1. Install curl and jq on an Ubuntu distribution:

    sudo apt-get update && sudo apt-get install curl jq
    
  2. Use a client bundle to authenticate your requests. Download and configure the client bundle if you have not done so already.

  3. Use the REST API to access the cluster configurations. The $DOCKER_HOST and $DOCKER_CERT_PATH environment variables are set when using the client bundle.

    export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379"
    
    curl -s \
         --cert ${DOCKER_CERT_PATH}/cert.pem \
         --key ${DOCKER_CERT_PATH}/key.pem \
         --cacert ${DOCKER_CERT_PATH}/ca.pem \
         ${KV_URL}/v2/keys | jq "."
    
Troubleshoot the etcd key-value store with the CLI

The MKE etcd key-value store runs in containers named ucp-kv. To check the health of etcd clusters, execute commands inside these containers using docker exec with etcdctl.

  1. Log in to a manager node using SSH.

  2. Troubleshoot an etcd key-value store:

    docker exec -it ucp-kv sh -c \
    'etcdctl --cluster=true endpoint health -w table 2>/dev/null'
    

    If the command fails, an error code is the only output that displays.
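
    As an additional check, you can list the etcd cluster members in the same way. The following is a sketch that assumes the etcdctl v3 client available inside the ucp-kv container:

    docker exec -it ucp-kv sh -c \
    'etcdctl member list -w table 2>/dev/null'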

Troubleshoot your cluster configuration using the RethinkDB database

User and organization data for MKE is stored in a RethinkDB database, which is replicated across all manager nodes in the MKE cluster.

Database replication and failover are typically handled automatically by the MKE configuration management processes. However, you can use the CLI to review the status of the database and manually reconfigure database replication.

  1. Log in to a manager node using SSH.

  2. Produce a detailed status of all servers and database tables in the RethinkDB cluster:

    NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
    VERSION=$(docker image ls --format '{{.Tag}}' mirantis/ucp-auth | head -n 1)
    docker container run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 db-status
    
    • NODE_ADDRESS is the IP address of this Docker Swarm manager node.

    • VERSION is the most recent version of the mirantis/ucp-auth image.

    Expected output:

    Server Status: [
      {
        "ID": "ffa9cd5a-3370-4ccd-a21f-d7437c90e900",
        "Name": "ucp_auth_store_192_168_1_25",
        "Network": {
          "CanonicalAddresses": [
            {
              "Host": "192.168.1.25",
              "Port": 12384
            }
          ],
          "TimeConnected": "2017-07-14T17:21:44.198Z"
        }
      }
    ]
    ...
    
  3. Repair the RethinkDB cluster so that the number of replicas it has is equal to the number of manager nodes in the cluster.

    NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
    NUM_MANAGERS=$(docker node ls --filter role=manager -q | wc -l)
    VERSION=$(docker image ls --format '{{.Tag}}' mirantis/ucp-auth | head -n 1)
    docker container run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db --num-replicas ${NUM_MANAGERS}
    
    • NODE_ADDRESS is the IP address of this Docker Swarm manager node.

    • NUM_MANAGERS is the current number of manager nodes in the cluster.

    • VERSION is the most recent version of the mirantis/ucp-auth image.

    Example output:

    time="2017-07-14T20:46:09Z" level=debug msg="Connecting to db ..."
    time="2017-07-14T20:46:09Z" level=debug msg="connecting to DB Addrs: [192.168.1.25:12383]"
    time="2017-07-14T20:46:09Z" level=debug msg="Reconfiguring number of replicas to 1"
    time="2017-07-14T20:46:09Z" level=debug msg="(00/16) Reconfiguring Table Replication..."
    time="2017-07-14T20:46:09Z" level=debug msg="(01/16) Reconfigured Replication of Table \"grant_objects\""
    ...
    

Note

If the quorum in any of the RethinkDB tables is lost, run the reconfigure-db command with the --emergency-repair flag.
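
The following sketch mirrors the reconfigure-db command from the previous procedure, with the additional flag; variable values are gathered in the same way:

NODE_ADDRESS=$(docker info --format '{{.Swarm.NodeAddr}}')
NUM_MANAGERS=$(docker node ls --filter role=manager -q | wc -l)
VERSION=$(docker image ls --format '{{.Tag}}' mirantis/ucp-auth | head -n 1)
docker container run --rm -v ucp-auth-store-certs:/tls mirantis/ucp-auth:${VERSION} \
  --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db \
  --num-replicas ${NUM_MANAGERS} --emergency-repair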

Troubleshoot root certificate authorities

If one of the nodes goes offline during MKE cluster CA rotation, it can prevent other nodes from finishing the rotation. In this event, to unblock other nodes, remove the offline node from the cluster by running the docker node rm --force <node_id> command from any manager node. Thereafter, once the rotation is done, the node can rejoin the cluster.

If the CA rotation was only partially successful, having left some nodes in an unhealthy state, you can attempt to remove and rejoin the problematic nodes.

For more detail, refer to Join Nodes.

Note

Be aware that if the troubleshooting procedures detailed herein do not work, it may be necessary to restore the cluster using the backup.

Troubleshoot NodeLocalDNS

Running NodeLocalDNS presents issues for certain Linux distributions, such as RHEL, CentOS, and Rocky Linux.

Pods stuck in CrashLoopBackOff

After enabling NodeLocalDNS in MKE, NodeLocalDNS Pods may become stuck in a CrashLoopBackOff state.

kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns

Example output:

NAME                   READY   STATUS             RESTARTS      AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
node-local-dns-cg49w   0/1     CrashLoopBackOff   5 (79s ago)   3m15s   172.31.32.61   centos7-centos-0   <none>           <none>
node-local-dns-ldjk4   0/1     CrashLoopBackOff   5 (7s ago)    3m15s   172.31.45.15   centos7-centos-1   <none>           <none>

kubectl logs -f -n kube-system -l k8s-app=node-local-dns

Example output:

2024/05/14 17:34:05 [ERROR] Failed to add non-existent interface nodelocaldns: operation not supported
2024/05/14 17:34:05 [INFO] Added interface - nodelocaldns
2024/05/14 17:34:05 [ERROR] Error checking dummy device nodelocaldns - operation not supported
listen tcp 169.254.0.10:8080: bind: cannot assign requested address

The reason this happens is that the NodeLocalDNS DaemonSet creates a dummy interface during network setup, and the dummy kernel module is not loaded by default in RHEL or CentOS. To fix the issue, load the dummy kernel module by running the following command on every node in the cluster:

sudo modprobe dummy
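
To make the fix persist across reboots, you can also register the module with systemd; a sketch assuming a systemd-based distribution that reads /etc/modules-load.d:

# Load the dummy module automatically at boot.
echo dummy | sudo tee /etc/modules-load.d/dummy.conf
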
NodeLocalDNS containers are unable to add iptables rules

Although the NodeLocalDNS Pods switch to the Running state after the dummy kernel module is loaded, the Pods still fail to add iptables rules.

The error appears in the NodeLocalDNS Pod logs:

kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns

Example output:

NAME                   READY   STATUS    RESTARTS        AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
node-local-dns-khfh7   1/1     Running   0               4m11s   172.31.32.61   centos7-centos-0   <none>           <none>
node-local-dns-shtqn   1/1     Running   3 (3m48s ago)   4m11s   172.31.45.15   centos7-centos-1   <none>           <none>

kubectl logs -f -n kube-system -l k8s-app=node-local-dns

Example output:

Notice: The NOTRACK target is converted into CT target in rule listing and saving.
Fatal: can't open lock file /run/xtables.lock: Permission denied
[ERROR] Error checking/adding iptables rule {raw OUTPUT [-p tcp -d 10.96.0.10 --dport 8080 -j NOTRACK -m comment --comment NodeLocal DNS Cache: skip conntrack]}, error - error checking rule: exit status 4: Ignoring deprecated --wait-interval option.
Warning: Extension CT revision 0 not supported, missing kernel module?
Notice: The NOTRACK target is converted into CT target in rule listing and saving.
Fatal: can't open lock file /run/xtables.lock: Permission denied

You can fix this problem in two different ways:

  • Use audit2allow to generate and install a SELinux policy module that permits the denied operations. The generated module resembles the following:

     module localdnsthird 1.0;
    
    require {
         type kernel_t;
         type spc_t;
         type rpm_script_t;
         type firewalld_t;
         type container_t;
         type iptables_var_run_t;
         class process transition;
         class capability { sys_admin sys_resource };
         class system module_request;
         class file { lock open read };
    }
    
    #============= container_t ==============
    allow container_t iptables_var_run_t:file lock;
    
    #!!!! This avc is allowed in the current policy
    allow container_t iptables_var_run_t:file { open read };
    
    #!!!! This avc is allowed in the current policy
    allow container_t kernel_t:system module_request;
    
    #============= firewalld_t ==============
    
    #!!!! This avc is allowed in the current policy
    allow firewalld_t self:capability { sys_admin sys_resource };
    
    #============= spc_t ==============
    
    #!!!! This avc is allowed in the current policy
    allow spc_t rpm_script_t:process transition;
    
  • Change the SELinux mode to permissive:

    sudo setenforce 0
    

MKE virtualization

Available since MKE 3.7.15

Virtualization functionality is available for MKE through KubeVirt, a Kubernetes extension with which you can natively run Virtual Machine (VM) workloads alongside container workloads in Kubernetes clusters.

Prepare KubeVirt deployment

To deploy KubeVirt, the KVM kernel module must be present on all Kubernetes nodes on which virtual machines will run, and nested virtualization must be enabled in all virtual environments.

Run the following local platform validation on your Kubernetes nodes to determine whether they can be used for KubeVirt:

lsmod | grep -i kvm
cat /sys/module/kvm_intel/parameters/nested

Example output:

lsmod | grep -i kvm
kvm_intel             483328  0
kvm                  1396736  1 kvm_intel

If the validation routine does not return results that resemble those presented in the example output, the node cannot be used for KubeVirt.
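
As a sketch, the nested virtualization parameter is expected to report Y or 1 when nested virtualization is enabled; on AMD hosts, the equivalent parameter belongs to the kvm_amd module (an assumption for AMD hardware):

# Intel hosts: expect Y or 1 when nested virtualization is enabled.
cat /sys/module/kvm_intel/parameters/nested

# AMD hosts: check the kvm_amd module instead.
cat /sys/module/kvm_amd/parameters/nested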

Deploy KubeVirt

You can deploy KubeVirt using manifest files that are available from the Mirantis Azure CDN.

Run the following command to check KubeVirt readiness:

kubectl wait -n kubevirt-hyperconverged kv kubevirt-kubevirt-hyperconverged \
  --for condition=Available --timeout 5m

Example output:

kubevirt.kubevirt.io/kubevirt-kubevirt-hyperconverged condition met

Install virtctl CLI

Although it is not required to run KubeVirt, the virtctl CLI provides an interface that can significantly enhance the convenience of your virtual machine interactions.

  1. Obtain the virtctl version for your particular architecture from https://binary.mirantis.com/?prefix=kubevirt/bin/artifacts.

  2. Run the following command, inserting the correct values for your operating system and architecture. For <OS> the valid values are linux or darwin, and for <ARCH> the valid values are amd64 or arm64.

    wget https://binary-mirantis-com.s3.amazonaws.com/kubevirt/bin/artifacts/virtctl-1.3.1-20240911005512-<OS>-<ARCH> -O virtctl
    

    Example command for Linux with amd64 architecture:

    wget https://binary-mirantis-com.s3.amazonaws.com/kubevirt/bin/artifacts/virtctl-1.3.1-20240911005512-linux-amd64 -O virtctl
    

    Example command for MacOS with arm64 architecture:

    wget https://binary-mirantis-com.s3.amazonaws.com/kubevirt/bin/artifacts/virtctl-1.3.1-20240911005512-darwin-arm64  -O virtctl
    
  3. Move virtctl to one of the PATH directories.

    Note

    PATH is a system environment variable that contains a list of directories, within each of which the system is able to search for a binary. To reveal the list, issue the following command:

    echo $PATH
    

    Example output:

    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    

    The PATH directories are each separated by a :.
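
    For example, assuming that /usr/local/bin is one of your PATH directories, you might install the binary as follows; the location is illustrative rather than required:

    # Make the downloaded binary executable and move it into the PATH.
    chmod +x virtctl
    sudo mv virtctl /usr/local/bin/

    # Confirm that the shell now resolves the virtctl command.
    virtctl version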

MKE virtualization deployment scenario

The following example scenario deploys a CirrOS virtual machine and consists of three primary steps:

  1. Launch a basic virtual machine

  2. Attach a disk to a virtual machine

  3. Attach a network interface to a virtual machine

Launch a basic virtual machine
  1. Create the virtual machine definition. You can apply the manifest directly from the CDN:

    kubectl apply -f https://binary-mirantis-com.s3.amazonaws.com/kubevirt/manifests/examples/cirros-vm.yaml
    

    Alternatively, you can manually create the cirros-vm.yaml file, using the following content:

    ---
    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      labels:
        kubevirt.io/vm: vm-cirros
      name: vm-cirros
    spec:
      running: false
      template:
        metadata:
          labels:
            kubevirt.io/vm: vm-cirros
        spec:
          domain:
            devices:
              disks:
              - disk:
                  bus: virtio
                name: containerdisk
              - disk:
                  bus: virtio
                name: cloudinitdisk
            resources:
              requests:
                memory: 128Mi
          terminationGracePeriodSeconds: 0
          volumes:
          - containerDisk:
              image: mirantis.azurecr.io/kubevirt/cirros-container-disk-demo:1.3.1-20240911005512
            name: containerdisk
          - cloudInitNoCloud:
              userData: |
                #!/bin/sh
    
                echo 'printed from cloud-init userdata'
            name: cloudinitdisk
    
  2. If you created the cirros-vm.yaml file manually, apply it:

    kubectl apply -f cirros-vm.yaml
    
  3. Verify the creation of the virtual machine:

    kubectl get vm
    

    Example output:

    NAME        AGE    STATUS    READY
    vm-cirros   1m8s   Stopped   False
    
  4. Start the CirrOS VM:

    virtctl start vm-cirros
    
  5. Access the CirrOS console:

    virtctl console vm-cirros
    
Attach a disk to a virtual machine

Note

The following example scenario uses the HostPathProvisioner component, which is deployed by default.

  1. Manually create the example-pvc.yaml file, using the following content:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvc-claim-1
    spec:
      storageClassName: hpp-local
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 3Gi
    
  2. Run the following command to create the PersistentVolumeClaim resource that you defined in the example-pvc.yaml file:

    kubectl apply -f example-pvc.yaml
    
  3. Verify the creation of the volume:

    kubectl get pvc
    

    Example output:

    NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
    pvc-claim-1   Bound    pvc-b7d68902-f340-4b7e-8a36-495d170b7193   10Gi       RWO            hpp-local       10s
    
  4. Run the following command to attach the disk to your VM:

    kubectl patch vm vm-cirros --type='json' -p \
    '[{"op":"add","path":"/spec/template/spec/volumes/2","value":{"persistentVolumeClaim":
    {"claimName": "pvc-claim-1"},"name":
    "example-pvc"}},{"op":"add","path":"/spec/template/spec/domain/devices/disks/2","value":{"disk":
    {"bus": "virtio"},"name": "example-pvc"}}]'
    
  5. Restart the VM and access the console:

    virtctl restart vm-cirros
    
    virtctl console vm-cirros
    
  6. Format and mount the disk:

    sudo mkfs.ext3 /dev/vdc
    
    sudo mkdir /example-disk
    
    sudo mount /dev/vdc /example-disk
    
  7. Create a helloworld.txt file to verify that the disk works:

    sudo touch /example-disk/helloworld.txt
    
    ls /example-disk/
    
Attach a network interface to a virtual machine

Note

The following example scenario requires the presence of CNAO.

  1. Manually create the bridge-test.yaml file, using the following content:

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: bridge-test
    spec:
      config: '{
          "cniVersion": "0.3.1",
          "name": "bridge-test",
          "type": "bridge",
          "bridge": "br1",
          "ipam": {
            "type": "host-local",
            "subnet": "10.250.250.0/24"
          }
        }'
    
  2. Apply the bridge-test.yaml file:

    kubectl apply -f bridge-test.yaml
    
  3. Run the following command to attach the network interface to your virtual machine:

    kubectl patch vm vm-cirros --type='json' -p \
    '[{"op":"add","path":"/spec/template/spec/domain/devices/interfaces","value":[{"name":"default","masquerade":{}},{"name":"bridge-net","bridge":{}}]},{"op":"add","path":"/spec/template/spec/networks","value":[{"name":"default","pod":{}},{"name":"bridge-net","multus":{"networkName":"bridge-test"}}]}]'
    
  4. Restart the VM and access the console:

    virtctl restart vm-cirros
    
    virtctl console vm-cirros
    
  5. Verify the VM interfaces:

    ip a
    

    Example output:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast qlen 1000
        link/ether 0a:c5:3b:63:71:51 brd ff:ff:ff:ff:ff:ff
        inet 10.0.2.2/24 brd 10.0.2.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::8c5:3bff:fe63:7151/64 scope link
           valid_lft forever preferred_lft forever
    3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
        link/ether 6a:bb:20:5b:6a:5e brd ff:ff:ff:ff:ff:ff
    

Disaster recovery

Perform disaster recovery procedures first for Swarm and then for MKE, with any required MSR disaster recovery procedures performed last.

Swarm disaster recovery

This section describes how to recover after losing quorum and how to force your swarm to rebalance.

Note

Perform the procedures in this section prior to those described in MKE disaster recovery.

Recover from losing the quorum

Swarms are resilient to failures and can recover from temporary node failures, such as machine reboots, process crashes and restarts, and other transient errors. However, if a swarm loses quorum, it cannot automatically recover. In such cases, tasks on existing worker nodes continue to run, but it is not possible to perform administrative tasks, such as scaling or updating services and joining or removing nodes from the swarm. The best way to recover after losing quorum is to bring the missing manager nodes back online. If that is not possible, follow the instructions below.

In a swarm of N managers, a majority (quorum) of manager nodes must always be available. For example, in a swarm with 5 managers, a minimum of 3 managers must be operational and in communication with each other. In other words, the swarm can tolerate up to (N-1)/2 permanent failures, and beyond that, requests involving swarm management cannot be processed. Such permanent failures include data corruption and hardware failure.

If you lose a quorum of managers, you cannot administer the swarm. If you have lost the quorum and you attempt to perform any management operation on the swarm, MKE issues the following error:

Error response from daemon: rpc error: code = 4 desc = context deadline exceeded

To recover from losing quorum:

If you cannot recover from losing quorum by bringing the failed nodes back online, you must run the docker swarm init command with the --force-new-cluster flag from a manager node. Using this flag removes all managers except the manager from which the command was run.

  1. Run docker swarm init with the --force-new-cluster flag from the manager node you want to recover:

    docker swarm init --force-new-cluster --advertise-addr node01:2377
    
  2. Promote nodes to become managers until you have the required number of manager nodes.

The Mirantis Container Runtime where you run the command becomes the manager node of a single-node swarm, which is capable of managing and running services. The manager has all the previous information about services and tasks, worker nodes continue to be part of the swarm, and services continue running. You need to add or re-add manager nodes to achieve your previous task distribution and ensure that you have enough managers to maintain high availability and prevent losing the quorum.
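
For example, to promote a worker node to manager (the node name is a placeholder):

# Promote a worker to manager; repeat until you reach the desired number of managers.
docker node promote <node-name>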

Force the swarm to rebalance

You do not usually need to force your swarm to rebalance its tasks. However, when you add a new node to a swarm or a node reconnects to the swarm after a period of unavailability, the swarm does not automatically give a workload to the idle node. This is a design decision; if the swarm periodically shifts tasks to different nodes for the sake of balance, the clients using those tasks would be disrupted. The goal is to avoid disrupting running services for the sake of balance across the swarm. When new tasks start, or when a node with running tasks becomes unavailable, those tasks are given to less busy nodes.


To force the swarm to rebalance its tasks:

Use the docker service update command with the --force or -f flag to force the service to redistribute its tasks across the available worker nodes. This causes the service tasks to restart. Client applications may be disrupted. If configured, your service will use a rolling update.
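
For example, to rebalance a single service (the service name is a placeholder):

# Force the service to redistribute its tasks across the available nodes.
docker service update --force <service-name>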

MKE disaster recovery

If you cannot recover half or more manager nodes to a healthy state, you have lost quorum and must restore your system using the following procedure.

Note

Perform Swarm disaster recovery procedures prior to those described here.

Recover an MKE cluster from an existing backup
  1. If MKE is still installed on the swarm, uninstall MKE:

    Note

    Skip this step when restoring MKE on new machines.

    docker container run -it --rm -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:<mke-version> uninstall-ucp -i
    

    Substitute <mke-version> with the MKE version of your backup.

  2. Confirm that you want to uninstall MKE.

    Example output:

    INFO[0000] Detected UCP instance tgokpm55qcx4s2dsu1ssdga92
    INFO[0000] We're about to uninstall UCP from this Swarm cluster
    Do you want to proceed with the uninstall? (y/n):
    
  3. Restore MKE from the existing backup as described in Restore MKE.

    If the swarm exists, restore MKE on a manager node. Otherwise, restore MKE on any node, and the swarm will be created automatically during the restore procedure.

Recreate Kubernetes and Swarm objects

For Kubernetes, MKE backs up the declarative state of Kubernetes objects in etcd.

For Swarm, it is not possible to take the state and export it to a declarative format, as the objects that are embedded within the Swarm raft logs are not easily transferable to other nodes or clusters.

To recreate Swarm-related workloads, refer to the original scripts used for deployment. Alternatively, you can recreate the workloads manually, using the output of docker inspect commands as a reference.

Back up Swarm

MKE manager nodes store the swarm state and manager logs in the /var/lib/docker/swarm/ directory. Swarm raft logs contain crucial information for recreating Swarm-specific resources, including services, secrets, configurations, and node cryptographic identity. This data includes the keys used to encrypt the raft logs. You must have these keys to restore the swarm.

Because logs contain node IP address information and are not transferable to other nodes, you must perform a manual backup on each manager node. If you do not back up the raft logs, you cannot verify workloads or Swarm resource provisioning after restoring the cluster.

Note

You can avoid performing a Swarm backup by storing stacks, services definitions, secrets, and networks definitions in a source code management or config management tool.

Swarm backup contents

  • Raft keys (backed up) - Keys used to encrypt communication between Swarm nodes and to encrypt and decrypt raft logs.

  • Membership (backed up) - List of the nodes in the cluster.

  • Services (backed up) - Stacks and services stored in Swarm mode.

  • Overlay networks (backed up) - Overlay networks created on the cluster.

  • Configs (backed up) - Configs created in the cluster.

  • Secrets (backed up) - Secrets saved in the cluster.

  • Swarm unlock key (not backed up) - Secret key needed to unlock a manager after its Docker daemon restarts.

To back up Swarm:

Note

All commands that follow must be prefixed with sudo or executed from a superuser shell by first running sudo sh.

  1. If auto-lock is enabled, retrieve your Swarm unlock key. Refer to Rotate the unlock key in the Docker documentation for more information.

  2. Optional. Mirantis recommends that you run at least three manager nodes, in order to achieve high availability, as you must stop the engine of the manager node before performing the backup. A majority of managers must be online for a cluster to be operational. If you have fewer than three managers, the cluster will be unavailable during the backup.

    Note

    While a manager is shut down, your swarm is more likely to lose quorum if further nodes are lost. A loss of quorum renders the swarm unavailable until quorum is recovered. Quorum is only recovered when more than 50% of the nodes become available. If you regularly take down managers when performing backups, consider running a 5-manager swarm, as this will enable you to lose an additional manager while the backup is running, without disrupting services.

  3. Select a manager node other than the leader to avoid a new election inside the cluster:

    docker node ls -f "role=manager" | tail -n+2 | grep -vi leader
    
  4. Optional. Store the Mirantis Container Runtime (MCR) version in a variable to easily add it to your backup name.

    ENGINE=$(docker version -f '{{.Server.Version}}')
    
  5. Stop MCR on the manager node before backing up the data, so that no data is changed during the backup:

    systemctl stop docker
    
  6. Back up the /var/lib/docker/swarm directory:

    tar cvzf "/tmp/swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz" /var/lib/docker/swarm/
    

    You can decode the Unix epoch in the file name by typing date -d @timestamp:

    date -d @1531166143
    Mon Jul  9 19:55:43 UTC 2018
    
  7. If auto-lock is enabled, unlock the swarm:

    docker swarm unlock
    
  8. Restart MCR on the manager node:

    systemctl start docker
    
  9. Repeat the above steps for each manager node.

Back up MKE

All manager nodes store the same data, so it is only necessary to back up a single one.

Backing up MKE does not require that you pause the reconciler and delete MKE containers, nor does it affect manager node activities and user resources, such as services, containers, and stacks.

Backup considerations

Observe the following considerations prior to performing an MKE backup.

Limitations
  • MKE does not support using a backup that runs an earlier version of MKE to restore a cluster that runs a later version of MKE.

  • MKE does not support performing two backups at the same time. If a backup is attempted while another backup is in progress, or if two backups are scheduled at the same time, a message will display indicating that the second backup failed because another backup is in progress.

  • MKE may not be able to back up a cluster that has crashed. Mirantis recommends that you perform regular backups to avoid encountering this scenario.

  • MKE backups do not include Swarm workloads.

MKE backup contents

The following backup contents are stored in a .tar file. Backups contain MKE configuration metadata for recreating configurations such as LDAP, SAML, and RBAC.

  • Configurations (backed up) - MKE configurations, including the Mirantis Container Runtime license, Swarm, and client CAs.

  • Access control (backed up) - Swarm resource permissions for teams, including collections, grants, and roles.

  • Certificates and keys (backed up) - Certificates and public and private keys used for authentication and mutual TLS communication.

  • Metrics data (backed up) - Monitoring data gathered by MKE.

  • Organizations (backed up) - Users, teams, and organizations.

  • Volumes (backed up) - All MKE-named volumes, including all MKE component certificates and data.

  • Overlay networks (not backed up) - Swarm mode overlay network definitions, including port information.

  • Configs, secrets (not backed up) - MKE configurations and secrets. Create a Swarm backup to back up these data.

  • Services (not backed up) - MKE stacks and services are stored in Swarm mode or SCM/config management.

  • ucp-metrics-data (not backed up) - Metrics server data.

  • ucp-node-certs (not backed up) - Certs used to lock down MKE system components.

  • Routing mesh settings (not backed up) - Interlock layer 7 ingress configuration information. A manual backup and restore process is possible and should be performed.

Note

Because Kubernetes stores the state of resources on etcd, a backup of etcd is sufficient for stateless backups.

Kubernetes settings, data, and state

MKE backups include all Kubernetes declarative objects, including secrets, and are stored in the ucp-kv etcd database.

Note

You cannot back up Kubernetes volumes and node labels. When you restore MKE, Kubernetes objects and containers are recreated and IP addresses are resolved.

For more information, refer to Backing up an etcd cluster.

Backup procedure

You can create an MKE backup using either the CLI, the MKE web UI, or the MKE API.

The backup process runs on one manager node.

Create an MKE backup using the CLI

The following example demonstrates how to:

  • Create an MKE manager node backup.

  • Encrypt the backup by using a passphrase.

  • Decrypt the backup.

  • Verify the backup contents.

  • Store the backup locally on the node at /tmp/mybackup.tar.


To create an MKE backup:

  1. Run the mirantis/ucp:3.7.16 backup command on a single MKE manager node, including the --file and --include-logs options. This creates a .tar archive with the contents of all volumes used by MKE and streams it to stdout. Replace 3.7.16 with the version you are currently running.

    docker container run \
      --rm \
      --log-driver none \
      --name ucp \
      --volume /var/run/docker.sock:/var/run/docker.sock \
      --volume /tmp:/backup \
      mirantis/ucp:3.7.16 backup \
      --file mybackup.tar \
      --passphrase "secret12chars" \
      --include-logs=false
    

    If you are running MKE with Security-Enhanced Linux (SELinux) enabled, which is typical for RHEL hosts, include --security-opt label=disable in the docker command, replacing 3.7.16 with the version you are currently running:

    docker container run \
      --rm \
      --log-driver none \
      --security-opt label=disable \
      --name ucp \
      --volume /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.7.16 backup \
      --passphrase "secret12chars" > /tmp/mybackup.tar
    

    Note

    To determine whether SELinux is enabled in MCR, view the host /etc/docker/daemon.json file, and search for the string "selinux-enabled":"true".

  2. You can access backup progress and error reporting in the stderr streams of the running backup container during the backup process. MKE updates progress after each backup step, for example, after volumes are backed up. The progress tracking is not preserved after the backup has completed.

  3. A valid backup file contains at least 27 files, including ./ucp-controller-server-certs/key.pem. Verify that the backup is a valid .tar file by listing its contents, as in the following example:

    gpg --decrypt /tmp/mybackup.tar | tar --list
    
  4. A log file is also created in the same directory as the backup file. The passphrase for the backup and log files is the same. Review the contents of the log file by using the following command:

    gpg --decrypt '/tmp/mybackup.log'
    
Create a backup using the MKE web UI
  1. Log in to the MKE web UI.

  2. In the left-side navigation panel, navigate to Admin Settings.

  3. Click Backup.

  4. Initiate an immediate backup by clicking Backup Now.

The MKE web UI also provides the following options:

  • Display the status of a running backup

  • Display backup history

  • Display backup outcome

Create, list, and retrieve backups using the MKE API

The MKE API provides three endpoints for managing MKE backups:

  • /api/ucp/backup

  • /api/ucp/backups

  • /api/ucp/backup/{backup_id}

You must be an MKE administrator to access these API endpoints.


To create a backup using the MKE API:

You can create a backup with the POST: /api/ucp/backup endpoint. This JSON endpoint accepts the following arguments:

  • passphrase (string) - Encryption passphrase.

  • noPassphrase (boolean) - Sets whether a passphrase is used.

  • fileName (string) - Backup file name.

  • includeLogs (boolean) - Sets whether to include a log file.

  • hostPath (string) - File system location.

The request returns one of the following HTTP status codes, and if successful, a backup ID.

  • 200: Success

  • 500: Internal server error

  • 400: Malformed request (payload fails validation)

Example API call:

curl -sk -H "Authorization: Bearer $AUTHTOKEN"  https://$UCP_HOSTNAME/api/ucp/backup \
  -X POST \
  -H "Content-Type: application/json" \
  --data  '{"passphrase": "secret12chars", "includeLogs": true, "fileName": "backup1.tar", "logFileName": "backup1.log", "hostPath": "/tmp"}'
  • $AUTHTOKEN is your authentication bearer token if using auth token identification.

  • $UCP_HOSTNAME is your MKE hostname.

Example output:

200 OK

To list all backups using the MKE API:

You can view all existing backups with the GET: /api/ucp/backups endpoint. This request does not expect a payload and returns a list of backups, each as a JSON object following the schema detailed in Backup schema.

The request returns one of the following HTTP status codes, and if successful, a list of existing backups:

  • 200: Success

  • 500: Internal server error

Example API call:

curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/backups

Example output:

[
  {
    "id": "0d0525dd-948a-41b4-9f25-c6b4cd6d9fe4",
    "encrypted": true,
    "fileName": "backup2.tar",
    "logFileName": "backup2.log",
    "backupPath": "/secure-location",
    "backupState": "SUCCESS",
    "nodeLocation": "ucp-node-ubuntu-0",
    "shortError": "",
    "created_at": "2019-04-10T21:55:53.775Z",
    "completed_at": "2019-04-10T21:56:01.184Z"
  },
  {
    "id": "2cf210df-d641-44ca-bc21-bda757c08d18",
    "encrypted": true,
    "fileName": "backup1.tar",
    "logFileName": "backup1.log",
    "backupPath": "/secure-location",
    "backupState": "IN_PROGRESS",
    "nodeLocation": "ucp-node-ubuntu-0",
    "shortError": "",
    "created_at": "2019-04-10T01:23:59.404Z",
    "completed_at": "0001-01-01T00:00:00Z"
  }
]

To retrieve backup details using the MKE API:

You can retrieve details for a specific backup using the GET: /api/ucp/backup/{backup_id} endpoint, where {backup_id} is the ID of an existing backup. This request returns the backup, if it exists, as a JSON object following the schema detailed in Backup schema.

The request returns one of the following HTTP status codes, and if successful, the backup for the specified ID:

  • 200: Success

  • 404: Backup not found for the given {backup_id}

  • 500: Internal server error
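
Example API call, patterned on the list example above; the backup ID shown is a placeholder and should be replaced with the ID of an existing backup:

curl -sk -H "Authorization: Bearer $AUTHTOKEN" https://$UCP_HOSTNAME/api/ucp/backup/0d0525dd-948a-41b4-9f25-c6b4cd6d9fe4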

Specify a backup file

To avoid directly managing backup files, you can specify a file name and host directory on a secure, preconfigured storage backend, such as NFS or another networked file system. The file system location is the backup folder on the manager node file system. This location must be writable by the nobody user, which you can arrange by changing the directory ownership to nobody. Changing the ownership requires administrator permissions on the manager node and must only be done once for a given file system location.

To change the file system directory ownership to nobody:

sudo chown nobody:nogroup /path/to/folder

Caution

  • Specify a different name for each backup file. Otherwise, the existing backup file with the same name is overwritten.

  • Specify a location that is mounted on a fault-tolerant file system, such as NFS, rather than on the node local disk. If you do use the node local disk, regularly move backups off of it to ensure adequate space for ongoing backups.

Backup schema

The backup schema returned by the GET: /api/ucp/backups and GET: /api/ucp/backup/{backup_id} endpoints contains the following fields:

  • id (String): Unique ID

  • encrypted (Boolean): Sets whether to encrypt with a passphrase

  • fileName (String): Backup file name if backing up to a file, empty otherwise

  • logFileName (String): Backup log file name if saving backup logs, empty otherwise

  • backupPath (String): Host path where the backup is located

  • backupState (String): Current state of the backup (IN_PROGRESS, SUCCESS, FAILED)

  • nodeLocation (String): Node on which the backup was taken

  • shortError (String): Empty unless backupState is set to FAILED

  • created_at (String): Time of backup creation

  • completed_at (String): Time of backup completion

Restore Swarm

Prior to restoring Swarm, verify that you meet the following prerequisites:

  • The node you select for the restore must use the same IP address as the node from which you made the backup, as the command to force the new cluster does not reset the IP address in the swarm data.

  • The node you select for the restore must run the same version of Mirantis Container Runtime (MCR) as the node from which you made the backup.

  • You must have access to the list of manager node IP addresses located in state.json inside the zip file.

  • If auto-lock was enabled on the backed-up swarm, you must have access to the unlock key.


To perform the Swarm restore:

Caution

You must perform the Swarm restore on only one manager node in your cluster, and that node must be the same manager from which you made the backup.

  1. Shut down MCR on the manager node that you have selected for your restore:

    systemctl stop docker
    
  2. On the new swarm, remove the contents of the /var/lib/docker/swarm directory. Create this directory if it does not exist (see the sketch after this procedure).

  3. Restore the /var/lib/docker/swarm directory with the contents of the backup:

    tar -xvf <PATH_TO_TARBALL> -C /
    

    Set <PATH_TO_TARBALL> to the path where you saved the tarball during backup. If you followed the procedure in backup-swarm, the tarball is in the /tmp/ folder with a unique name based on the engine version and timestamp: swarm-${ENGINE}-$(hostname -s)-$(date +%s%z).tgz.

    Note

    The new node uses the same encryption key for on-disk storage as the old one. It is not possible to change the on-disk storage encryption keys. For a swarm that has auto-lock enabled, the unlock key is the same as on the old swarm and is required to restore the swarm.

  4. Unlock the swarm, if necessary:

    docker swarm unlock
    
  5. Start Docker on the new node:

    systemctl start docker
    
  6. Verify that the state of the swarm is as expected, including application-specific tests or checking the output of docker service ls to verify that all expected services are present.

  7. If you use auto-lock, rotate the unlock key:

    docker swarm unlock-key --rotate
    
  8. Add the required manager and worker nodes to the new swarm.

  9. Reinstate your previous backup process on the new swarm.
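
The following minimal sketch illustrates step 2 of this procedure, clearing and recreating the swarm state directory. Run it as root (or with sudo), and only on the node selected for the restore:

mkdir -p /var/lib/docker/swarm
rm -rf /var/lib/docker/swarm/*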

Restore MKE

MKE supports the following three different approaches to performing a restore:

  • Run the restore on the machines from which the backup originated or on new machines. You can use the same swarm from which the backup originated or a new swarm.

  • Run the restore on a manager node of an existing swarm that does not have MKE installed. In this case, the MKE restore uses the existing swarm and runs in place of an MKE install.

  • Run the restore on an instance of MCR that is not included in a swarm. The restore performs docker swarm init just as the install operation would do. This creates a new swarm and restores MKE thereon.

Note

During the MKE restore operation, Kubernetes declarative objects and containers are recreated and IP addresses are resolved.

For more information, refer to Restoring an etcd cluster.

Prerequisites

Consider the following requirements prior to restoring MKE:

  • To restore an existing MKE installation from a backup, you must first uninstall MKE from the swarm by using the uninstall-ucp command (a sketch of the uninstall invocation follows this list).

  • Restore operations must run using the same major and minor MKE version and mirantis/ucp image version as the backed-up cluster.

  • If you restore MKE using a different swarm than the one where the backed-up MKE was deployed, MKE will use new TLS certificates. In this case, you must download new client bundles, as the existing ones will no longer be operational.
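
The uninstall runs through the same bootstrapper image as the restore commands shown below. The following is a minimal sketch only; adjust the image tag to the MKE version that is currently installed:

docker container run \
--rm \
--interactive \
--tty \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.7.16 uninstall-ucp --interactive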

Restore MKE

Note

At the start of the restore operation, the script identifies the MKE version defined in the backup and performs one of the following actions:

  • The MKE restore fails if it runs using an image that does not match the MKE version from the backup. To override this behavior, for example in a testing scenario, use the --force flag.

  • MKE provides instructions on how to run the restore process for the MKE version in use.

Note

If SELinux is enabled, you must temporarily disable it prior to running the restore command. You can then reenable SELinux once the command has completed.

Volumes are placed onto the host where you run the MKE restore command.

  1. Restore MKE from an existing backup file. The following example illustrates how to restore MKE from an existing backup file located in /tmp/backup.tar:

    docker container run \
    --rm \
    --interactive \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock  \
    mirantis/ucp:3.7.16 restore \
    --san=${APISERVER_LB} < /tmp/backup.tar
    
    • Replace mirantis/ucp:3.7.16 with the MKE version in your backup file.

    • For the --san flag, assign the cluster API server IP address without the port number to the APISERVER_LB variable. For example, for https://172.16.243.2:443 use 172.16.243.2. For more information on the --san flag, refer to MKE CLI restore options.

    If the backup file is encrypted with a passphrase, include the --passphrase flag in the restore command:

    docker container run \
    --rm \
    --interactive \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock  \
    mirantis/ucp:3.7.16 restore \
    --san=${APISERVER_LB} \
    --passphrase "secret" < /tmp/backup.tar
    

    Alternatively, you can invoke the restore command in interactive mode by mounting the backup file to the container rather than streaming it through stdin:

    docker container run \
    --rm \
    --interactive \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/backup.tar:/config/backup.tar \
    mirantis/ucp:3.7.16 restore -i
    
  2. Regenerate certs. The current certs volume, which contains cluster-specific information such as SANs, is invalid on new clusters with different IPs. For volumes that are not backed up, such as ucp-node-certs, the restore regenerates certs. For certs that are backed up, such as ucp-controller-server-certs, the restore does not perform a regeneration, and you must correct those certs yourself when the restore completes.

  3. After you successfully restore MKE, add new managers and workers just as you would after a fresh installation.

  4. For restore operations, review the output of the restore command.

Verify the MKE restore
  1. Run the following command:

    curl -s -k https://localhost/_ping
    
  2. Log in to the MKE web UI.

  3. In the left-side navigation panel, navigate to Shared Resources > Nodes.

  4. Verify that all swarm manager nodes are healthy:

    • Monitor all swarm managers for at least 15 minutes to ensure no degradation.

    • Verify that no containers on swarm manager nodes are in an unhealthy state.

    • Verify that no swarm nodes are running containers with the old version, except for Kubernetes Pods that use the ucp-pause image.

Customer feedback

You can submit feedback on MKE to Mirantis either by rating your experience or through a Jira ticket.

To rate your MKE experience:

  1. Log in to the MKE web UI.

  2. Click Give feedback at the bottom of the screen.

  3. Rate your MKE experience from one to five stars, and add any additional comments in the provided field.

  4. Click Send feedback.

To offer more detailed feedback:

  1. Log in to the MKE web UI.

  2. Click Give feedback at the bottom of the screen.

  3. Click create a ticket in the 5-star review dialog to open a Jira feedback collector.

  4. Fill in the Jira feedback collector fields and add attachments as necessary.

  5. Click Submit.

Launchpad

Mirantis’s Launchpad CLI Tool (Launchpad) is a command-line deployment and lifecycle-management tool that runs on virtually any Linux, Mac, or Windows machine. It simplifies and automates MKE, MSR, and MCR installation and deployments on public clouds, private clouds, virtualization platforms, and bare metal.

In addition, Launchpad provides full cluster lifecycle management. Using Launchpad, multi-manager, high availability clusters (defined as having sufficient node capacity to move active workloads around while updating) can be upgraded with no downtime.

Note

Launchpad is distributed as a binary executable. The main integration point with cluster management is the launchpad apply command and the input launchpad.yaml configuration for the cluster. As the configuration is in YAML format, you can integrate other tooling with Launchpad.

System requirements

Mirantis Launchpad is a static binary that works on the following operating systems:

  • Linux (x64)

  • MacOS (x64)

  • Windows (x64)

Important

The setup must meet MKE system requirements, in addition to the requirements for running Launchpad.

The following operating systems support MKE:

  • MKEx (Rocky&OSTree)

  • CentOS 7

  • Oracle Linux 7

  • Oracle Linux 8

  • Oracle Linux 9

  • Red Hat Enterprise Linux 7

  • Red Hat Enterprise Linux 8

  • Red Hat Enterprise Linux 9

  • Rocky Linux 8

  • Rocky Linux 9

  • SUSE Linux Enterprise Server 12

  • SUSE Linux Enterprise Server 15

  • Ubuntu 18.04

  • Ubuntu 20.04

  • Ubuntu 22.04

  • Windows Server 2022, 2019

Be aware that Launchpad does not support all OS platform patch levels. Refer to the Compatibility Matrix for your version of MCR for full OS platform support information.

Hardware requirements

  • Manager nodes:

    • Minimum: 16 GB of RAM, 2 vCPUs, 25 GB of free disk space for the /var partition

    • Recommended: 24 - 32 GB of RAM, 4 vCPUs, 25 - 100 GB of free disk space

  • Worker nodes:

    • Minimum: 4 GB of RAM

Note

Windows container images are typically larger than Linux container images, and thus it is necessary to provision more local storage for Windows nodes.

Permissions and privilege levels

Launchpad requires a high privilege level on the systems that it manages remotely, both to prepare each system for installation and to perform the installation itself. This level of access is necessary for package management, and also to allow remote users to execute MCR docker commands.

Note

For security reasons, Launchpad should not be executed with root/admin user authentication on any machine.

Package Management

Launchpad uses sudo commands to manage several packages through a system package manager, as detailed below:

  • Install the key components needed for installing Mirantis products:

    curl

    Used to retrieve the MCR installation script

    iptables/iputils

    MCR dependencies

    socat

    Enables Prometheus management in certain scenarios

    RHEL rh-amazon-rhui-client

    Used by AWS for various management tasks

  • Add remote users to the MCR group docker to allow docker commands.

  • Run the MCR installation script:

    • Add package repositories for the MCR packages.

    • Remove conflicting Docker-EE packages from the system.

    • Install MCR, through the system package manager.

  • Optional. Uninstall MCR, by removing installed packages.

  • Optional. Prune MCR installations during uninstall, by deleting system folders created by MCR.

Remote management

Launchpad connects through the use of a cryptographic network protocol (SSH on Linux systems, SSH or WinRM on Windows systems), and as such these must be set up on all host instances.

Note

Only SSH key-based authentication with passwordless sudo capability is currently supported. On Windows, the user must have administrator privileges.
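
For example, on Linux hosts you can set up key-based access and passwordless sudo as follows. This is a sketch only: the key path, user name, and address are placeholders, and your organization may mandate a different sudo policy:

# On the deployer machine: generate a key pair (skip if one already exists)
ssh-keygen -t rsa -b 4096 -f ~/.ssh/my_key

# Copy the public key to each Linux host
ssh-copy-id -i ~/.ssh/my_key.pub theuser@10.0.0.1

# On each host, as root: grant the user passwordless sudo
echo 'theuser ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/theuser
chmod 0440 /etc/sudoers.d/theuser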

OpenSSH

OpenSSH is the open-source version of the Secure Shell (SSH) tools used by administrators of Linux and other non-Windows operating systems for cross-platform management of remote systems. It is included in Windows Server 2019.

  1. To enable SSH on Windows, you can run the following PowerShell snippets, modified for your specific configuration, on each Windows host.

    # Install OpenSSH
    Add-WindowsCapability -Online -Name OpenSSH.Client~~~~0.0.1.0
    Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
    Start-Service sshd
    Set-Service -Name sshd -StartupType 'Automatic'
    
    # Configure ssh key authentication
    mkdir c:\Users\Administrator\.ssh\
    $sshdConf = 'c:\ProgramData\ssh\sshd_config'
    (Get-Content $sshdConf).replace('#PubkeyAuthentication yes', 'PubkeyAuthentication yes') | Set-Content $sshdConf
    (Get-Content $sshdConf).replace('Match Group administrators', '#Match Group administrators') | Set-Content $sshdConf
    (Get-Content $sshdConf).replace('       AuthorizedKeysFile __PROGRAMDATA__/ssh/administrators_authorized_keys', '#       AuthorizedKeysFile __PROGRAMDATA__/ssh/administrators_authorized_keys') | Set-Content $sshdConf
    restart-service sshd
    
  2. Transfer your SSH public key from your local machine to the host, using the following example but with your own values.

    # Transfer SSH Key to Server
    scp ~/.ssh/id_rsa.pub Administrator@1.2.1.2:C:\Users\Administrator\.ssh\authorized_keys
    ssh --% Administrator@1.2.1.2 powershell -c $ConfirmPreference = 'None'; Repair-AuthorizedKeyPermission C:\Users\Administrator\.ssh\authorized_keys
    
WinRM

As an alternative to SSH, WinRM can be used on Windows hosts.

Ports Used

When installing an MKE cluster, a series of ports must be opened to incoming traffic.

Get started with Launchpad

Launchpad is a command-line deployment and lifecycle-management tool that enables users on any Linux, Mac, or Windows machine to easily install, deploy, modify, and update MKE, MSR, and MCR.

Set up a deployment environment

To fully evaluate and use MKE, MSR, and MCR, Mirantis recommends installing Launchpad on a real machine (Linux, Mac, or Windows) or a virtual machine (VM) that is capable of running:

  • A graphic desktop and browser, for accessing or installing:

    • The MKE web UI

    • Lens, an open source, stand-alone GUI application from Mirantis (available for Linux, Mac, and Windows) for multi-cluster management and operations

    • Metrics, observability, visualization, and other tools

  • kubectl (the Kubernetes command-line client)

  • curl, Postman and/or client libraries, for accessing the Kubernetes REST API

  • Docker and related tools for using the Docker Swarm CLI, and for containerizing workloads and accessing local and remote registries.

The machine can reside in different contexts from the hosts and connect with those hosts in several different ways, depending on the infrastructure and services in use. It must be able to communicate with the hosts via their IP addresses on several ports. Depending on the infrastructure and security requirements, this can be relatively simple to achieve for evaluation clusters (refer to Networking Considerations for more information).

Configure hosts

A cluster comprises at least one manager node and one or more worker nodes. To start, Mirantis recommends deploying a small evaluation cluster, with one manager and at least one worker node. Such a setup allows you to become familiar with Launchpad, with the procedures for provisioning nodes, and with the features of MKE, MSR, and MCR. In addition, if the deployment is on a public cloud, the setup minimizes costs.

Ultimately, Launchpad can deploy manager and worker nodes in any combination, creating many different cluster configurations, such as:

  • Small evaluation clusters, with one manager and one or more worker nodes.

  • Diverse clusters, with Linux and Windows workers.

  • High-availability clusters, with two, three, or more manager nodes.

  • Clusters that Launchpad can auto-update, non-disruptively, with multiple managers (allowing one-by-one update of MKE without loss of cluster cohesion) and sufficient worker nodes of each type to allow workloads to be drained to new homes as each node is updated.

The hosts must be able to communicate with one another (and potentially, with users in the outside world) by way of their IP addresses, using many ports. Depending on infrastructure and security requirements, this can be relatively simple to achieve for evaluation clusters (refer to Networking Considerations).

Install Launchpad

Note

Launchpad has built-in telemetry for tracking tool use. The telemetry data is used to improve the product and overall user experience. No sensitive data about the clusters is included in the telemetry payload.

  1. Download Launchpad.

  2. Rename the downloaded binary to launchpad, move it to a directory in the PATH variable, and give it permission to run (execute permission); an example is sketched after this procedure.

    Tip

    If macOS is in use it may be necessary to give Launchpad permissions in the Security & Privacy section in System Preferences.

  3. Verify the installation by checking the installed tool version with the launchpad version command.

    $ launchpad version
    # console output:
    
    version: 1.0.0
    
  4. Complete the registration. Be aware that the registration information is used to assign evaluation licenses and to provide help with using Launchpad.

    $ launchpad register
    
    name: Anthony Stark
    company: Stark Industries
    email: astark@example.com
    I agree to Mirantis Launchpad Software Evaluation License Agreement https://github.com/Mirantis/launchpad/blob/master/LICENSE [Y/n]: Yes
    INFO[0022] Registration completed!
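
The following sketch shows steps 2 and 3 on a Linux deployer machine; the downloaded file name and target directory are assumptions and depend on your platform:

mv ~/Downloads/launchpad-linux-x64 /usr/local/bin/launchpad
chmod +x /usr/local/bin/launchpad
launchpad version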
    

Create a Launchpad configuration file

The cluster is configured using a YAML file.

In the example provided, a simple two-node MKE cluster is set up using Kubernetes: one manager node and one worker node.

  1. In your editor, create a new file and copy-paste the following text as-is:

    apiVersion: launchpad.mirantis.com/mke/v1.4
    kind: mke
    metadata:
      name: mke-kube
    spec:
      mke:
        adminUsername: admin
        adminPassword: passw0rd!
        installFlags:
        - --default-node-orchestrator=kubernetes
      hosts:
      - role: manager
        ssh:
          address: 172.16.33.100
          keyPath: ~/.ssh/my_key
      - role: worker
        ssh:
          address: 172.16.33.101
          keyPath: ~/.ssh/my_key
    
  2. Save the file as launchpad.yaml.

  3. Adjust the text to meet your infrastructure requirements. The model should work to deploy hosts on most public clouds.

    If you’re deploying on VirtualBox or some other desktop virtualization solution and are using bridged networking, it will be necessary to make a few minor adjustments to the launchpad.yaml.

    • Deliberately set a --pod-cidr to ensure that pod IP addresses do not overlap with node IP addresses (the latter are in the 192.168.x.x private IP network range on such a setup).

    • Supply appropriate labels for the target nodes’ private IP network cards using the privateInterface parameter. This typically defaults to enp0s3 on Ubuntu 18.04; other Linux distributions use similar nomenclature.

    In addition, it may be necessary to set the username for logging in to the host.

    apiVersion: launchpad.mirantis.com/mke/v1.4
    kind: mke
    metadata:
      name: my-mke
    spec:
      mke:
        adminUsername: admin
        adminPassword: passw0rd!
        installFlags:
          - --default-node-orchestrator=kubernetes
          - --pod-cidr 10.0.0.0/16
      hosts:
      - role: manager
        ssh:
          address: 192.168.110.100
          keyPath: ~/.ssh/id_rsa
          user: theuser
        privateInterface: enp0s3
      - role: worker
        ssh:
          address: 192.168.110.101
          keyPath: ~/.ssh/id_rsa
          user: theuser
        privateInterface: enp0s3
    

For more complex setups, Launchpad offers a full set of configuration options.

Note

Users who are familiar with Terraform can automate the infrastructure creation using Mirantis Terraform examples as a baseline.

Bootstrap your cluster

You can start the cluster once the cluster configuration file is fully set up. In the same directory where you created the launchpad.yaml file, run:

$ launchpad apply

The launchpad tool uses a cryptographic network protocol (SSH on Linux systems, SSH or WinRM on Windows systems) to connect to the infrastructure specified in the launchpad.yaml and configures on the hosts everything that is required. Within a few minutes the cluster should be up and running.

Connect to the cluster

Launchpad will present the information needed to connect to the cluster at the end of the installation procedure. For example:

INFO[0021] ==> Running phase: MKE cluster info
INFO[0021] Cluster is now configured.  You can access your admin UIs at:
INFO[0021] MKE cluster admin UI: https://test-mke-cluster-master-lb-895b79a08e57c67b.elb.eu-north-1.example.com
INFO[0021] You can also download the admin client bundle with the following command: launchpad client-config

By default, the administrator username is admin. If the password is not supplied through the launchpad.yaml installFlags option (for example, --admin-password=supersecret), the generated admin password is displayed in the install flow.

INFO[0083] 127.0.0.1:  time="2020-05-26T05:25:12Z" level=info msg= "Generated random admin password: wJm-TzIzQrRNx7d1fWMdcscu_1pN5Xs0"

Important

The addition or removal of nodes in subsequent Launchpad runs will fail if the password is not provided in the launchpad.yaml file.
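
To work with the cluster from the command line, download the admin client bundle and load its environment. The following is a sketch only: the bundle directory shown is illustrative and can differ between Launchpad versions, and my-mke is the cluster name from the earlier example:

launchpad client-config

# Change to the directory where the bundle was saved, then load it
cd ~/.mirantis-launchpad/cluster/my-mke/bundle/admin
eval "$(<env.sh)"
kubectl get nodes
docker node ls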

See also

Kubernetes

Networking considerations

Users will likely install Launchpad on a laptop or a VM with the intent of deploying MKE, MSR, or MCR onto VMs running on a public or private cloud that supports security groups for IP access control. Such an approach makes it fairly simple to configure networking in a way that provides adequate security and convenient access to the cluster for evaluation and experimentation.

The simplest way to configure the networking for a small, temporary cluster for evaluation:

  1. Create a new virtual subnet (or VPC and subnet) for hosts.

  2. Create a new security group called de_hosts (or another name of your choice) that permits inbound IPv4 traffic on all ports, either from the security group de_hosts, or from the new virtual subnet only.

  3. Create another new security group (for example, admit_me) that permits inbound IPv4 traffic from your deployer machine’s public IP address only (you can use a service such as whatismyip.com to determine your public IP).

  4. When launching hosts, attach them to the newly-created subnet and apply both new security groups.

  5. (Optional) Once you know the IPv4 addresses (public, or VPN-accessible private) of your nodes, unless you are using local DNS it makes sense to assign names to your hosts (for example, manager, worker1, worker2… and so on). Then, insert IP addresses and names in your hostfile, thus letting you (and Launchpad) refer to hosts by hostname instead of IP address.
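
For example, on a Linux or macOS deployer machine the /etc/hosts entries might look as follows; the addresses and names are placeholders:

203.0.113.10  manager
203.0.113.11  worker1
203.0.113.12  worker2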

Once the hosts are booted, SSH into them from your deployer machine with your private key. For example:

ssh -i /my/private/keyfile username@mynode

After that, determine whether they can access the internet. One method for doing this is by pinging a Google nameserver:

$ ping 8.8.8.8

Now, proceed with installing Launchpad and configuring an MKE, MSR, or MCR deployment. Once completed, use your deployer machine to access the MKE web UI, run kubectl (after authenticating to your cluster) and other utilities (for example, Postman, curl, and so on).

Use a VPN

A more secure way to manage networking is to connect your deployer machine to your VPC/subnet using a VPN, and to then modify the de_hosts security group to accept traffic on all ports from this source.

More deliberate network security

If you intend to deploy a cluster for longer-term evaluation, it makes sense to secure it more deliberately. In this case, a certain range of ports will need to be opened on hosts. Refer to the MKE documentation for details.

Use DNS

Launchpad can deploy certificate bundles obtained from a certificate provider to authenticate your cluster. These can be used in combination with DNS to allow you to reach your cluster securely on a fully-qualified domain name (FQDN). Refer to the MKE documentation for details.

Upgrade components with Launchpad

Launchpad allows users to upgrade their clusters with the launchpad apply reconciliation command. The tool discovers the current state of the cluster and its components, and upgrades what is needed.

Upgrade Mirantis Container Runtime

  1. Change the MCR version in the launchpad.yaml file.

    apiVersion: launchpad.mirantis.com/mke/v1.4
    kind: mke
    metadata:
      name: <metadata-name>
    spec:
      hosts:
      - role: manager
        ssh:
          address: 10.0.0.1
      mcr:
        version: 20.10.0
    
  2. Run launchpad apply. Launchpad will upgrade MCR on all hosts in the following sequence:

    1. Launchpad upgrades the container runtime on the manager nodes one by one, so if there is more than one manager node, the remaining manager nodes are available while the first node is being updated.

    2. Once the first manager node is updated and is running again, the second is updated, and so on, until all of the manager nodes are running the new version of MCR.

    3. 10% of worker nodes are updated at a time, until all of the worker nodes are running the new version of MCR.

Upgrade MKE, MSR, and MCR (separately or collectively)

Upgrading to newer versions of MKE, MSR, and MCR is as easy as changing the version tags in the launchpad.yaml and running the launchpad apply command.

Note

Launchpad upgrades MKE on all nodes.

  1. Open the launchpad.yaml file.

  2. Update the version tags to the new version of the relevant components (see the example fragment after this procedure).

  3. Save launchpad.yaml.

  4. Run the launchpad apply command.

    Launchpad connects to the nodes to get the current version of each component, after which it upgrades each node as described in Upgrading Mirantis Container Runtime. This may take several minutes.
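
For example, the relevant launchpad.yaml fragment might look as follows; the version values shown are placeholders, and you should set them to your target releases:

spec:
  mke:
    version: "3.7.16"
  msr:
    version: "2.9.17"
  mcr:
    version: "23.0.12"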

Note

MKE and MSR upgrade paths require consecutive minor versions (for example, to upgrade from MKE 3.1.0 to MKE 3.3.0 it is necessary to upgrade from MKE 3.1.0 to MKE 3.2.0 first, and then upgrade from MKE 3.2.0 to MKE 3.3.0).

Manage nodes

The process of adding and removing nodes differs, depending on whether the affected nodes are Manager nodes, Worker nodes, or MSR nodes.

Manager Nodes

Swarm manager nodes use the Raft Consensus Algorithm to manage the swarm state. As such, it is advisable to have an understanding of some general Raft concepts in order to manage a swarm.

  • There is no limit on the number of manager nodes that can be deployed. The decision on how many manager nodes to implement comes down to a trade-off between performance and fault-tolerance. Adding manager nodes to a swarm makes the swarm more fault-tolerant, however additional manager nodes reduce write performance as more nodes must acknowledge proposals to update the swarm state (which means more network round-trip traffic).

  • Raft requires a majority of managers, also referred to as the quorum, to agree on proposed updates to the swarm, such as node additions or removals. Membership operations are subject to the same constraints as state replication.

  • In addition, Manager nodes host the control plane etcd cluster, and thus making changes to the cluster requires a working etcd cluster with the majority of peers present and working.

  • It is highly advisable to run an odd number of peers in quorum-based systems: for example, a cluster of three managers tolerates the loss of one manager, while a cluster of five tolerates the loss of two. MKE only works when a majority can be formed, so once more than one node has been added it is not possible to (automatically) go back to having only one node.

Add Manager Nodes

Adding manager nodes is as simple as adding them to the launchpad.yaml file. Re-running launchpad apply configures MKE on the new node and also makes the necessary changes in the swarm and etcd cluster.

Remove Manager Nodes
  1. Remove the manager host from the launchpad.yaml file.

  2. Enable pruning by changing the prune setting to true in spec.cluster.prune.

    spec:
      cluster:
        prune: true
    
  3. Run the launchpad apply command.

  4. Remove the node in the infrastructure.

Worker Nodes

Add Worker Nodes

To add worker nodes, simply include them in the launchpad.yaml file. Re-running launchpad apply will configure everything on the new node and join it to the cluster.

Remove Worker Nodes
  1. Remove the host from the launchpad.yaml file.

  2. Enable pruning by changing the prune setting to true in spec.cluster.prune.

    spec:
      cluster:
        prune: true
    
  3. Run the launchpad apply command.

  4. Remove the node in the infrastructure.

MSR Nodes

MSR nodes are identical to worker nodes in that they participate in the MKE swarm. However, they should not be used as traditional worker nodes that run both MSR and cluster workloads.

Note

By default, MKE will prevent scheduling of containers on MSR nodes.

MSR forms its own cluster and quorum in addition to the swarm formed by MKE. There is no limit on the number of MSR nodes that can be configured, however the best practice is to limit the number to five. As with manager nodes, the decision on how many nodes to implement should be made with an understanding of the trade-off between performance and fault-tolerance (adding a larger number of nodes can incur severe performance penalties).

The quorum formed by MSR utilizes RethinkDB which, as with swarm, uses the Raft Consensus Algorithm.

Add MSR Nodes

To add MSR nodes, simply include them in the launchpad.yaml file with a host role of msr. When adding an MSR node, specify both the adminUsername and adminPassword in the spec.mke section of the launchpad.yaml file so that MSR knows which admin credentials to use.

spec:
  mke:
    adminUsername: admin
    adminPassword: passw0rd!

Next, re-run launchpad apply which will configure everything on the new node and join it into the cluster.

Remove MSR nodes
  1. Remove the host from the launchpad.yaml file.

  2. Enable pruning by changing the prune setting to true in spec.cluster.prune.

    spec:
      cluster:
        prune: true
    
  3. Run the launchpad apply command.

  4. Remove the node in the infrastructure.

Launchpad CLI reference

Global options

A number of optional arguments can be used with any Launchpad command.

  • --disable-telemetry: Disable sending analytics and telemetry data

  • --accept-license: Accept the end user license agreement

  • --disable-upgrade-check: Skip the check for Launchpad upgrades

  • --debug: Increase output verbosity

  • --help: Display command help

Commands

All Launchpad commands begin with launchpad or lp.

launchpad <command>

Command

Description

init

Initialize Launchpad.

Initializes the cluster config file (usually called launchpad.yaml).

Supported options:

n/a

apply

Initialize or upgrade Launchpad.

After initializing the cluster config file, applies the settings and initializes or upgrades a cluster.

Supported options:

--config

Path to a cluster config file, including the filename (default: launchpad.yaml, to read from standard input use: -).

--force

Continue installation when prerequisite validation fails (default: false)

client-config

Download client configuration.

The MKE client bundle contains a private and public key pair that authorizes Launchpad to interact with the MKE CLI.

Supported options:

--config

Path to a cluster config file, including the filename (default: launchpad.yaml, to read from standard input use: -).

Note that the configuration MUST include the MKE credentials (example follows):

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
spec:
  mke:
    adminUsername: admin
    adminPassword: password

reset

Reset or uninstall a cluster.

Resets or uninstalls an MKE cluster.

Supported options:

--config

Path to a cluster config file, including the filename (default: launchpad.yaml, to read from standard input use: -).

--force

Required when running non-interactively (default: false)

exec

Execute a command or run a remote terminal on a host.

Use Launchpad to run commands or an interactive terminal on the hosts in the configuration.

Supported options:

--config

Path to a cluster config file, including the filename (default: launchpad.yaml, to read from standard input use: -).

--target value

Target host (example: address[:port])

--interactive

Run interactive (default: false)

--first

Use the first target found in configuration (default: false)

--role value

Use the first target that has this role in configuration

-[command]

The command to run. When blank, will run the default shell.

describe

Presents basic information that correlates to the command target.

When the launchpad describe hosts command is run, the information delivered includes the IP address, the internal IP, the host name, the set role, the operating system, and the MCR version of each host. When launchpad describe mke or launchpad describe msr is run, the command returns the product version number for the targeted product, as well as the URL of the administration user interface.

Supported options:

--config

Path to a cluster config file, including the filename (default: launchpad.yaml, to read from standard input use: -).

-[report name]

currently supported reports: config, mke, msr

register

Registers a user.

Supported options:

--name

User’s name.

--email

User’s email address.

--company

Name of user’s company.

--accept-license

Accept the end user license agreement.

completion

Generate shell auto-completions.

Generates auto-completion scripts for the specified shell.

Supported options:

--shell

Generates completions for the shell specified following the option.

Installing the completion scripts:

Bash:

$ launchpad completion -s bash > \
/etc/bash_completion.d/launchpad
$ source /etc/bash_completion.d/launchpad

Zsh:

$ launchpad completion -s zsh > \
/usr/local/share/zsh/site-functions/_launchpad
$ source /usr/local/share/zsh/site-functions/_launchpad

Fish:

$ launchpad completion -s fish > \
~/.config/fish/completions/launchpad.fish
$ source ~/.config/fish/completions/launchpad.fish

Launchpad Configuration File

Mirantis Launchpad cluster configuration is presented in YAML format. launchpad.yaml is the file’s default name, though you can edit this name as necessary using any common text editor.

Sample Launchpad Configuration File

The following launchpad.yaml example uses every possible configuration option.

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke+msr
metadata:
  name: mycluster
spec:
  hosts:
  - role: manager
    hooks:
      apply:
        before:
          - ls -al > test.txt
        after:
          - cat test.txt
    ssh:
      address: 10.0.0.1
      user: myuser
      port: 22
      keyPath: ~/.ssh/id_rsa
    privateInterface: eth0
    environment:
      http_proxy: http://example.com
      NO_PROXY: 10.0.0.*
    mcrConfig:
      debug: true
      log-opts:
        max-size: 10m
        max-file: "3"
  - role: worker
    winRM:
      address: 10.0.0.2
      user: myuser
      password: abcd1234
      port: 5986
      useHTTPS: true
      insecure: false
      useNTLM: false
      caCertPath: ~/.certs/cacert.pem
      certPath: ~/.certs/cert.pem
      keyPath: ~/.certs/key.pem
  - role: msr
    imageDir: ./msr-images
    ssh:
      address: 10.0.0.3
      user: myuser
      port: 22
      keyPath: ~/.ssh/id_rsa
  - role: worker
    localhost:
      enabled: true
  mke:
    version: "3.7.16"
    imageRepo: "docker.io/mirantis"
    adminUsername: admin
    adminPassword: "$MKE_ADMIN_PASSWORD"
    installFlags:
    - "--default-node-orchestrator=kubernetes"
    licenseFilePath: ./docker-enterprise.lic
    configFile: ./mke-config.toml
    configData: |-
      [scheduling_configuration]
        default_node_orchestrator = "kubernetes"
  msr:
    version: "2.9.17"
    imageRepo: "docker.io/mirantis"
    installFlags:
    - --dtr-external-url dtr.example.com
    - --ucp-insecure-tls
    replicaIDs: sequential
  mcr:
    version: "23.0.12"
    channel: stable
    repoURL: https://repos.mirantis.com
    installURLLinux: https://get.mirantis.com/
    installURLWindows: https://get.mirantis.com/install.ps1
  cluster:
    prune: true

Note

Launchpad follows Kubernetes-style versioning and grouping in its configuration.

Environment variable substitution

In reading the configuration file, Launchpad will replace any strings that begin with a dollar sign with values from the local host’s environment variables. For example:

apiVersion: launchpad.mirantis.com/mke/v1.4
kind: mke
spec:
  mke:
    installFlags:
    - --admin-password="$MKE_ADMIN_PASSWORD"

Simple bash-like expressions are supported.

  • ${var}: Value of var (same as $var)

  • ${var-$DEFAULT}: If var is not set, evaluate the expression as $DEFAULT

  • ${var:-$DEFAULT}: If var is not set or is empty, evaluate the expression as $DEFAULT

  • ${var=$DEFAULT}: If var is not set, evaluate the expression as $DEFAULT

  • ${var:=$DEFAULT}: If var is not set or is empty, evaluate the expression as $DEFAULT

  • ${var+$OTHER}: If var is set, evaluate the expression as $OTHER, otherwise as an empty string

  • ${var:+$OTHER}: If var is set, evaluate the expression as $OTHER, otherwise as an empty string

  • $$var: Escape expression; the result is $var
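
For example, assuming the configuration fragment above, you can set the variable in the shell before running Launchpad:

export MKE_ADMIN_PASSWORD=supersecret
launchpad apply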

Key detail

Comprehensive information follows for each of the top-level Launchpad configuration file (launchpad.yaml) keys: apiVersion, kind, metadata, spec, cluster

apiVersion

The latest API version is launchpad.mirantis.com/mke/v1.4, though earlier configuration file versions are also likely to work without changes (without any features added by more recent versions).

kind

mke and mke+msr are currently supported.

metadata
name

Name of the cluster to be created. Currently affects only Launchpad internal storage paths (for example, for client bundles and log files).

spec

The specification for the cluster (hosts, mke, msr, mcr).

hosts

The machines that clusters run on are hosts.

privateInterface

Private network address for the configured network interface (default: eth0)

role

Role of the machine in the cluster. Possible values are:

  • manager

  • worker

  • msr

environment

Key-value pairs in YAML mapping syntax. Values are updated to host environment (optional)

mcrConfig

Mirantis Container Runtime configuration in YAML mapping syntax, will be converted to daemon.json (optional)

hooks

Hooks configuration for running commands before or after stages (optional)

imageDir

Path to a directory containing .tar/.tar.gz files produced by docker save. The images from that directory will be uploaded and docker load is used to load them.

sudodocker

Flag indicating whether Docker should be run with sudo. When set to true on Linux hosts, Docker commands will be run with sudo, and the user will not be added to the machine docker group.

Host connection options

Option type

Options

ssh (Secure Shell)

  • address: SSH connection address

  • user: User to log in as (default: root)

  • port: Host’s ssh port (default: 22)

  • keyPath: A local file path to an ssh private key file (default: ~/.ssh/id_rsa)

winRM (Windows Remote Management)

  • address: WinRM connection address

  • user: Windows account username (default: Administrator)

  • password: User account password

  • port: Host’s winRM listening port (default: 5986)

  • useHTTPS: Set true to use HTTPS protocol. When false, plain HTTP is used. (default: false)

  • insecure: Set to true to ignore SSL certificate validation errors (default: false)

  • useNTLM: Set true to use NTLM (default: false)

  • caCertPath: Path to CA Certificate file (optional)

  • certPath: Path to Certificate file (optional)

  • keyPath: Path to Key file (optional)

localhost

  • enabled: Set to true to enable.

Hooks configuration options

Option type

Options

apply

  • before: List of commands to run on the host before the “Preparing host” phase (optional)

  • after: List of commands to run on the host before the “Disconnect” phase when the apply was successful (optional)

reset

  • before: List of commands to run on the host before the “Uninstall” phase (optional)

  • after: List of commands to run on the host before the “Disconnect” phase when the reset was successful (optional)

mke

Specify options for the MKE cluster.

Options

Description

version

Version of MKE to install or upgrade to (default: 3.3.7)

imageRepo

The image repository to use for MKE installation (default: docker.io/mirantis)

adminUsername

MKE administrator username (default: admin)

adminPassword

MKE administrator password (default: auto-generate)

installFlags

Custom installation flags for MKE installation.

upgradeFlags

Optional. Custom upgrade flags for MKE upgrade. Obtain a list of supported installation options for a specific MKE version by running the installer container with docker run -t -i --rm mirantis/ucp:3.7.16 upgrade --help.

licenseFilePath

Optional. A path to the MKE license file.

configFile

Optional. The initial full cluster configuration file.

configData

Optional. The initial full cluster configuration file in embedded “heredoc” syntax. Heredoc syntax allows you to define a multiline string while maintaining the original formatting and indentation.

cloud

Optional. Cloud provider configuration.

Note

The cloud option is valid only for MKE versions prior to 3.7.x.

  • provider: Provider name (currently Azure and OpenStack (MKE 3.3.3+) are supported)

  • configFile: Path to cloud provider configuration file on local machine

  • configData: Inlined cloud provider configuration

swarmInstallFlags

Optional. Custom flags for Swarm initialization

swarmUpdateCommands

Optional. Custom commands to run after the Swarm initialization

caCertPath, certPath, and keyPath (each followed by <path to file>), or caCertData, certData, and keyData (each followed by <PEM encoded string>)

Required components for configuring the MKE UI to use custom SSL certificates on its Ingress. You must specify all components:

  • CA Certificate

  • SSL Certificate

  • Private Key

Launchpad accepts either inline PEM-encoded data or a file path, depending on the provided argument.

Note

If MKE already uses custom certificates, Launchpad can rotate the certificates during upgrade.

Important

Unless a password is provided, the MKE installer automatically generates an administrator password. This password will display in clear text in the output and persist in the logs. Subsequent runs will fail if this automatically generated password is not configured in the launchpad.yaml file.

msr

Specify options for the MSR cluster.

Options

Description

version

Version of MSR to install or upgrade to (default: 2.8.5)

imageRepo

The image repository to use for MSR installation (default: docker.io/mirantis)

installFlags

Optional. Custom installation flags for MSR installation. Obtain a list of supported installation options for a specific MSR version by running the installer container with docker run -t -i --rm mirantis/dtr:3.1.5 install --help.

Note

Launchpad inherits the MKE flags that MSR needs to perform an installation, and to join or remove nodes. Thus, there is no need to include the following install flags in the installFlags section of msr:

  • --ucp-username (inherited from MKE’s --admin-username flag or spec.mke.adminUsername)

  • --ucp-password (inherited from MKE’s --admin-password flag or spec.mke.adminPassword)

  • --ucp-url (inherited from MKE’s --san flag or intelligently selected based on other configuration variables)

upgradeFlags

Optional. Custom upgrade flags for MSR upgrade. Obtain a list of supported installation options for a specific MSR version by running the installer container with docker run -t -i --rm mirantis/dtr:3.1.5 upgrade --help.

replicaIDs

Set to sequential to generate sequential replica IDs for cluster members, for example, 000000000001, 000000000002, and so on (default: random)

mcr

Specify options for MCR installation.

Note

Customers take a risk in opting to use and manage their own install scripts for MCR instead of the install script that Mirantis hosts at get.mirantis.com. Mirantis manages this script as necessary to support MCR installations on demand, and can change it as needed to resolve issues and to support new features. As such, customers who opt to use their own script will need to monitor the Mirantis script to ensure compatibility.

Options

Description

version

Version of MCR to install or upgrade to. (default 20.10.0)

channel

Installation channel to use. One of test or prod (optional).

repoURL

Repository URL to use for MCR installation. (optional)

installURLLinux

Location from which to download the initial installer script for Linux hosts (local paths can also be used). (default: https://get.mirantis.com/)

installURLWindows

Location from which to download the initial installer script for Windows hosts (local paths can be used). (default: https://get.mirantis.com/install.ps1)

Note

In most scenarios, it is not necessary to specify repoURL, installURLLinux, or installURLWindows, as these are usually only used when installing from a non-standard location (that is, a disconnected datacenter).

prune

Removes certain system paths that are created by MCR during uninstallation (for example, /var/lib/docker).

cluster

Specify options that do not pertain to any of the individual components.

Options

Description

prune

Set to true to remove nodes that are known by the cluster but not listed in the launchpad.yaml file.

See also

Kubernetes

MKEx

MKEx is an integrated container orchestration platform that is powered by an immutable Rocky Linux operating system, offering next-level security and reliability.

Note

An immutable Linux operating system is designed to be unchangeable following installation, with system files that are read-only, and limited only to those packages that are required to run the applications. Such an OS is more resistant to tampering and malware, and is well protected from accidental or malicious modification. Also, as updates or changes can only be made to an immutable OS by creating a new instance, such an OS is easier to maintain and troubleshoot.

Mirantis, in conjunction with our partner CIQ, worked to preassemble ostree-based Rocky Linux with Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine (MKE), to provide users with an immutable, atomic upgrade/rollback, versioning stack that offers a high degree of predictability and resiliency.

rpm-ostree is a hybrid image/package system for managing and deploying Linux-based operating systems. It combines the concepts of Git and traditional package management to provide a version-controlled approach to system updates and rollbacks. As with Git, rpm-ostree treats the operating system as an immutable tree of files, which enables you to atomically update or roll back the entire system.

MKEx installation

You can install MKEx either from a bootable ISO image or by way of Kickstart.

To install MKEx from a bootable ISO image:

Note

The bootable ISO image is quite similar to that of a Rocky Linux installation.

  1. Boot the ISO. The welcome screen will display.

  2. Select the language for the MKEx installation and click Continue. A warning message will display, stating that MKEx is pre-released software that is intended for development and testing purposes only.

  3. Click I want to proceed. The INSTALLATION SUMMARY screen will display, offering entry points to the various aspects of the MKEx installation.

    Installation phase

    Aspects

    LOCALIZATION

    • Keyboard

    • Language Support

    • Time & Date

    SYSTEM

    • Installation Destination

    • KDUMP

    • Network & Host Name

    • Security Policy

    USER SETTINGS

    • Root Password

    • User Creation

  4. Click Installation Destination to call the INSTALLATION DESTINATION screen. Review the setup. The default installation destination should suffice for testing purposes.

    Once you are certain the setting is correct, click DONE to return to the INSTALLATION SUMMARY screen.

  5. Next, click User Creation to call the CREATE USER screen.

  6. Configure a user for your MKEx installation, making sure to tick the Make this user administrator checkbox, and click DONE to return to the INSTALLATION SUMMARY screen.

  7. Next, click Network & Host Name to call the NETWORK & HOST NAME screen.

  8. Set the toggle to ON to enable the network connection, update the Host Name if necessary, and click DONE to return to the INSTALLATION SUMMARY screen.

  9. Click Begin Installation. The INSTALLATION PROGRESS screen will display.

    Note

    The output may differ from any previous experience you have had with installing Rocky Linux. This is due to the use of an immutable operating system base rather than a traditional RPM-based OS.

  10. Once installation is complete, click Reboot System to boot the new image. Be aware that the initial boot will require time to load the MKE images by way of the network. Once the initial boot is complete, a login prompt will display.

  11. Log in to the console.

    Note

    SSH is disabled by default, however you can enable it for easier access.

  12. Verify the presence of the MKE image:

    Note

    Presence verification requires the use of the sudo command due to the locking down of the /var/run/docker.sock socket. The immutable operating system does not allow the use of the usermod command.

    • Users with external network access:

      sudo docker image ls
      
    • Users on an isolated system, without external network access:

      ls -l /usr/share/mke
      
  13. If the MKE image is present, skip ahead to the following step. If the MKE image is not present, run the following command to load it:

    sudo docker load -i /usr/share/mke/ucp_images_3.6.8.tar.gz
    
  14. Install MKE (an example invocation is sketched after this procedure).

  15. Log in to MKE, upload your license, and set up your worker nodes.

  16. Optional. Install additional software to your MKEx operating system to benefit from additional features. Be aware that unlike other RPM-based systems, the immutable MKEx-based Rocky image uses rpm-ostree to manage software.
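
For step 14, the following is a minimal sketch of an MKE install invocation. It assumes the 3.6.8 images loaded in the previous step, and <node-ip> is a placeholder for the address of the node; sudo is required because of the docker.sock lockdown noted above:

sudo docker container run --rm -it \
--name ucp \
-v /var/run/docker.sock:/var/run/docker.sock \
mirantis/ucp:3.6.8 install \
--host-address <node-ip> \
--interactive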

To install MKEx using Kickstart:

  1. Obtain a copy of the ostree-repo.tar.gz file and host it on a standard HTTP server.

  2. Copy the following Kickstart content to a file and host it on the HTTP server.

    ostreesetup --nogpg --osname=mkex --remote=mkex
    --url=<url-of-hosted-repo> --ref=mkex/8/x86_64/mke/3.7.0-devel
    bootloader --append='rw'
    
    %post --erroronfail
    rm -f /etc/ostree/remotes.d/rockylinux.conf
    ostree remote add --no-gpg-verify mkex <url-of-hosted-repo>
    ostree config set 'sysroot.readonly' 'true'
    %end
    
  3. Boot an Anaconda installer ISO of your choosing.

  4. During machine bootup, inject a cmdline parameter to instruct Anaconda to use the hosted kickstart. This can be done by editing the cmdline in the grub menu. When the grub screen displays, press the Tab key and append inst.ks=<url-of-hosted-kickstart>.

    The machine should boot into the Anaconda installer and automatically install as per the Kickstart instructions.

    Note

    The Kickstart provided here is not complete, as it only contains what is required for rpm-ostree. Be aware that it may be necessary to add commands for networking, partitioning, adding users, setting the root password, and so forth.

MKEx reference architecture

MKEx is an integrated stack that combines MKE container orchestration and MCR container engines in a productized configuration, delivered on a minimal version of RHEL-compatible, ostree-based Rocky Linux.

You can deploy MKEx configurations on either bare metal or virtual machines, from an ISO image that is assembled and validated by Mirantis. The image is available online, as well as in file form for air-gapped installation.

Mirantis Kubernetes Engine (MKE)

The MKE product documentation provides full detail on the MKE Reference Architecture.

ostree-based Rocky Linux

Mirantis, in conjunction with our partner CIQ, built an ostree-based Rocky Linux operating system with Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine (MKE), to provide users with an immutable, atomic upgrade/rollback, versioning stack that offers a high degree of predictability and resiliency.

The sshd service is disabled by default. System administrators can, however, enable it to access the node, and can disable it prior to installing the OS. With sshd disabled, users are unable to access the nodes, and must instead use Mirantis-provided debug Pods to troubleshoot MKE clusters.

Mirantis has configured rotating logs (100M) by default in /etc/docker/daemon.json, and system administrators can change the value as necessary.

To keep the footprint small and secure, only the required Linux packages are installed. System administrators can add custom packages or set specific kernel parameters through Ansible, or any other IaC software. Note, though, that the ansible-pull command is installed by default, to enable the use of Ansible outside of sshd.

Note

To ensure that the image is consistent, users should contact Mirantis support and request the inclusion of specific packages in the ISO image.

rpm-ostree operation

rpm-ostree provides a version-controlled approach to system updates and rollbacks. As with Git, rpm-ostree treats the operating system as an immutable tree of files, which enables you to atomically update or roll back the entire system.

Basic use

The basic core capabilities of rpm-ostree bring such concepts as version control and atomic updating to the management and maintenance of the operating system.

View status and version of deployment

The rpm-ostree status command provides the current system state as well as deployed operating system images. It displays the currently active deployment, its commit ID, and any pending upgrades.

Note

As with git status, you can use rpm-ostree status to better understand the status of your system and to learn of any pending changes.

To view basic status:

rpm-ostree status

Example system response:

State: idle
Deployments:
- mkex:mkex/8/x86_64/mcr/23.0-devel
    Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
    Commit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91

To view detailed status:

rpm-ostree status -v

Example system response:

State: idle
AutomaticUpdates: disabled
Deployments:
- mkex:mkex/8/x86_64/mcr/23.0-devel (index: 0)
    Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
    Commit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
      ├─ appstream (2023-04-26T15:44:31Z)
      ├─ baseos (2023-04-26T15:44:36Z)
      └─ docker-ee-stable-23.0 (2023-04-04T16:02:39Z)
     Staged: no
     StateRoot: mkex

To display the status and version of the booted deployment:

rpm-ostree status --booted --verbose

Perform system upgrades

Use the rpm-ostree upgrade command to update the system to the latest available operating system image.

Note

As with git pull in a Git repository, rpm-ostree upgrade pulls in the latest packages, dependencies, and configuration changes to update your system.

To perform a system upgrade:

rpm-ostree upgrade

Example system response:

Checking out tree abcdefg...done
Enabled rpm-md repositories: fedora-34
Updating metadata for 'fedora-34'...done
Delta RPMs reduced 50% of update size
Upgraded:
  package-1.2.3-4.fc34.x86_64
  Package-foo-2.0.1-1.fc34.x86_64
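
Optionally, depending on the rpm-ostree version in use, you can check whether an update is available without applying it:

rpm-ostree upgrade --check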

Revert to the previously booted tree

You can revert the system to a previously known state with the rpm-ostree rollback command. This command undoes any system changes made following the previous deployment, effectively rolling back the entire system to an earlier commit.

Note

The rpm-ostree rollback operation is similar to using git checkout in a Git repository to revert to a previous commit.

To revert to the previously booted tree:

Note

For example purposes, the target deployment shown is the original deployment without the additional tmux package installed.

  1. Roll back the current deployment to the target deployment.

    rpm-ostree rollback
    

    Example system response:

    Moving '0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91.1' to be
    first deployment
    
    Transaction complete; bootconfig swap: no; bootversion: boot.1.0, deployment count change: 0
    Removed:
      libevent-2.1.8-5.el8.x86_64
      tmux-2.7-1.el8.x86_64
    Changes queued for next boot. Run "systemctl reboot" to start a reboot
    
  2. Reboot the system to use the target deployment:

    systemctl reboot
    
  3. To switch the deployment and initiate a system reboot in a single operation, run the rollback command with the -r flag:

    rpm-ostree rollback -r
    

    Example system response:

    Moving '8b6ffd83adcf6c6a37941cade3a8ffea0a80d3d311e5ee8b7f0a074be6143ca9.0' to be
    first deployment
    Transaction complete; bootconfig swap: no; bootversion: boot.1.1,
    deployment count change: 0
    
Overlay additional packages

You can use the rpm-ostree install command to install additional packages on top of the base operating system image. The command adds the packages you specify to the current deployment, thus enabling you to extend your system functionality.

To overlay additional packages:

  1. Install the tmux package:

    rpm-ostree install tmux
    

    Example system response:

    Checking out tree 0572d28... done
    Enabled rpm-md repositories: appstream baseos extras
    Updating metadata for 'appstream'... done
    Updating metadata for 'baseos'... done
    ...<OUTPUT TRUNCATED>...
    Staging deployment... done
    Added:
      libevent-2.1.8-5.el8.x86_64
      tmux-2.7-1.el8.x86_64
    Changes queued for next boot. Run "systemctl reboot" to start a reboot
    
  2. Once the package is installed, verify the status of the current deployment:

    rpm-ostree status
    

    Example system response:

    State: idle
    Deployments:
    - mkex:mkex/8/x86_64/mcr/23.0-devel
        Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
        BaseCommit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
          Diff: 2 added
        LayeredPackages: tmux
    - mkex:mkex/8/x86_64/mcr/23.0-devel
        Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
        Commit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
    

    The system response shows that the current deployment has been modified and there are now two deployments in total.

  3. Reboot the system, using systemctl reboot, to boot into the deployment that has the newly installed tmux package.

  4. Verify the status of the running deployed image following the system reboot:

    rpm-ostree status
    

    Example system response:

    State: idle
    Deployments:
    - mkex:mkex/8/x86_64/mcr/23.0-devel
        Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
        BaseCommit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
        LayeredPackages: tmux
    - mkex:mkex/8/x86_64/mcr/23.0-devel
        Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
        Commit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
    
Query the RPM database

Use the rpm-ostree db command to query and manipulate the package information stored in the database of a rpm-ostree system.

To show the package changes between two commits:

rpm-ostree db diff

Example system response:

ostree diff commit from: rollback deployment
(0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91)

ostree diff commit to: booted deployment
(8b6ffd83adcf6c6a37941cade3a8ffea0a80d3d311e5ee8b7f0a074be6143ca9)
Added:
  libevent-2.1.8-5.el8.x86_64
  tmux-2.7-1.el8.x86_64

To list packages within commits:

rpm-ostree db list
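
For example, to list the packages contained in a specific commit, pass a commit ID or ref to the command (the commit ID below is the one used throughout the earlier examples):

rpm-ostree db list 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91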

Clear unused deployments

Use the rpm-ostree cleanup command to remove old or unused deployments and free up disk space. Following invocation, a configurable number of recent deployments is retained and the rest are deleted.

Note

The rpm-ostree cleanup operation is similar to using git gc in a Git repository, in that it serves to optimize disk usage and keep the system clean.

To clear temporary files and leave deployments unchanged:

rpm-ostree cleanup --base

To remove a pending deployment:

rpm-ostree cleanup --pending

Example system response:

Transaction complete; bootconfig swap: yes; bootversion: boot.0.1, deployment count change:
-1

Query or modify kernel arguments

Use the rpm-ostree kargs command to manage kernel arguments (kargs) for the system. With this command, you can modify the kernel command-line parameters for the next reboot, as well as customize the kernel parameters for specific deployments.

To modify kernel arguments:

  1. View the kernel arguments for the currently booted deployment:

    rpm-ostree kargs
    

    Example system response:

    rw crashkernel=auto resume=/dev/mapper/rlo-swap rd.lvm.lv=rlo/root rd.lvm.lv=rlo/swap root=/dev/mapper/rlo-root \
    ostree=/ostree/boot.1/mkex/6f043f707b4bdd83304c977d39ead06da68e85e7786e42e7da673e53f3a7a2ca/0
    
  2. Add a new custom argument to the deployment, for example foo=bar:

    rpm-ostree kargs --append="foo=bar"
    

    Example system response:

    Checking out tree 0572d28... done
    Resolving dependencies... done
    Checking out packages... done
    Running pre scripts... done
    Running post scripts... done
    Running posttrans scripts... done
    Writing rpmdb... done
    Writing OSTree commit... done
    Staging deployment... done
    Freed: 67.3 MB (pkgcache branches: 0)
    Changes queued for next boot. Run "systemctl reboot" to start a reboot
    
  3. Review the pending change prior to rebooting the system:

    rpm-ostree kargs
    

    Example system response:

    rw crashkernel=auto resume=/dev/mapper/rlo-swap rd.lvm.lv=rlo/root rd.lvm.lv=rlo/swap \
    root=/dev/mapper/rlo-root ostree=/ostree/boot.1/mkex/6f043f707b4bdd83304c977d39ead06da68e85e7786e42e7da673e53f3a7a2ca/0 \
    foo=bar
    
  4. Reboot the system so that it boots into the deployment with the custom kernel argument:

    systemctl reboot
    
  5. Verify that the changes are in effect on the booted deployment:

    cat /proc/cmdline
    

    Example system response:

    BOOT_IMAGE=(hd0,msdos1)/ostree/mkex-6f043f707b4bdd83304c977d39ead06da68e85e7786e42e7da673e53f3a7a2ca/vmlinuz-4.18.0-425.19.2.el8_7.x86_64 \
    rw crashkernel=auto resume=/dev/mapper/rlo-swap rd.lvm.lv=rlo/root rd.lvm.lv=rlo/swap root=/dev/mapper/rlo-root \
    ostree=/ostree/boot.0/mkex/6f043f707b4bdd83304c977d39ead06da68e85e7786e42e7da673e53f3a7a2ca/0 \
    foo=bar
    

To delete the custom kernel argument that was appended to the booted deployment and force the system to automatically reboot after the command completes:

rpm-ostree kargs --delete="foo=bar" --reboot

Example system response:

Checking out tree 0572d28... done
Resolving dependencies... done
...<OUTPUT TRUNCATED>...
Writing OSTree commit... done
Staging deployment... done
Freed: 67.3 MB (pkgcache branches: 0)

Switch to a different tree

Use the rpm-ostree rebase command to update the base operating system image of the system.

Note

The rpm-ostree rebase operation is similar to using git rebase in a Git repository, in that it pulls in a newer version of the base operating system image and updates the system accordingly.

To update the base operating system image:

Note

For example purposes, mkex:mkex/8/x86_64/mcr/20.10-devel is the different base operating system image maintained in the repository.

  1. Rebase the current deployment on a different base operating system image maintained in this repository:

    rpm-ostree rebase mkex:mkex/8/x86_64/mcr/20.10-devel
    

    Example system response:

    Writing objects: 1
    Writing objects: 1... done
    Checking out tree 2c92ce7... done
    ...<OUTPUT TRUNCATED>...
    Downgraded:
      docker-ee 3:23.0.3-3.el8 -> 3:20.10.16-3.el8
      docker-ee-cli 1:23.0.3-3.el8 -> 1:20.10.16-3.el8
      docker-ee-rootless-extras 23.0.3-3.el8 -> 20.10.16-3.el8
    Changes queued for next boot. Run "systemctl reboot" to start a reboot
    

    Note

    You can use the ostree remote refs mkex command to list the remote refs that are available to your system.

  2. View the status of deployments on the system:

    rpm-ostree status
    

    Example system response:

    State: idle
    Deployments:
    - mkex:mkex/8/x86_64/mcr/20.10-devel
      Version: MKEx/8/MCR/20.10-devel 2023.05.0 (2023-05-01T20:59:26Z)
      BaseCommit: 2c92ce7c0c06af56dfba57f24529ffa94b59cb5103126024126fa5fb72966b87
        Diff: 3 downgraded
      LayeredPackages: tmux
    - mkex:mkex/8/x86_64/mcr/23.0-devel
      Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
      BaseCommit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
      LayeredPackages: tmux
    ...<OUTPUT TRUNCATED>...
    
  3. Reboot the system to begin using the new base image:

    systemctl reboot
    
  4. Verify the rebase operation by checking the status of the newly booted deployment:

    rpm-ostree status --booted
    

    Example system response:

    State: idle
    BootedDeployment:
    - mkex:mkex/8/x86_64/mcr/20.10-devel
      Version: MKEx/8/MCR/20.10-devel 2023.05.0 (2023-05-01T20:59:26Z)
      BaseCommit: 2c92ce7c0c06af56dfba57f24529ffa94b59cb5103126024126fa5fb72966b87
      LayeredPackages: tmux
    
Using experimental commands

Use the rpm-ostree ex command to execute experimental commands in the context of a specific deployment. The invoked command runs using the files and environment of the specified deployment.

Example experimental command: To view the rpm-ostree history of the system:

rpm-ostree ex history

Example system response:

NOTICE: Experimental commands are subject to change.
BootTimestamp: 2023-06-01T03:40:34Z (19min ago)
└─ BootCount: 2; first booted on 2023-06-01T03:30:12Z (29min ago)
CreateTimestamp: 2023-05-31T22:56:15Z (5h 3min ago)
CreateCommand: install tmux
  mkex:mkex/8/x86_64/mcr/23.0-devel
    Version: MKEx/8/MCR/23.0-devel 2023.05.0 (2023-05-01T21:01:22Z)
    BaseCommit: 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91
    LayeredPackages: tmux
...<OUTPUT TRUNCATED>...

Deploy a specific commit

Use the rpm-ostree deploy command to trigger the deployment of a specific operating system image on the system.

Note

Similar to checking out a specific commit in Git, the rpm-ostree deploy command enables you to switch to a different version of the operating system by specifying the desired deployment through its commit ID.

To deploy the image with the commit ID:

Note

For example purposes, 0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91 serves as the commit ID.

rpm-ostree deploy \
0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91

Example system response:

Validating checksum
'0572d2897c74afb1d123461728e17e7204cb1f0a55fb7f4c13c1fda87de50d91'
1 metadata, 0 content objects fetched; 153 B transferred in 0 seconds; 0 bytes content written
2 metadata, 0 content objects fetched; 306 B transferred in 0 seconds; 0 bytes content written
Checking out tree 0572d28... done
...<OUTPUT TRUNCATED>...
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Run "systemctl reboot" to start a reboot

Remove overlayed packages

Use the rpm-ostree uninstall command to remove overlayed packages from the system. The command removes the specified packages from the current deployment, similar to reverting a commit in a Git repository.

Uninstall the tmux package previously added to the current deployment:

rpm-ostree uninstall tmux

Example system response:

Staging deployment... done
Removed:
  libevent-2.1.8-5.el8.x86_64
  tmux-2.7-1.el8.x86_64
Changes queued for next boot. Run "systemctl reboot" to start a reboot

Advanced use

Deploy package layering

Warning

Mirantis recommends that package layering be used only for debugging specific deployments, and not to manage system state at scale.

With package layering, you can create an overlay deployment with the rpm-ostree ex command and install packages into that overlay. This allows you to install additional packages on top of the base operating system image without modifying the base image itself, thus permitting you to test new customizations and experimental changes against the booted deployment without risk.

Note

For example purposes, the procedure detailed herein will install the telnet package.

  1. Install the package.

    rpm-ostree install telnet
    

    Example system response:

    Checking out tree 0572d28... done
    ...<OUTPUT TRUNCATED>...
    Staging deployment... done
    Added:
      telnet-1:0.17-76.el8.x86_64
    Changes queued for next boot. Run "systemctl reboot" to start a reboot
    
  2. Attempt to run the newly installed package, which is not yet available in the booted deployment:

    telnet
    

    Example system response:

    -bash: telnet: command not found
    
  3. Create a transient overlayfs filesystem for the booted /usr, and synchronize the changes from the source to the booted filesystem tree:

    rpm-ostree ex apply-live
    

    Example system response:

    NOTICE: Experimental commands are subject to change.
    Computing /etc diff to preserve... done
    Updating /usr... done
    Updating /etc... done
    Running systemd-tmpfiles for /run and /var... done
    Added:
      telnet-1:0.17-76.el8.x86_64
    Successfully updated running filesystem tree.
    
Manage system rollbacks

One of the key advantages of rpm-ostree is its ability to perform rollbacks to previous system states. Thus, whenever you encounter issues or unexpected behavior, you can invoke the rpm-ostree rollback command to revert the system to a known, stable deployment.

rpm-ostree rollback

To effectively manage rollbacks, Mirantis recommends that you maintain multiple deployments, each with a different version or configuration. Using this approach, you can switch between deployments and perform rollbacks as necessary.
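
One way to retain a known-good deployment, assuming your ostree version provides the pin command, is to pin it by its index as reported by rpm-ostree status (index 0 is the first deployment listed):

ostree admin pin 0

A pinned deployment is not removed by cleanup operations until it is unpinned.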

OSTree components detail

OSTree is designed as a client-server system, where the server hosts the repositories containing the operating system images or deployments, and the clients interact with the server to fetch and manage these deployments.

OSTree server

The server side of OSTree is responsible for maintaining and serving the operating system images or deployments. Typically, the OSTree server is a repository hosting server.

The server-side tools used for maintaining the repositories and serving the deployments may vary depending on the implementation. Commonly used tools include:

  • OSTree

    The core tool that manages the repository and handles the versioning and branching of the operating system deployments.

  • Repository hosting software

    Can be software such as ostree-repo, or a dedicated repository management system such as Pulp or Artifactory.

  • Web server

    A web server, such as Apache or NGINX, that handles the HTTP(S) communication through which clients access the repository. Typically, the server resides on centralized infrastructure that is accessible to the clients over the network, and it can be hosted on-premises or in the cloud, depending on the deployment requirements.

OSTree client

The client side of OSTree is responsible for interacting with the server to fetch and manage the operating system deployments on individual systems.

System administrators and end users use the client tools and utilities to perform package upgrades, rollbacks, installations, and various other operations on the deployments.

The primary client-side tool is typically the rpm-ostree command-line tool, which provides a set of commands for managing the deployments.

Alongside OSTree, depending on specific distribution or system requirements, additional tools or package managers can be used. For example, you can use DNF to manage traditional packages.

rpm-ostree troubleshooting

When working with rpm-ostree, it is crucial to be aware of common issues that can occur and to know how to troubleshoot them effectively.

Common problems during tool usage include conflicts during upgrades, disk space limitations, and failed deployments. To troubleshoot these issues, consider basic techniques such as reviewing logs with journalctl, monitoring disk space, inspecting deployment health using rpm-ostree status, and examining error messages for specific operations.

Checking the system logs with journalctl

The journalctl command enables you to view the logs specifically related to the rpm-ostreed service. It provides valuable information about system events, errors, and warnings related to rpm-ostree.

When examining the logs, search for any error messages, warnings, or indications of failed operations. These can provide insights into the root cause of the issue and further troubleshooting steps.

To view and follow system logs for rpm-ostreed.service:

journalctl -f -u rpm-ostreed.service

Example system response:

-- Logs begin at Sun 2023-05-28 22:52:52 EDT. --
Jun 01 19:34:30 localhost.localdomain rpm-ostree[1323]: libostree HTTP error from remote MKEx for .....

To view system logs with a priority of ERR or higher for rpm-ostreed.service:

journalctl -p err -f -u rpm-ostreed.service

Examining error messages for specific operations

When performing operations such as upgrades, rollbacks, and installations with rpm-ostree, pay attention to any error messages that the system displays.

Error messages often provide specific details about the issue encountered, such as package conflicts, missing dependencies, or connectivity problems with repositories.

To examine error messages, run the desired rpm-ostree command and carefully read the output. Look for any error indications or specific error codes mentioned. These can help narrow down the issue and guide the troubleshooting process.

Example error messages during an upgrade:

Error: Can't upgrade to commit abcdefg: This package conflicts with package-xyz-1.0.0-1.x86_64.

The above error message indicates a package conflict preventing the upgrade. You can perform further investigation by checking the package versions, dependencies, and resolving the conflict accordingly.

Cancel an active transaction

The rpm-ostree tool provides a cancel command that you can use to cancel an active transaction. This can come in handy in situations where you, for example, accidentally start an upgrade or rebase of a large deployment and want to cancel the operation:

rpm-ostree cancel

Updating and managing repositories

To ensure a smooth and reliable experience with rpm-ostree, always keep the repositories up to date. This involves regularly updating metadata and refreshing repository information.

To update the metadata for a repository:

rpm-ostree refresh-md

This fetches the latest information about available packages, dependencies, and updates. Similar to fetching remote branches in Git, refreshing the metadata ensures that you have the latest information from the repositories.
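
If you suspect that the cached metadata is stale, you can force a refresh, assuming the flag is available in your rpm-ostree version:

rpm-ostree refresh-md --force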

Debug the cluster without SSH access

The MKEx ISO images include the mirantis/mkexdebug debug image, which you can use to debug your cluster.

  1. Following MKEx setup, upload the mirantis/mkexdebug image to a private, secure repository.

  2. Run the kubectl debug command to attach a dedicated troubleshooting container to the node you want to check:

    kubectl debug node/[node name] -it --image=mirantis/mkexdebug
    

    Once the troubleshooting container is in place, you can run commands against it to check the node.
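
    Because kubectl debug for nodes mounts the node root filesystem into the troubleshooting container at /host, you can typically inspect the node from within the container, for example as follows (the available shells and tools depend on the contents of the debug image):

    chroot /host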

Get support

Mirantis Kubernetes Engine (MKE) subscriptions provide access to prioritized support for designated contacts from your company, agency, team, or organization. MKE service levels are based on your subscription level and the cloud or cluster that you designate in your technical support case. Our support offerings are described on the Enterprise-Grade Cloud Native and Kubernetes Support page. You may inquire about Mirantis support subscriptions by using the contact us form.

The CloudCare Portal is the primary way that Mirantis interacts with customers who are experiencing technical issues. Access to the CloudCare Portal requires prior authorization by your company, agency, team, or organization, and a brief email verification step. After Mirantis sets up its backend systems at the start of the support subscription, a designated administrator at your company, agency, team, or organization, can designate additional contacts. If you have not already received and verified an invitation to our CloudCare Portal, contact your local designated administrator, who can add you to the list of designated contacts. Most companies, agencies, teams, and organizations have multiple designated administrators for the CloudCare Portal, and these are often the persons most closely involved with the software. If you do not know who your local designated administrator is, or you are having problems accessing the CloudCare Portal, you may also send an email to Mirantis support at support@mirantis.com.

Once you have verified your contact details and changed your password, you and all of your colleagues will have access to all of the cases and resources purchased. Mirantis recommends that you retain your Welcome to Mirantis email, because it contains information on how to access the CloudCare Portal, guidance on submitting new cases, managing your resources, and other related issues.

We encourage all customers with technical problems to use the knowledge base, which you can access on the Knowledge tab of the CloudCare Portal. We also encourage you to review the MKE product documentation and release notes prior to filing a technical case, as the problem may already be fixed in a later release or a workaround solution provided to a problem experienced by other customers.

One of the features of the CloudCare Portal is the ability to associate cases with a specific MKE cluster. These are referred to in the Portal as “Clouds”. Mirantis pre-populates your customer account with one or more Clouds based on your subscription(s). You may also create and manage your Clouds to better match how you use your subscription.

Mirantis also recommends and encourages customers to file new cases based on a specific Cloud in your account. This is because most Clouds also have associated support entitlements, licenses, contacts, and cluster configurations. These greatly enhance the ability of Mirantis to support you in a timely manner.

You can locate the existing Clouds associated with your account by using the Clouds tab at the top of the portal home page. Navigate to the appropriate Cloud and click on the Cloud name. Once you have verified that the Cloud represents the correct MKE cluster and support entitlement, you can create a new case via the New Case button near the top of the Cloud page.

One of the key items required for technical support of most MKE cases is the support bundle. This is a compressed archive in ZIP format of configuration data and log files from the cluster. There are several ways to gather a support bundle, each described in the paragraphs below. After you obtain a support bundle, you can upload the bundle to your new technical support case by following the instructions in the Mirantis knowledge base, using the Detail view of your case.

Obtain a full-cluster support bundle using the MKE web UI

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> and click Support Bundle.

    It may take several minutes for the download to complete.

    Note

    The default name for the generated support bundle file is docker-support-<cluster-id>-YYYYmmdd-hh_mm_ss.zip. Mirantis suggests that you not alter the file name before submittal to the customer portal. However, if necessary, you can add a custom string between docker-support and <cluster-id>, as in: docker-support-MyProductionCluster-<cluster-id>-YYYYmmdd-hh_mm_ss.zip.

  3. Submit the support bundle to Mirantis Customer Support by clicking Share support bundle on the success prompt that displays when the support bundle finishes downloading.

  4. Fill in the Jira feedback dialog, and click Submit.

Obtain a full-cluster support bundle using the MKE API

  1. Create an environment variable with the user security token:

    export AUTHTOKEN=$(curl -sk -d \
    '{"username":"<username>","password":"<password>"}' \
    https://<mke-ip>/auth/login | jq -r .auth_token)
    
  2. Obtain a cluster-wide support bundle:

    curl -k -X POST -H "Authorization: Bearer $AUTHTOKEN" \
    -H "accept: application/zip" https://<mke-ip>/support \
    -o docker-support-$(date +%Y%m%d-%H_%M_%S).zip
    
  3. Add the --submit option to the support command to submit the support bundle to Mirantis Customer Support. The support bundle will be sent, along with the following information:

    • Cluster ID

    • MKE version

    • MCR version

    • OS/architecture

    • Cluster size

    For more information on the support command, refer to support.

Obtain a single-node support bundle using the CLI

  1. Use SSH to log into a node and run:

    MKE_VERSION=$((docker container inspect ucp-proxy \
    --format '{{index .Config.Labels "com.docker.ucp.version"}}' \
    2>/dev/null || echo -n 3.7.16)|tr -d [[:space:]])
    
    docker container run --rm \
      --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      --log-driver none \
      mirantis/ucp:${MKE_VERSION} \
      support > \
      docker-support-${HOSTNAME}-$(date +%Y%m%d-%H_%M_%S).tgz
    

    Important

    If SELinux is enabled, include the following additional flag: --security-opt label=disable.

    Note

    The CLI-derived support bundle only contains logs for the node on which you are running the command. If your MKE cluster is highly available, collect support bundles from all manager nodes.

  2. Add the --submit option to the support command to submit the support bundle to Mirantis Customer Support. The support bundle will be sent, along with the following information:

    • Cluster ID

    • MKE version

    • MCR version

    • OS/architecture

    • Cluster size

    For more information on the support command, refer to support.

Use PowerShell to obtain a support bundle

Run the following command on Windows worker nodes to collect the support information and automatically place it in a zip file:

$MKE_SUPPORT_DIR = Join-Path -Path (Get-Location) -ChildPath 'dsinfo'
$MKE_SUPPORT_ARCHIVE = Join-Path -Path (Get-Location) -ChildPath $('docker-support-' + (hostname) + '-' + (Get-Date -UFormat "%Y%m%d-%H_%M_%S") + '.zip')
$MKE_PROXY_CONTAINER = & docker container ls --filter "name=ucp-proxy" --format "{{.Image}}"
$MKE_REPO = if ($MKE_PROXY_CONTAINER) { ($MKE_PROXY_CONTAINER -split '/')[0] } else { 'mirantis' }
$MKE_VERSION = if ($MKE_PROXY_CONTAINER) { ($MKE_PROXY_CONTAINER -split ':')[1] } else { '3.6.0' }
docker container run --name windowssupport `
-e UTILITY_CONTAINER="$MKE_REPO/ucp-containerd-shim-process-win:$MKE_VERSION" `
-v \\.\pipe\docker_engine:\\.\pipe\docker_engine `
-v \\.\pipe\containerd-containerd:\\.\pipe\containerd-containerd `
-v 'C:\Windows\system32\winevt\logs:C:\eventlogs:ro' `
-v 'C:\Windows\Temp:C:\wintemp:ro' $MKE_REPO/ucp-dsinfo-win:$MKE_VERSION
docker cp windowssupport:'C:\dsinfo' .
docker rm -f windowssupport
Compress-Archive -Path $MKE_SUPPORT_DIR -DestinationPath $MKE_SUPPORT_ARCHIVE

API Reference

The Mirantis Kubernetes Engine (MKE) API is a REST API, available using HTTPS, that enables programmatic access to Swarm and Kubernetes resources managed by MKE. MKE exposes the full Mirantis Container Runtime API, so you can extend your existing code with MKE features. The API is secured with role-based access control (RBAC), and thus only authorized users can make changes and deploy applications to your cluster.

The MKE API is accessible through the same IP addresses and domain names that you use to access the MKE web UI. And as the API is the same one used by the MKE web UI, you can use it to programmatically do everything you can do from the MKE web UI.

The system manages Swarm resources through collections and Kubernetes resources through namespaces. For detailed information on these resource sets, refer to the RBAC core elements table in the Role-based access control documentation.

endpoint

Description

/roles

Allows you to enumerate and create custom permissions for accessing collections.

/accounts

Enables the management of users, teams, and organizations.

/configs

Provides access to the swarm configuration.
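
As an illustration of the pattern, the following sketch reuses the authentication flow shown earlier in the support bundle procedure to query the /roles endpoint; endpoint paths and response formats may vary between MKE versions:

export AUTHTOKEN=$(curl -sk -d \
'{"username":"<username>","password":"<password>"}' \
https://<mke-ip>/auth/login | jq -r .auth_token)

curl -sk -H "Authorization: Bearer $AUTHTOKEN" \
https://<mke-ip>/roles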

CLI Reference

The mirantis/ucp:3.x.y image includes commands that install and manage MKE on a Mirantis Container Runtime.

You can configure the commands using either flags or environment variables.

Environment variables can use either of the following types of syntax:

  • Pass the value from your shell using the docker container run -e VARIABLE_NAME syntax.

  • Specify the value explicitly from the command line using the docker container run -e VARIABLE_NAME=value syntax.
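
A hedged sketch of both forms, using $UCP_ADMIN_PASSWORD (one of the variables listed with the install command options) as an example:

# Pass the value from your shell:
export UCP_ADMIN_PASSWORD='<password>'
docker container run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e UCP_ADMIN_PASSWORD \
  mirantis/ucp:3.x.y \
  install <command-options>

# Or specify the value explicitly on the command line:
docker container run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e UCP_ADMIN_PASSWORD='<password>' \
  mirantis/ucp:3.x.y \
  install <command-options>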


To use the MKE CLI:

MKE CLI use requires that you name the mirantis/ucp:3.x.y image ucp and bind-mount the Docker daemon socket:

docker container run -it --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  <command> <command-options>

Additional information is available for each command by using the --help flag.
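
For example, to display the options for the backup command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  backup --help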

Note

To obtain the appropriate image, it may be necessary to use docker/ucp:3.x.y rather than mirantis/ucp:3.x.y, as older versions are associated with the docker organization. Review the images in the mirantis and docker organizations on Docker Hub to determine the correct organization.

backup

The backup command creates a backup of an MKE manager node. Specifically, the command creates a TAR file with the contents of the volumes used by the given MKE manager node and then prints it. You can then use the restore command to restore the data from an existing backup.

To create backups of a multi-node cluster, you only need to back up a single manager node. The restore operation will reconstitute a new MKE installation from the backup of any previous manager node.

Note

The backup contains private keys and other sensitive information. Use the --passphrase flag to encrypt the backup with PGP-compatible encryption or --no-passphrase to opt out of encrypting the backup. Mirantis does not recommend the latter option.


To use the backup command:

docker container run \
  --rm \
  --interactive \
  --name ucp \
  --log-driver none \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  backup <command-options> > backup.tar

Options

Option

Description

--debug, -D

Enables debug mode.

--file <filename>

Specifies the name of the file wherein the backup contents are written. This option requires that you bind-mount the file path to the container that is performing the backup. The file path must be relative to the container file tree. For example:

docker run <other options> \
  --mount type=bind,src=/home/user/backup:/backup \
  mirantis/ucp --file /backup/backup.tar

This option is ignored in interactive mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--include-logs

Stores an encrypted backup.log file in the mounted directory. Must be issued at the same time as the --file option. The default value is true.

--interactive, -i

Runs in interactive mode and prompts for configuration values.

--no-passphrase

Bypasses the option to encrypt the TAR file with a passphrase. Mirantis does not recommend this option.

--passphrase <value>

Encrypts the TAR file with a passphrase.
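
For example, combining the options above, a sketch of a backup that is written to a bind-mounted directory and encrypted with a passphrase (the path and passphrase are placeholders):

docker container run --rm \
  --name ucp \
  --log-driver none \
  --mount type=bind,src=/home/user/backup:/backup \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  backup --file /backup/backup.tar --passphrase "<passphrase>"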

SELinux

Installing MKE on a manager node with SELinux enabled at the daemon and the operating system levels requires that you include --security-opt label=disable with your backup command. This flag disables SELinux policies on the MKE container. The MKE container mounts and configures the Docker socket as part of the MKE container. Therefore, the MKE backup process fails with the following error if you neglect to include this flag:

FATA[0000] unable to get valid Docker client: unable to ping Docker
daemon: Got permission denied while trying to connect to the Docker
daemon socket at unix:///var/run/docker.sock:
Get http://%2Fvar%2Frun%2Fdocker.sock/_ping:
dial unix /var/run/docker.sock: connect: permission denied -
If SELinux is enabled on the Docker daemon, make sure you run
MKE with "docker run --security-opt label=disable -v /var/run/docker.sock:/var/run/docker.sock ..."

To back up MKE with SELinux enabled at the daemon level:

docker container run \
  --rm \
  --interactive \
  --name ucp \
  --security-opt label=disable \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  backup <command-options> > backup.tar

ca

Important

You must have access to a recent backup of your MKE instance to run the ca command.

With the ca command, you can make changes to the certificate material of the MKE root CA servers. Specifically, you can set the server material to rotate automatically, or you can replace it with your own certificate and private key.


The ca command must be run on a manager node:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  ca <command-options>

You can use the ca command with a provided Root CA certificate and key by bind-mounting these credentials to the CLI container at /ca/cert.pem and /ca/key.pem, respectively:

docker container run -it --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /path/to/cert.pem:/ca/cert.pem \
  -v /path/to/key.pem:/ca/key.pem \
  mirantis/ucp:3.x.y \
  ca <command-options>

The requirements for doing this are:

  • The MKE Cluster Root CA certificate must have swarm-ca as its common name.

  • The MKE Client Root CA certificate must have UCP Client Root CA as its common name.

  • The certificate must be a self-signed root certificate, and intermediate certificates are not allowed.

  • The certificate and key must be in PEM format without a passphrase.

  • The MKE etcd Root CA certificate must have MKE etcd Root CA as its common name.

Finally, to apply the certificates, you must reboot the manager nodes one at a time, making sure to reboot the leader node last.

Note

If there are unhealthy nodes in the cluster, CA rotation cannot complete. If the rotation is hanging, you can run the following command to determine whether any nodes are down or are otherwise unable to rotate TLS certificates:

docker node ls --format "{{.ID}} {{.Hostname}} {{.Status}} {{.TLSStatus}}"

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--cluster

Manipulates MKE Cluster Root CA.

--client

Manipulates MKE Client Root CA.

--rotate

Generates a new root CA certificate and key automatically.

Default: false

--force-recent-backup

Forces the CA change to occur even if the system does not have a recent backup.

Default: false

--etcd

Manipulates MKE etcd Root CA.
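
For example, to rotate the MKE Cluster Root CA material automatically, you might combine the flags above as follows (a sketch; verify the flags against your MKE version):

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  ca --cluster --rotate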

dump-certs

The dump-certs command prints the public certificates used by the MKE web server. Specifically, the command produces public certificates for the MKE web server running on the specified node. By default, it prints the contents of the ca.pem and cert.pem files.

Integrating MKE and MSR requires that you use this command with the --cluster --ca flags to configure MSR.


To use the dump-certs command:

docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  dump-certs <command-options>
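
For example, when configuring MSR as noted above, a sketch that captures the internal MKE swarm root CA certificate to a file:

docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  dump-certs --cluster --ca > ca.pem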

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--ca

Prints only the contents of the ca.pem file.

--cluster

Prints the internal MKE swarm root CA and certificate instead of the public server certificate.

--etcd

Prints the etcd server certificate. By default, the option prints the contents of both the ca.pem and cert.pem files. You can, though, print only the contents of ca.pem by using the option in conjunction with the --ca option.

example-config

The example-config command displays an example configuration file for MKE.


To use the example-config command:

docker container run --rm -i \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  example-config

id

The id command prints the ID of the MKE components that run on your MKE cluster. This ID matches the ID in the output of the docker info command, when issued while using a client bundle.


To use the id command:

docker container run --rm \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  id

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

images

The images command reviews the MKE images that are available on the specified node and pulls the ones that are missing.


To use the images command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  images <command-options>

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--list

Lists all the images used by MKE but does not pull them.

--pull <value>

Pulls the MKE images.

Valid values: always, missing, and never.

--registry-password <value>

Specifies the password to use when pulling images.

--registry-username <value>

Specifies the user name to use when pulling images.

--swarm-only

Returns only the images used in Swarm-only mode.

install

The install command installs MKE on the specified node. Specifically, the command initializes a new swarm, promotes the specified node into a manager node, and installs MKE.

The following customizations are possible when installing MKE:

  • Customize the MKE web server certificates:

    1. Create a volume named ucp-controller-server-certs.

    2. Copy the ca.pem, cert.pem, and key.pem files to the root directory.

    3. Run the install command with the --external-server-cert flag.

  • Customize the license used by MKE using one of the following options:

    • Bind mount the file at /config/docker_subscription.lic in the tool. For example:

      -v /path/to/my/config/docker_subscription.lic:/config/docker_subscription.lic
      
    • Specify the --license $(cat license.lic) option.

If you plan to join more nodes to the swarm, open the following ports in your firewall:

  • 443 or the value of --controller-port

  • 2376 or the value of --swarm-port

  • 2377 or the value of --swarm-grpc-port

  • 6443 or the value of --kube-apiserver-port

  • 179, 10250, 12376, 12379, 12380, 12381, 12382, 12383, 12384, 12385, 12386, 12387, 12388, 12390

  • 4789 (UDP) and 7946 (TCP/UDP) for overlay networking

For more information, refer to Open ports to incoming traffic.

Note

If you are installing MKE on a public cloud platform, refer to the cloud-specific MKE installation documentation for your platform.


To use the install command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  install <command-options>

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--interactive, -i

Runs in interactive mode, prompting for configuration values.

--admin-password <value>

Sets the MKE administrator password, $UCP_ADMIN_PASSWORD.

--admin-username <value>

Sets the MKE administrator user name, $UCP_ADMIN_USER.

--azure-ip-count <value>

Configures the number of IP addresses to be provisioned for each Azure Virtual Machine.

Default: 128.

--binpack

Sets the Docker Swarm scheduler to binpack mode, for backward compatibility.

--cloud-provider <value>

Sets the cluster cloud provider.

Valid values: azure, gce.

--cni-installer-url <value>

Sets a URL that points to a Kubernetes YAML file that is used as an installer for the cluster CNI plugin. If specified, the default CNI plugin is not installed. If the URL uses the HTTPS scheme, no certificate verification is performed.

--controller-port <value>

Sets the port for the web UI and the API

Default: 443.

--data-path-addr <value>

Sets the address or interface to use for data path traffic, $UCP_DATA_PATH_ADDR.

Format: IP address or network interface name

--disable-tracking

Disables anonymous tracking and analytics.

--disable-usage

Disables anonymous usage reporting.

--dns-opt <value>

Sets the DNS options for the MKE containers, $DNS_OPT.

--dns-search <value>

Sets custom DNS search domains for the MKE containers, $DNS_SEARCH.

--dns <value>

Sets custom DNS servers for the MKE containers, $DNS.

--enable-profiling

Enables performance profiling.

--existing-config

Sets to use the latest existing MKE configuration during the installation. The installation will fail if a configuration is not found.

--external-server-cert

Customizes the certificates used by the MKE web server.

--external-service-lb <value>

Sets the IP address of the load balancer where you can expect to reach published services.

--force-insecure-tcp

Forces the installation to continue despite unauthenticated Mirantis Container Runtime ports.

--force-minimums

Forces the installation to occur even if the system does not meet the minimum requirements.

--host-address <value>

Sets the network address that advertises to other nodes, $UCP_HOST_ADDRESS.

Format: IP address or network interface name

--iscsiadm-path <value>

Sets the path to the host iscsiadm binary. This option is applicable only when --storage-iscsi is specified.

--kube-apiserver-port <value>

Sets the port for the Kubernetes API server.

Default: 6443.

--kv-snapshot-count <value>

Sets the number of changes between key-value store snapshots, $KV_SNAPSHOT_COUNT.

Default: 20000.

--kv-timeout <value>

Sets the timeout in milliseconds for the key-value store, $KV_TIMEOUT.

Default: 5000.

--license <value>

Adds a license, $UCP_LICENSE.

Format: “$(cat license.lic)”

--nodeport-range <value>

Sets the allowed port range for Kubernetes services of NodePort type.

Default: 32768-35535.

--pod-cidr <values>

Sets Kubernetes cluster IP pool for the Pods to be allocated from.

Default: 192.168.0.0/16.

--preserve-certs

Sets so that certificates are not generated if they already exist.

--pull <value>

Pulls MKE images.

Valid values: always, missing, and never

Default: missing.

--random

Sets the Docker Swarm scheduler to random mode, for backward compatibility.

--registry-password <value>

Sets the password to use when pulling images, $REGISTRY_PASSWORD.

--registry-username <value>

Sets the user name to use when pulling images, $REGISTRY_USERNAME.

--san <value>

Adds subject alternative names to certificates, $UCP_HOSTNAMES.

For example: --san www2.acme.com

--service-cluster-ip-range <value>

Sets the Kubernetes cluster IP Range for services.

Default: 10.96.0.0/16.

--skip-cloud-provider-check

Disables checks which rely on detecting which cloud provider, if any, the cluster is currently running on.

--storage-expt-enabled

Enables experimental features in Kubernetes storage.

--storage-iscsi

Enables ISCSI-based PersistentVolumes in Kubernetes.

--swarm-experimental

Enables Docker Swarm experimental features, for backward compatibility.

--swarm-grpc-port <value>

Sets the port for communication between nodes.

Default: 2377.

--swarm-port <value>

Sets the port for the Docker Swarm manager, for backward compatibility.

Default: 2376.

--unlock-key <value>

Sets the unlock key for this swarm-mode cluster, if one exists, $UNLOCK_KEY.

--unmanaged-cni

Indicates that the CNI provider is not managed by MKE, allowing you to supply your own CNI plugin. Note that Calico is the default CNI provider, which MKE installs and manages.

--kubelet-data-root

Configures the kubelet data root directory on Linux when performing new MKE installations.

--containerd-root

Configures the containerd root directory on Linux when performing new MKE installations. Any non-root directory containerd customizations must be made along with the root directory customizations prior to installation and with the --containerd-root flag omitted.

--ingress-controller

Configures the HTTP ingress controller for the management of traffic that originates outside the cluster.

--calico-ebpf-enabled

Sets whether Calico eBPF mode is enabled.

When specifying --calico-ebpf-enabled, do not use --kube-default-drop-masq-bits or --kube-proxy-mode.

--kube-default-drop-masq-bits

Sets whether MKE uses Kubernetes default values for iptables drop and masquerade bits.

--kube-proxy-mode

Sets the operational mode for kube-proxy.

Valid values: iptables, ipvs, disabled

Default: iptables.

--kube-protect-kernel-defaults

Protects kernel parameters from being overridden by kubelet.

Default: false.

Important

When enabled, kubelet can fail to start if the following kernel parameters are not properly set on the nodes before you install MKE or before adding a new node to an existing cluster:

vm.panic_on_oom=0
vm.overcommit_memory=1
kernel.panic=10
kernel.panic_on_oops=1
kernel.keys.root_maxkeys=1000000
kernel.keys.root_maxbytes=25000000

For more information, refer to Configure kernel parameters.

--swarm-only

Configures MKE in Swarm-only mode, which supports only Docker Swarm orchestration.

--windows-containerd-root <value>

Sets the root directory for containerd on Windows.

--secure-overlay

Enables IPSec network encryption using SecureOverlay in Kubernetes.

--calico-ip-auto-method <value>

Allows the user to set the method for autodetecting the IPv4 address for the host. When specified, IP autodetection method is set for calico-node.

--calico-vxlan

Sets the calico CNI dataplane to VXLAN.

Default: VXLAN.

--vxlan-vni <value>

Sets the VXLAN VNI ID. Note that the dataplane must be set to VXLAN.

Valid values: 10000 - 20000.

Default: 10000.

--cni-mtu <value>

Sets the MTU for CNI interfaces. Calculate MTU size based on which overlay is in use. For user-specific configuration, subtract 20 bytes for IPIP or 50 bytes for VXLAN.

Default: 1480 for IPIP, 1450 for VXLAN.

--windows-kubelet-data-root <value>

Sets the data root directory for kubelet on Windows.

--default-node-orchestrator <value>

Sets the default node orchestrator for the cluster.

Valid values: swarm, kubernetes.

Default: swarm.

--iscsidb-path <value>

Sets the absolute path to the host iSCSI database. Requires that --storage-iscsi is specified. Note that symlinks are not allowed.

--kube-proxy-disabled

Disables kube-proxy. This option is activated by --calico-ebpf-enabled, and it cannot be used in combination with --kube-proxy-mode.

--cluster-label <value>

Sets the cluster label that is employed for usage reporting.

--multus-cni

Enables Multus CNI plugin in the MKE cluster. This meta plugin provides the ability to attach multiple network interfaces to Pods using other CNI plugins.

Note

Calico remains the primary CNI plugin.

SELinux

Installing MKE on a manager node with SELinux enabled at the daemon and the operating system levels requires that you include --security-opt label=disable with your install command. This flag disables SELinux policies on the installation container. The MKE installation container mounts and configures the Docker socket as part of the MKE installation container. Therefore, omitting this flag will result in the failure of your MKE installation with the following error:

FATA[0000] unable to get valid Docker client: unable to ping Docker
daemon: Got permission denied while trying to connect to the Docker
daemon socket at unix:///var/run/docker.sock:
Get http://%2Fvar%2Frun%2Fdocker.sock/_ping:
dial unix /var/run/docker.sock: connect: permission denied -
If SELinux is enabled on the Docker daemon, make sure you run
MKE with "docker run --security-opt label=disable -v /var/run/docker.sock:/var/run/docker.sock ..."

To install MKE with SELinux enabled at the daemon level:

docker container run --rm -it \
  --name ucp \
  --security-opt label=disable \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  install <command-options>

port-check-server

The port-check-server command verifies whether the specified port is available for use.


To use the port-check-server command:

docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  port-check-server <command-options>

Options

Option

Description

--listen-address, -l <value>

Sets the port on which to verify connectivity.

Default: :2376.

restore

The restore command restores an MKE cluster from a backup. Specifically, the command installs a new MKE cluster that is populated with the state of a previous MKE manager node using a TAR file originally generated using the backup command. All of the MKE settings, users, teams, and permissions are restored from the backup file.

The restore operation does not alter or recover the following cluster resources:

  • Containers

  • Networks

  • Volumes

  • Services

You can use the restore command on any manager node in an existing cluster. If the current node does not belong to a cluster, one is initialized using the value of the --host-address flag. When restoring on an existing Swarm-mode cluster, there must be no previous MKE components running on any node of the cluster. This cleanup operation is performed using the uninstall-ucp command.

If the restoration is performed on a different cluster than the one from which the backup file was created, the cluster root CA of the old MKE installation is not restored. This restoration invalidates any previously issued admin client bundles and, thus, all administrators are required to download new client bundles after the operation is complete. Any existing non-admin user client bundles remain fully operational.

By default, the backup TAR file is read from stdin. You can also bind-mount the backup file under /config/backup.tar and run the restore command with the --interactive flag.

Note

  • You must run uninstall-ucp before attempting the restore operation on an existing MKE cluster.

  • If your Swarm-mode cluster has lost quorum and the original set of managers are not recoverable, you can attempt to recover a single-manager cluster using the docker swarm init --force-new-cluster command.

  • You can restore MKE from a backup that was taken on a different manager node or a different cluster altogether.

To use the restore command:

docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --name ucp \
  mirantis/ucp:3.x.y \
  restore <command-options>
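
For example, because the backup TAR file is read from stdin by default, a sketch that streams an encrypted backup into the restore command (the passphrase and file path are placeholders):

docker container run --rm -i \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  restore --passphrase "<passphrase>" < /path/to/backup.tar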

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--interactive, -i

Runs in interactive mode and prompts for configuration values.

--data-path-addr <value>

Sets the address or interface to use for data path traffic.

--force-minimums

Forces the install or upgrade, which will go through even if the system does not meet the minimum requirements.

--host-address <value>

Sets the network address to advertise to other nodes.

Format: IP address or network interface name

--passphrase <value>

Decrypts the backup TAR file with the provided passphrase.

--san <value>

Adds subject alternative names to certificates, for example, --san www1.acme.com

--swarm-grpc-port <value>

Sets the port for communication between nodes.

Default: 2377.

--unlock-key <value>

Sets the unlock key for a Swarm-mode cluster.

--swarm-only

Indicates that the backup cluster is configured in Swarm-only mode.

--timeout value

Sets the timeout duration.

Valid time units: ns, us, ms, s, m, and h.

Default: "30m".

support

Use the support command to create a support bundle for the specified MKE nodes. This command creates a support bundle file for the specified nodes, including the MKE cluster ID, and prints it to stdout.


To use the support command:

docker container run --rm \
  --name mke \
  --log-driver none \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  support <command-options> > docker-support.tgz

Options

Option

Description

--debug, -D

Enable debug mode.

--jsonlog

Produce JSON-formatted output for easier parsing.

--submit

Submit the support bundle to Mirantis Customer Support along with the following information:

  • Cluster ID

  • MKE version

  • MCR version

  • OS/architecture

  • Cluster size

--loglines

Limit the size of log files to the specified amount.

Default: 10000

--until

Retrieve logs until the specified date and time.

Format: YYYY-MM-DD HH:MM:SS

--since

Retrieve logs since the specified date and time.

Format: YYYY-MM-DD HH:MM:SS

--goroutine

Retrieve goroutine stack traces.

uninstall-ucp

The uninstall-ucp command uninstalls MKE from the specified swarm, preserving the swarm so that your applications can continue running.

After MKE is uninstalled, you can use the docker swarm leave and docker node rm commands to remove nodes from the swarm. You cannot join nodes to the swarm until MKE is installed again.


To use the uninstall-ucp command:

docker container run --rm -it \
       --name ucp \
       -v /var/run/docker.sock:/var/run/docker.sock \
       -v /var/log:/var/log \
       mirantis/ucp:3.x.y \
       uninstall-ucp <command-options>

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--interactive, -i

Runs in interactive mode and prompts for configuration values.

--id <value>

Sets the ID of the MKE instance to uninstall.

--no-purge-secret

Configures the command to leave the MKE-related Swarm secrets in place.

--pull <value>

Pulls MKE images.

Valid values: always, missing, and never.

--purge-config

Removes the MKE configuration file when uninstalling MKE.

--registry-password <value>

Sets the password to use when pulling images.

--registry-username <value>

Sets the user name to use when pulling images.

--unmanaged-cni

Specifies that MKE was installed in unmanaged CNI mode. When this parameter is supplied to the uninstaller, no attempt is made to clean up /etc/cni, thus causing any user-supplied CNI configuration files to persist in their original state.

upgrade

The upgrade command upgrades an MKE cluster.

Prior to performing an upgrade, Mirantis recommends that you perform a backup of your MKE cluster using the backup command.

After upgrading MKE, log in to the MKE web UI and confirm that each node is healthy and that all nodes have been upgraded successfully.


To use the upgrade command:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  upgrade <command-options>

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--interactive, -i

Runs in interactive mode and prompts for configuration values.

--admin-password <value>

Sets the MKE administrator password.

--admin-username <value>

Sets the MKE administrator user name.

--force-minimums

Forces the install or upgrade to occur even if the system does not meet the minimum requirements.

--host-address <value>

Overrides the previously configured host address with the specified IP address or network interface.

--id <value>

Sets the ID of the MKE instance to upgrade.

--manual-worker-upgrade

Sets whether to manually upgrade worker nodes.

Default: false.

--pull <value>

Pulls MKE images.

Valid values: always, missing, and never.

--registry-password <value>

Sets the password to use when pulling images.

--registry-username <value>

Sets the user name to use when pulling images.

--force-port-check

Forces the upgrade to continue even in the event of a port check failure.

Default: false.

--force-recent-backup

Forces the upgrade to occur even if the system does not have a recent backup.

Default: false.

--disable-rollback

Disables the automated rollback to the previous running version of MKE that occurs by default in the event of an upgrade failure.

Default: false.
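
For example, the following invocation upgrades the managers and leaves the worker nodes for a subsequent manual upgrade. The option combination is illustrative only; adjust the image tag to match the target version:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.x.y \
  upgrade --interactive --manual-worker-upgrade --pull missing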

checks (subcommand)

The checks subcommand runs the pre-upgrade review on your cluster.


To use the checks subcommand:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp \
  upgrade checks <command-options>

Options

Option

Description

--debug, -D

Enables debug mode.

--jsonlog

Produces JSON-formatted output for easier parsing.

--interactive, -i

Runs in interactive mode and prompts for configuration values.

--admin-password <value>

Sets the MKE administrator password.

--admin-username <value>

Sets the MKE administrator user name.

--id <value>

Sets the ID of the MKE instance to upgrade.

--pull <value>

Pulls MKE images.

Valid values: always, missing, and never.

--registry-password <value>

Sets the password to use when pulling images.

--registry-username <value>

Sets the user name to use when pulling images.
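
For example, the following invocation runs the pre-upgrade review in interactive mode so that MKE prompts for the administrator credentials. This is a sketch only; adjust the options to your environment:

docker container run --rm -it \
  --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp \
  upgrade checks --interactive --pull missing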

CIS Benchmarks

The Center for Internet Security (CIS) provides the CIS Kubernetes Benchmarks for each Kubernetes release. These benchmarks comprise a comprehensive set of recommendations for hardening Kubernetes security configuration. Designed to align with industry regulations, the CIS Benchmarks set standards that meet diverse compliance requirements, and their universal applicability across Kubernetes distributions helps to fortify such environments while fostering a robust security posture.

Note

  • The CIS Benchmark results detailed herein are verified against MKE 3.7.2.

  • Mirantis has based its handling of Kubernetes benchmarks on CIS Kubernetes Benchmark v1.7.0.

1 Control Plane Components

Section 1 comprises security recommendations for the direct configuration of Kubernetes control plane processes. It is broken out into four subsections:

2 etcd

Section 2 details recommendations for etcd configuration, under the assumption that you are running etcd in a Kubernetes Pod.

3 Control Plane Configuration

Section 3 details recommendations for cluster-wide areas, such as authentication and logging. It is broken out into two subsections:

4 Worker Nodes

Section 4 details security recommendations for the components that run on Kubernetes worker nodes.

Note

The components for Kubernetes worker nodes may also run on Kubernetes master nodes. Thus, the recommendations in Section 4 should be applied to master nodes as well as worker nodes wherever the master nodes make use of these components.

Section 4 is broken out into two subsections:

5 Policies

Section 5 details recommendations for various Kubernetes policies that are important to the security of the environment. Section 5 is broken out into six subsections, with 5.6 not in use:

Release Notes

Considerations

  • Upgrading from one MKE minor version to another minor version can result in the downgrading of MKE middleware components. For more information, refer to the middleware versioning tables in the release notes of both the source and target MKE versions.

  • In MKE 3.7.0 - 3.7.1, performance issues may occur with both cri-dockerd and dockerd due to the manner in which cri-dockerd handles container and ImageFSInfo statistics.

MKE 3.7.16 current

The MKE 3.7.16 patch release focuses exclusively on CVE mitigation.

MKE 3.7.15

Patch release for MKE 3.7 introducing the following key features:

  • Ability to enable cAdvisor through API call

  • New flag for collecting metrics during support bundle generation

  • Hypervisor Looker dashboard information added to telemetry

MKE 3.7.14

The MKE 3.7.14 patch release focuses exclusively on CVE mitigation.

MKE 3.7.13

The MKE 3.7.13 patch release focuses exclusively on CVE mitigation.

MKE 3.7.12

Patch release for MKE 3.7 introducing the following key features:

  • Addition of external cloud provider support for AWS

  • GracefulNodeShutdown settings now configurable

MKE 3.7.11

The MKE 3.7.11 patch release focuses exclusively on CVE mitigation.

MKE 3.7.10

Patch release for MKE 3.7 introducing the following key features:

  • Support for NodeLocalDNS 1.23.1

  • Support for Kubelet node configurations

  • node-exporter port now configurable

MKE 3.7.9

The MKE 3.7.9 patch release focuses exclusively on CVE mitigation.

MKE 3.7.8

Patch release for MKE 3.7 introducing the following key features:

  • Addition of Kubernetes log retention configuration parameters

  • Customizability of audit log policies

  • Support for scheduling of etcd cluster cleanup and defragmentation

  • Inclusion of Docker events in MKE support bundle

MKE 3.7.7

The MKE 3.7.7 patch release focuses exclusively on CVE mitigation.

MKE 3.7.6

Patch release for MKE 3.7 introducing the following key features:

  • Kubernetes for GMSA now supported

  • Addition of ucp-cadvisor container level metrics component

MKE 3.7.5

Patch release for MKE 3.7 introducing the following key features:

  • etcd alarms are exposed through Prometheus metrics

  • Augmented validation for etcd storage quota

  • Improved handling of larger sized etcd instances

  • All errors now returned from pre upgrade checks

  • Minimum Docker storage requirement now part of pre upgrade checks

MKE 3.7.4 (discontinued)

MKE 3.7.4 was discontinued shortly after release due to issues encountered when upgrading to it from previous versions of the product.

MKE 3.7.3

The MKE 3.7.3 patch release focuses exclusively on CVE resolution.

MKE 3.7.2

Patch release for MKE 3.7 introducing the following key features:

  • Prometheus metrics scraped from Linux workers

  • Performance improvement to MKE image tagging API

MKE 3.7.1

Initial MKE 3.7.1 release introducing the following key features:

  • Support bundle metrics additions for new MKE 3.7 features

  • Added ability to filter organizations by name in MKE web UI

  • Increased Docker and Kubernetes CIS benchmark compliance

  • MetalLB supports MKE-specific loglevel

  • Improved Kubernetes role creation error handling in MKE web UI

  • Increased SAML proxy feedback detail

  • Upgrade verifies that cluster nodes have minimum required MCR

  • kube-proxy now binds only to localhost

  • Enablement of read-only rootfs for specific containers

  • Support for cgroup v2

  • Added MKE web UI capability to add OS constraints to swarm services

  • Added ability to set support bundle collection windows

  • Added ability to set line limit of log files in support bundles

  • Addition of search function to Grants > Swarm in MKE web UI

MKE 3.7.0

Initial MKE 3.7.0 release introducing the following key features:

  • ZeroOps: certificate management

  • ZeroOps: upgrade rollback

  • ZeroOps: metrics

  • Prometheus memory resources

  • etcd event cleanup

  • Ingress startup options: TLS, TCP/UDP, HTTP/HTTPS

  • Additional NGINX Ingress Controller options

  • Setting for NGINX Ingress Controller default ports

  • MetalLB

  • Lameduck configuration options

  • Multus CNI

  • SAML proxy

  • Addition of referral chasing LDAP parameter

  • Kubernetes update to version 1.27.4

  • Go update to version 1.20.5

  • RethinkDB update to version 2.4.3

3.7.16

Release date

Name

Highlights

2024-OCT-21

MKE 3.7.16

Patch release for MKE 3.7 that focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.16 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.16 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.16 known issues with available workaround solutions include:

[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible

In air-gapped swarm-only environments, upgrades fail to start if all of the MKE images are not preloaded on the selected manager node or if the node cannot automatically pull the required MKE images.

Workaround:

Ensure either that the manager nodes have the complete set of MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.
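
For example, assuming an offline image bundle named mke_images.tar.gz (a hypothetical file name) has been copied to each manager node, a minimal preload check might look as follows:

# Load the offline image bundle on the manager node (hypothetical file name)
docker load -i mke_images.tar.gz

# Confirm that the MKE images are now present locally
docker image ls | grep mirantis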

[MKE-11535] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state

Due to the upstream dependency issue in gpu-feature-discovery software, customers may encounter nvidia-gpu-feature-discovery in CrashLoopBackOff state with the following errors:

I0726 08:42:14.338857       1 main.go:144] Start running
SIGSEGV: segmentation violation
PC=0x0 m=4 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x12f4f20, 0xc00025b720)
 /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00025b6f8 sp=0xc00025b6c0 pc=0x409b2b

Workaround:

  1. Disable nvidia_device_plugin. Refer to Use an MKE configuration file.

  2. Install NVIDIA GPU components with the NVIDIA GPU Operator.

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define any global flags in the global custom_kubelet_flags setting that will be used in kubelet node profiles.

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround:

Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.
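
For example, with the gcloud CLI, either of the following sketches aligns the VPC MTU with the Docker network default (the network name my-vpc is hypothetical):

gcloud compute networks create my-vpc --subnet-mode=auto --mtu=1500

gcloud compute networks update my-vpc --mtu=1500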

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.16 release.

Security information

The MKE 3.7.16 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11881] [MKE-11851] [MKE-11979] [MKE-11978] [MKE-11972] [MKE-11973] Calico 3.28.2

  • [MKE-11431] NGINX Ingress Controller 1.11.3

  • [MKE-11362] cAdvisor 0.50.0

  • [FIELD-7282] Interlock 3.3.14

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2024-24790

Resolved

  • ucp-calico-*

  • ucp-cadvisor

The various Is methods (IsPrivate, IsLoopback, etc) did not work as expected for IPv4-mapped IPv6 addresses, returning false for addresses which would return true in their traditional IPv4 forms.

CVE-2023-45288

Resolved

  • ucp-kube-ingress-controller

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

CVE-2024-3154

Resolved

  • ucp-kube-ingress-controller

A flaw was found in cri-o, where an arbitrary systemd property can be injected via a Pod annotation. Any user who can create a pod with an arbitrary annotation may perform an arbitrary action on the host system.

CVE-2024-5535

Resolved

  • ucp-kube-ingress-controller

Issue summary: Calling the OpenSSL API function SSL_select_next_proto with an empty supported client protocols buffer may cause a crash or memory contents to be sent to the peer.

CVE-2024-4741

Resolved

  • ucp-kube-ingress-controller

CVE has been reserved by an organization or individual and is not currently available in the NVD.

CVE-2024-6197

Resolved

  • ucp-kube-ingress-controller

libcurl’s ASN1 parser has this utf8asn1str() function used for parsing an ASN.1 UTF-8 string. It can detect an invalid field and return an error. Unfortunately, when doing so it also invokes free() on a 4 byte local stack buffer. Most modern malloc implementations detect this error and immediately abort. Some however accept the input pointer and add that memory to its list of available chunks. This leads to the overwriting of nearby stack memory. The content of the overwrite is decided by the free() implementation; likely to be memory pointers and a set of flags. The most likely outcome of exploiting this flaw is a crash, although it cannot be ruled out that more serious results can be had in special circumstances.

CVE-2024-6874

Resolved

  • ucp-kube-ingress-controller

libcurl’s URL API function [curl_url_get()](https://curl.se/libcurl/c/curl_url_get.html) offers punycode conversions, to and from IDN. Asking to convert a name that is exactly 256 bytes, libcurl ends up reading outside of a stack-based buffer when built to use the macidn IDN backend. The conversion function then fills up the provided buffer exactly - but does not null terminate the string. This flaw can lead to stack contents accidentally getting returned as part of the converted string.

CVE-2024-2398

Resolved

  • ucp-kube-ingress-controller

When an application tells libcurl it wants to allow HTTP/2 server push, and the amount of received headers for the push surpasses the maximum allowed limit (1000), libcurl aborts the server push. When aborting, libcurl inadvertently does not free all the previously allocated headers and instead leaks the memory. Further, this error condition fails silently and is therefore not easily detected by an application.

CVE-2024-2466

Resolved

  • ucp-kube-ingress-controller

libcurl did not check the server certificate of TLS connections done to a host specified as an IP address, when built to use mbedTLS. libcurl would wrongly avoid using the set hostname function when the specified hostname was given as an IP address, therefore completely skipping the certificate check. This affects all uses of TLS protocols (HTTPS, FTPS, IMAPS, POP3S, SMTPS, etc).

CVE-2024-24788

Resolved

  • ucp-kube-ingress-controller

A malformed DNS message in response to a query can cause the Lookup functions to get stuck in an infinite loop.

CVE-2024-6119

Resolved

  • ucp-kube-ingress-controller

Applications performing certificate name checks (e.g., TLS clients checking server certificates) may attempt to read an invalid memory address resulting in abnormal termination of the application process. Impact summary: Abnormal termination of an application can cause a denial of service. Applications performing certificate name checks (e.g., TLS clients checking server certificates) may attempt to read an invalid memory address when comparing the expected name with an otherName subject alternative name of an X.509 certificate. This may result in an exception that terminates the application program. Note that basic certificate chain validation (signatures, dates, …) is not affected, the denial of service can occur only when the application also specifies an expected DNS name, Email address or IP address. TLS servers rarely solicit client certificates, and even when they do, they generally don’t perform a name check against a reference identifier (expected identity), but rather extract the presented identity after checking the certificate chain. So TLS servers are generally not affected and the severity of the issue is Moderate. The FIPS modules in 3.3, 3.2, 3.1 and 3.0 are not affected by this issue.

CVE-2024-34155

Resolved

  • ucp-kube-ingress-controller

Calling any of the Parse functions on Go source code which contains deeply nested literals can cause a panic due to stack exhaustion.

CVE-2024-34156

Resolved

  • ucp-kube-ingress-controller

Calling Decoder.Decode on a message which contains deeply nested structures can cause a panic due to stack exhaustion. This is a follow-up to CVE-2022-30635.

CVE-2024-34158

Resolved

  • ucp-kube-ingress-controller

Calling Parse on a “// +build” build tag line with deeply nested expressions can cause a panic due to stack exhaustion.

CVE-2024-7347

Resolved

  • ucp-interlock-*

NGINX Open Source and NGINX Plus have a vulnerability in the ngx_http_mp4_module, which might allow an attacker to over-read NGINX worker memory resulting in its termination, using a specially crafted mp4 file. The issue only affects NGINX if it is built with the ngx_http_mp4_module and the mp4 directive is used in the configuration file. Additionally, the attack is possible only if an attacker can trigger the processing of a specially crafted mp4 file with the ngx_http_mp4_module. Note: Software versions which have reached End of Technical Support (EoTS) are not evaluated.

CVE-2024-35200

Resolved

  • ucp-interlock-*

When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC module, undisclosed HTTP/3 requests can cause NGINX worker processes to terminate.

CVE-2024-34161

Resolved

  • ucp-interlock-*

When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC module and the network infrastructure supports a Maximum Transmission Unit (MTU) of 4096 or greater without fragmentation, undisclosed QUIC packets can cause NGINX worker processes to leak previously freed memory.

CVE-2024-32760

Resolved

  • ucp-interlock-*

When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC module, undisclosed HTTP/3 encoder instructions can cause NGINX worker processes to terminate or cause other potential impact.

CVE-2024-31079

Resolved

  • ucp-interlock-*

When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC module, undisclosed HTTP/3 requests can cause NGINX worker processes to terminate or cause other potential impact. This attack requires that a request be specifically timed during the connection draining process, which the attacker has no visibility into and limited influence over.

CVE-2024-24990

Resolved

  • ucp-interlock-*

When NGINX Plus or NGINX OSS are configured to use the HTTP/3 QUIC module, undisclosed requests can cause NGINX worker processes to terminate. Note: The HTTP/3 QUIC module is not enabled by default and is considered experimental. For more information, refer to Support for QUIC and HTTP/3 https://nginx.org/en/docs/quic.html. Note: Software versions which have reached End of Technical Support (EoTS) are not evaluated.

3.7.15

Release date

Name

Highlights

2024-SEPT-30

MKE 3.7.15

Patch release for MKE 3.7 introducing the following enhancements:

  • Ability to enable cAdvisor through API call

  • New flag for collecting metrics during support bundle generation

  • Hypervisor Looker dashboard information added to telemetry

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.15 includes:

[MKE-11928] Ability to enable cAdvisor through API call

Added API endpoints through which admins can now enable and disable cAdvisor by sending a POST request to api/ucp/config/c-advisor/enable and api/ucp/config/c-advisor/disable, respectively.
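
For example, a minimal sketch of the enable call, assuming the MKE host name is stored in MKE_HOST and a valid admin session token in AUTHTOKEN:

curl -sk -X POST \
  -H "Authorization: Bearer $AUTHTOKEN" \
  https://$MKE_HOST/api/ucp/config/c-advisor/enable

The corresponding disable call uses the api/ucp/config/c-advisor/disable endpoint.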

[FIELD-7167] New flag for collecting metrics during support bundle generation

Added the metrics flag, which, when used with the support CLI command, triggers the collection of metrics during the generation of a support bundle. The default value is false.
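
For example, a sketch of a support bundle run with metrics collection enabled (the flag spelling --metrics is assumed from the description above; adjust the image tag to your environment):

docker container run --rm \
  --name mke \
  --log-driver none \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.15 \
  support --metrics > docker-support.tgz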

[FIELD-5197] Hypervisor Looker dashboard information added to telemetry

Added Hypervisor Looker dashboard information to the MKE segment telemetry.

Addressed issues

Issues addressed in the MKE 3.7.15 release include:

  • [FIELD-7209] Fixed an issue wherein empty kubelet flags were added to the ucp-kubelet container following the restart to apply a kubelet profile.

  • [FIELD-7153] Fixed an issue wherein upgrades to MKE 3.7.12 failed on Swarm-only mode at the pre-upgrade check.

  • [FIELD-7044] Fixed an issue wherein, in rare cases, the Kubernetes service failed to work on Windows nodes.

Known issues

MKE 3.7.15 known issues with available workaround solutions include:

[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible

In air-gapped swarm-only environments, upgrades fail to start if all of the MKE images are not preloaded on the selected manager node or if the node cannot automatically pull the required MKE images.

Workaround:

Ensure either that the manager nodes have the complete set of MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.

[MKE-11535] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state

Due to the upstream dependency issue in gpu-feature-discovery software, customers may encounter nvidia-gpu-feature-discovery in CrashLoopBackOff state with the following errors:

I0726 08:42:14.338857       1 main.go:144] Start running
SIGSEGV: segmentation violation
PC=0x0 m=4 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x12f4f20, 0xc00025b720)
 /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00025b6f8 sp=0xc00025b6c0 pc=0x409b2b

Workaround:

  1. Disable nvidia_device_plugin. Refer to Use an MKE configuration file.

  2. Install NVIDIA GPU components with the NVIDIA GPU Operator.

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define any global flags in the global custom_kubelet_flags setting that will be used in kubelet node profiles.

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround:

Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.15 release.

Security information

Upgraded the following middleware component versions to resolve vulnerabilities in MKE:

  • Golang 1.22.7

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2024-34158

Resolved

  • ucp

  • ucp-agent

  • ucp-controller

  • ucp-auth-store

  • ucp-auth

  • ucp-cfssl

Calling Parse on a “// +build” build tag line with deeply nested expressions can cause a panic due to stack exhaustion.

CVE-2024-34156

Resolved

  • ucp

  • ucp-agent

  • ucp-controller

  • ucp-auth-store

  • ucp-auth

  • ucp-cfssl

Calling Decoder.Decode on a message which contains deeply nested structures can cause a panic due to stack exhaustion. This is a follow-up to CVE-2022-30635.

CVE-2024-34155

Resolved

  • ucp

  • ucp-agent

  • ucp-controller

  • ucp-auth-store

  • ucp-auth

  • ucp-cfssl

Calling any of the Parse functions on Go source code which contains deeply nested literals can cause a panic due to stack exhaustion.

3.7.14

Release date

Name

Highlights

2024-SEPT-11

MKE 3.7.14

Patch release for MKE 3.7 that focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.14 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.14 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.14 known issues with available workaround solutions include:

[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible

In air-gapped swarm-only environments, upgrades fail to start if all of the MKE images are not preloaded on the selected manager node or if the node cannot automatically pull the required MKE images.

Workaround:

Ensure either that the manager nodes have the complete set of MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.

[MKE-11535] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state

Due to the upstream dependency issue in gpu-feature-discovery software, customers may encounter nvidia-gpu-feature-discovery in CrashLoopBackOff state with the following errors:

I0726 08:42:14.338857       1 main.go:144] Start running
SIGSEGV: segmentation violation
PC=0x0 m=4 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x12f4f20, 0xc00025b720)
 /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00025b6f8 sp=0xc00025b6c0 pc=0x409b2b

Workaround:

  1. Disable nvidia_device_plugin. Refer to Use an MKE configuration file.

  2. Install NVIDIA GPU components with the NVIDIA GPU Operator.

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define any global flags in the global custom_kubelet_flags setting that will be used in kubelet node profiles.

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround:

Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.14 release.

Security information

The MKE 3.7.14 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11916] Kubernetes 1.27.16

  • [MKE-11833] etcd 3.5.15

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2024-2466

Resolved

  • ucp-etcd

libcurl did not check the server certificate of TLS connections done to a host specified as an IP address, when built to use mbedTLS. libcurl would wrongly avoid using the set hostname function when the specified hostname was given as an IP address, therefore completely skipping the certificate check. This affects all uses of TLS protocols (HTTPS, FTPS, IMAPS, POP3S, SMTPS, etc).

CVE-2023-45288

Resolved

  • ucp-etcd

  • ucp-hypercube

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

3.7.13

Release date

Name

Highlights

2024-AUG-19

MKE 3.7.13

Patch release for MKE 3.7 that focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.13 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.13 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.13 known issues with available workaround solutions include:

[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible

In air-gapped swarm-only environments, upgrades fail to start if all of the MKE images are not preloaded on the selected manager node or if the node cannot automatically pull the required MKE images.

Workaround:

Ensure either that the manager nodes have the complete set of MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.

[MKE-11535] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state

Due to the upstream dependency issue in gpu-feature-discovery software, customers may encounter nvidia-gpu-feature-discovery in CrashLoopBackOff state with the following errors:

I0726 08:42:14.338857       1 main.go:144] Start running
SIGSEGV: segmentation violation
PC=0x0 m=4 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x12f4f20, 0xc00025b720)
 /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00025b6f8 sp=0xc00025b6c0 pc=0x409b2b

Workaround:

  1. Disable nvidia_device_plugin. Refer to Use an MKE configuration file.

  2. Install NVIDIA GPU components with the NVIDIA GPU Operator.

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define any global flags in the global custom_kubelet_flags setting that will be used in kubelet node profiles.

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround:

Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.13 release.

Security information

The MKE 3.7.13 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11602] [MKE-11595] Debian stable-20240722-slim

  • Golang 1.22.5

  • Alpine Linux 3.19

  • Calico 3.28.1

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2024-28835

Resolved

  • ucp-multus-cni

A flaw has been discovered in GnuTLS where an application crash can be induced when attempting to verify a specially crafted .pem bundle using the “certtool --verify-chain” command.

CVE-2023-50387

Resolved

  • ucp-multus-cni

Certain DNSSEC aspects of the DNS protocol (in RFC 4033, 4034, 4035, 6840, and related RFCs) allow remote attackers to cause a denial of service (CPU consumption) via one or more DNSSEC responses, aka the “KeyTrap” issue. One of the concerns is that, when there is a zone with many DNSKEY and RRSIG records, the protocol specification implies that an algorithm must evaluate all combinations of DNSKEY and RRSIG records.

CVE-2023-50868

Resolved

  • ucp-multus-cni

The Closest Encloser Proof aspect of the DNS protocol (in RFC 5155 when RFC 9276 guidance is skipped) allows remote attackers to cause a denial of service (CPU consumption for SHA-1 computations) via DNSSEC responses in a random subdomain attack, aka the “NSEC3” issue. The RFC 5155 specification implies that an algorithm must perform thousands of iterations of a hash function in certain situations.

CVE-2024-38095

Resolved

  • ucp-agent-win

  • ucp-kube-binaries-win

  • ucp-containerd-shim-process-win

  • ucp-hardware-info-win

  • ucp-pause-win

.NET and Visual Studio Denial of Service Vulnerability.

CVE-2024-4741

Resolved

  • ucp-agent and all other Linux images

CVE has been reserved by an organization or individual and is not currently available in the NVD.

CVE-2024-5535

Resolved

  • ucp-agent and all other Linux images

Issue summary: Calling the OpenSSL API function SSL_select_next_proto with an empty supported client protocols buffer may cause a crash or memory contents to be sent to the peer.

CVE-2024-2961

Resolved

  • ucp-calico-cni

  • ucp-calico-kube-controllers

  • ucp-calico-node

The iconv() function in the GNU C Library versions 2.39 and older may overflow the output buffer passed to it by up to 4 bytes when converting strings to the ISO-2022-CN-EXT character set, which may be used to crash an application or overwrite a neighbouring variable.

CVE-2024-33599

Resolved

  • ucp-calico-cni

  • ucp-calico-kube-controllers

  • ucp-calico-node

nscd: Stack-based buffer overflow in netgroup cache If the Name Service Cache Daemon’s (nscd) fixed size cache is exhausted by client requests then a subsequent client request for netgroup data may result in a stack-based buffer overflow. This flaw was introduced in glibc 2.15 when the cache was added to nscd. This vulnerability is only present in the nscd binary.

3.7.12

Release date

Name

Highlights

2024-JULY-29

MKE 3.7.12

Patch release for MKE 3.7 introducing the following enhancements:

  • Addition of external cloud provider support for AWS

  • GracefulNodeShutdown settings now configurable

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.12 includes:

[MKE-11534] Addition of external cloud provider support for AWS

MKE 3.7 now supports the use of external cloud providers for AWS. As a result, MKE 3.6 users who are using --cloud-provider=aws can now migrate to MKE 3.7, first by upgrading to MKE 3.6.17 and then to 3.7.12.

[FIELD-6967] GracefulNodeShutdown settings now configurable

With custom kubelet node profiles, you can now configure the following kubelet GracefulNodeShutdown flags, which control the node shutdown grace periods:

  • --shutdown-grace-period

  • --shutdown-grace-period-critical-pods

The GracefulNodeShutdown feature gate is enabled by default, with the shutdown grace parameters set to 0s.

For more information, refer to configure-gracefulnodeshutdown-settings.
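
For illustration, a kubelet node profile that grants the node 30 seconds of shutdown grace time, with the final 10 seconds reserved for critical Pods, would carry flag values such as the following (the values are illustrative only; set them as described in configure-gracefulnodeshutdown-settings):

--shutdown-grace-period=30s
--shutdown-grace-period-critical-pods=10s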

Addressed issues

Issues addressed in the MKE 3.7.12 release include:

  • [FIELD-7110] Fixed an issue wherein nvidia-gpu-feature-discovery pods crashed with the "--mig-strategy=mixed": executable file not found in $PATH: unknown error whenever nvidia_device_plugin was enabled. If you observe related issues, apply the workaround described in [MKE-11535] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state.

  • [FIELD-7106] At startup, the ucp-kubelet container now obtains the profile label directly from etcd, thus ensuring that the kubelet profile settings are in place at the moment the container is created.

  • [FIELD-7059] Fixed an issue wherein the ucp-cluster-agent container leaks connections to Windows nodes.

  • [FIELD-7053] Fixed an issue wherein, in rare cases, cri-dockerd failed to pull large images.

  • [FIELD-7037] Fixed an issue exclusive to MKE 3.7.8 through 3.7.10, wherein in an air-gapped environment, the addition of a second manager node could cause cluster deployment to fail.

  • [FIELD-7032] Fixed an issue wherein SAML metadata was not deleted from RethinkDB upon disablement of the SAML configuration.

  • [FIELD-7012] Addition of TLS configuration to node-exporter.

Known issues

MKE 3.7.12 known issues with available workaround solutions include:

[MKE-11535] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state

Due to the upstream dependency issue in gpu-feature-discovery software, customers may encounter nvidia-gpu-feature-discovery in CrashLoopBackOff state with the following errors:

I0726 08:42:14.338857       1 main.go:144] Start running
SIGSEGV: segmentation violation
PC=0x0 m=4 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x12f4f20, 0xc00025b720)
 /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00025b6f8 sp=0xc00025b6c0 pc=0x409b2b

Workaround:

  1. Disable nvidia_device_plugin. Refer to Use an MKE configuration file.

  2. Install NVIDIA GPU components with the NVIDIA GPU Operator.

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define any global flags in the global custom_kubelet_flags setting that will be used in kubelet node profiles.

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround:

Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[FIELD-7023] Air-gapped swarm-only upgrades fail if images are inaccessible

In air-gapped swarm-only environments, upgrades fail to start if all of the MKE images are not preloaded on the selected manager node or if the node cannot automatically pull the required MKE images.

Workaround:

Ensure either that the manager nodes have the complete set of MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.12 release.

Security information

The following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11363][MKE-11313] Prometheus 2.53.1

  • [MKE-11556] cri-dockerd 0.3.15

The following middleware component has been reconfigured to resolve vulnerabilities in MKE:

  • [FIELD-7012] Addition of TLS configuration to node-exporter.

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2024-35195

Resolved

  • ucp-sf-notifier

Requests is a HTTP library. Prior to 2.32.0, when making requests through a Requests Session, if the first request is made with verify=False to disable cert verification, all subsequent requests to the same host will continue to ignore cert verification regardless of changes to the value of verify. This behavior will continue for the lifecycle of the connection in the connection pool. This vulnerability is fixed in 2.32.0.

CVE-2024-24557

Resolved

  • ucp-metrics

  • ucp-metrics-swarm-only

Moby is an open-source project created by Docker to enable software containerization. The classic builder cache system is prone to cache poisoning if the image is built FROM scratch. Also, changes to some instructions (most important being HEALTHCHECK and ONBUILD) would not cause a cache miss. An attacker with the knowledge of the Dockerfile someone is using could poison their cache by making them pull a specially crafted image that would be considered as a valid cache candidate for some build steps. 23.0+ users are only affected if they explicitly opted out of Buildkit (DOCKER_BUILDKIT=0 environment variable) or are using the /build API endpoint. All users on versions older than 23.0 could be impacted. Image build API endpoint (/build) and ImageBuild function from github.com/docker/docker/client is also affected as it the uses classic builder by default. Patches are included in 24.0.9 and 25.0.2 releases.

CVE-2023-45288

Resolved

  • ucp-metrics

  • ucp-metrics-swarm-only

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

3.7.11

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version that ships with MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.11, the upgrade will fail, and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2024-JULY-8

Name: MKE 3.7.11

Highlights: Patch release for MKE 3.7 that focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.11 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.11 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.11 known issues with available workaround solutions include:

[MKE-11535][FIELD-7110] ucp-nvidia-gpu-feature-discovery pods may enter CrashLoopBackOff state

Due to an upstream dependency issue in the gpu-feature-discovery software, customers may encounter nvidia-gpu-feature-discovery Pods in a CrashLoopBackOff state, accompanied by the following errors:

"--mig-strategy=mixed": executable file not found in $PATH: unknown

I0726 08:42:14.338857       1 main.go:144] Start running
SIGSEGV: segmentation violation
PC=0x0 m=4 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x12f4f20, 0xc00025b720)
 /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00025b6f8 sp=0xc00025b6c0 pc=0x409b2b

Workaround:

  1. Disable the nvidia_device_plugin setting, as shown in the sketch following this list. Refer to Use an MKE configuration file.

  2. Install NVIDIA GPU components with the NVIDIA GPU Operator.
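
A minimal configuration sketch for step 1, assuming the nvidia_device_plugin setting resides in the cluster_config section of the MKE configuration file (the section placement is an assumption; confirm it against Use an MKE configuration file):

[cluster_config]
  # Assumption: disable the built-in NVIDIA device plugin so that the
  # NVIDIA GPU Operator can manage the GPU components instead.
  nvidia_device_plugin = false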

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.
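
Alternatively, the same nodeSelector can be merged in with a single command; the following is a sketch that assumes no conflicting selector is already present on the daemonset:

kubectl patch daemonset node-local-dns -n kube-system --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/os":"linux"}}}}}'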

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define flags in the global custom_kubelet_flags setting that you also intend to apply through kubelet node profiles.

[FIELD-7023] Air-gapped upgrades fail if images are inaccessible

In air-gapped environments, upgrades fail if the MKE images are not preloaded on the selected manager node or the node cannot automatically pull the required MKE images. This results in a rollback to the previous MKE version, which in this particular scenario can inadvertently remove the etcd/RethinkDB cluster from the MKE cluster and thus require you to restore MKE from a backup.

Workaround:

Ensure either that the manager nodes have all necessary MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.
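
For example, if an offline image bundle has been copied to each manager node, it can be preloaded before starting the upgrade; the bundle file name and the repository prefix used for verification below are illustrative:

docker load -i ucp_images.tar.gz
docker image ls --filter reference='mirantis/ucp*'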

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.
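
A sketch of the upgrade invocation with the option applied, assuming the standard containerized upgrade command and an illustrative image tag:

docker container run --rm -it --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.11 \
  upgrade --interactive --manual-worker-upgrade

After the manager nodes are upgraded, upgrade the worker nodes manually.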

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in Swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.
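
For instance, the MTU of an existing VPC can be raised with the gcloud CLI; the network name below is a placeholder, and the linked Google documentation describes the full procedure and its caveats:

gcloud compute networks update example-vpc --mtu 1500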

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.11 release.

Security information

The MKE 3.7.11 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11542] Kubernetes 1.27.14

The following entries detail the specific CVEs addressed, including the resolution status, the images mitigated, and the problem details from upstream.

CVE-2023-45288 (Resolved)

Image mitigated: ucp-hyperkube

Problem details from upstream:

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

CVE-2024-24786 (Resolved)

Image mitigated: ucp-hyperkube

Problem details from upstream:

The protojson.Unmarshal function can enter an infinite loop when unmarshaling certain forms of invalid JSON. This condition can occur when unmarshaling into a message which contains a google.protobuf.Any value, or when the UnmarshalOptions.DiscardUnknown option is set.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must defer upgrading until a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider becomes available.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.10

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.10, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2024-JUNE-17

Name: MKE 3.7.10

Highlights: Patch release for MKE 3.7 introducing the following enhancements:

  • Support for NodeLocalDNS 1.23.1

  • Support for Kubelet node configurations

  • node-exporter port now configurable

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.10 includes:

[MKE-11484] Support for NodeLocalDNS 1.23.1

With NodeLocalDNS, you can run a local instance of the DNS caching agent on each node in the cluster. This results in significant performance improvement versus relying on a centralized CoreDNS instance to resolve external DNS records, as the local NodeLocalDNS instance is able to cache DNS results and thus mitigate network latency and conntrack issues. For more information, refer to Manage NodeLocalDNS.
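
Once NodeLocalDNS is enabled, its presence can be verified by listing the daemonset it deploys; the daemonset name and namespace below match those referenced in the MKE-11531 workaround later in these notes:

kubectl get daemonset node-local-dns -n kube-system -o wide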

[MKE-11480] Support for Kubelet node configurations

You can now set kubelet node profiles through Kubernetes node Labels. With these profiles, which are a set of kubelet flags, you can customize the settings of your kubelet agents on a node-by-node level, in addition to setting cluster-wide flags for use by every kubelet agent. For more information, refer to Custom kubelet profiles.
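
As an illustration only, assigning a profile to a node amounts to applying a Kubernetes label to it; the label key and profile name below are hypothetical, so substitute the values documented in Custom kubelet profiles:

kubectl label node worker-node-01 example.mirantis.com/kubelet-profile=high-memory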

[FIELD-6998] node-exporter port now configurable

You can now configure the node_exporter_port setting through the MKE configuration file; the default value is 9100.
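
A minimal sketch of the setting, assuming it is placed in the cluster_config section of the MKE configuration file:

[cluster_config]
  # Illustrative value: move node-exporter off the 9100 default, for example
  # to avoid a port conflict with another exporter.
  node_exporter_port = 9559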

Addressed issues

Issues addressed in the MKE 3.7.10 release include:

  • [FIELD-7002] Fixed an issue wherein the “SIGSEGV: segmentation violation” error caused CrashLoopBackOff in nvidia-device-plugin Pods.

  • [MKE-11282][MKE-11171] Fixed an issue wherein MKE upgrade in --swarm-only mode failed due to ‘unavailable’ ports on manager nodes.

Known issues

MKE 3.7.10 known issues with available workaround solutions include:

[MKE-11531] NodeLocal DNS Pods attempt to deploy to Windows nodes

The DNS caching service that NodeLocalDNS deploys to nodes as Pods is a Linux-only solution; however, it also attempts, without success, to deploy to Windows nodes.

Workaround:

  1. Edit the node-local-dns daemonset:

    kubectl edit daemonset node-local-dns -n kube-system
    
  2. Add the following under spec.template.spec:

    nodeSelector:
      kubernetes.io/os: linux
    
  3. Save the daemonset.

[MKE-11525] Kubelet node profiles fail to supersede global setting

Flags specified in the global custom_kubelet_flags setting and then applied through kubelet node profiles end up being applied twice.

Workaround:

Do not define flags in the global custom_kubelet_flags setting that you also intend to apply through kubelet node profiles.

[FIELD-7023] Air-gapped upgrades fail if images are inaccessible

In air-gapped environments, upgrades fail if the MKE images are not preloaded on the selected manager node or the node cannot automatically pull the required MKE images. This results in a rollback to the previous MKE version, which in this particular scenario can inadvertently remove the etcd/RethinkDB cluster from the MKE cluster and thus require you to restore MKE from a backup.

Workaround:

Ensure either that the manager nodes have all necessary MKE images preloaded before performing an upgrade or that they can pull the images from a remote repository.

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in Swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.10 release.

Security information

Updated the following middleware component versions to resolve vulnerabilities in MKE:

  • [MKE-11023] Calico 3.28.0

The following entries detail the specific CVEs addressed, including the resolution status, the images mitigated, and the problem details from upstream.

CVE-2023-45288 (Resolved)

Image mitigated: ucp-multus-cni

Problem details from upstream:

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

CVE-2024-33599 (Resolved)

Image mitigated: ucp-multus-cni

Problem details from upstream:

nscd: Stack-based buffer overflow in netgroup cache. If the Name Service Cache Daemon’s (nscd) fixed size cache is exhausted by client requests then a subsequent client request for netgroup data may result in a stack-based buffer overflow. This flaw was introduced in glibc 2.15 when the cache was added to nscd. This vulnerability is only present in the nscd binary.

CVE-2024-33600 (Resolved)

Image mitigated: ucp-multus-cni

Problem details from upstream:

nscd: Null pointer crashes after notfound response. If the Name Service Cache Daemon’s (nscd) cache fails to add a not-found netgroup response to the cache, the client request can result in a null pointer dereference. This flaw was introduced in glibc 2.15 when the cache was added to nscd. This vulnerability is only present in the nscd binary.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must defer upgrading until a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider becomes available.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.9

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.9, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2024-MAY-28

Name: MKE 3.7.9

Highlights: Patch release for MKE 3.7 that focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.9 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.9 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.9 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in Swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports

Upgrades of Swarm-only clusters that were originally installed using the --swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port Requirements] step.

Workaround:

Include the --force-port-check upgrade option when upgrading a Swarm-only cluster.
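
A sketch of the upgrade invocation with the option included, assuming the standard containerized upgrade command and an illustrative image tag:

docker container run --rm -it --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.7.9 \
  upgrade --interactive --force-port-check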

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.9 release.

Security information

The MKE 3.7.9 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11504] Golang 1.21.10

  • [MKE-11502] cri-dockerd 0.3.14

  • [MKE-11482] NGINX Ingress Controller 1.10.1

  • [MKE-11482] Gatekeeper 3.14.2

  • [MKE-11482] Metallb 0.14.5

  • DOCKER_EE_CLI 23.0.11~3

The following entries detail the specific CVEs addressed, including the resolution status, the images mitigated, and the problem details from upstream.

CVE-2024-24557 (Resolved)

Image mitigated: ucp-swarm

Problem details from upstream:

Moby is an open-source project created by Docker to enable software containerization. The classic builder cache system is prone to cache poisoning if the image is built FROM scratch. Also, changes to some instructions (most important being HEALTHCHECK and ONBUILD) would not cause a cache miss. An attacker with the knowledge of the Dockerfile someone is using could poison their cache by making them pull a specially crafted image that would be considered as a valid cache candidate for some build steps. 23.0+ users are only affected if they explicitly opted out of Buildkit (DOCKER_BUILDKIT=0 environment variable) or are using the /build API endpoint. All users on versions older than 23.0 could be impacted. Image build API endpoint (/build) and ImageBuild function from github.com/docker/docker/client is also affected as it the uses classic builder by default. Patches are included in 24.0.9 and 25.0.2 releases.

CVE-2023-45288 (Resolved)

Images mitigated:

  • ucp-swarm

  • ucp-azure-ip-allocator

  • ucp-gatekeeper

  • ucp-node-feature-discovery

  • ucp-metallb-controller

  • ucp-metallb-speaker

Problem details from upstream:

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must defer upgrading until a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider becomes available.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.8

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.8, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2024-MAY-6

Name: MKE 3.7.8

Highlights: Patch release for MKE 3.7 introducing the following enhancements:

  • Addition of Kubernetes log retention configuration parameters

  • Customizability of audit log policies

  • Support for scheduling of etcd cluster cleanup and defragmentation

  • Inclusion of Docker events in MKE support bundle

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.8 includes:

[MKE-11323] Addition of Kubernetes log retention configuration parameters

Audit log retention values for Kubernetes can now be customized using three new Kubernetes apiserver parameters in the MKE configuration file, as shown in the sketch following this list:

  • kube_api_server_audit_log_maxage

  • kube_api_server_audit_log_maxbackup

  • kube_api_server_audit_log_maxsize
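
The sketch below shows the three parameters together; the values are illustrative and the cluster_config placement is an assumption:

[cluster_config]
  # Illustrative values: keep audit logs for 30 days, retain up to 10 rotated
  # files, and rotate each file at 100 MB.
  kube_api_server_audit_log_maxage = 30
  kube_api_server_audit_log_maxbackup = 10
  kube_api_server_audit_log_maxsize = 100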

[MKE-11265] Customizability of audit log policies

Audit log policies can now be customized, a feature that is enabled through the KubeAPIServerCustomAuditPolicyYaml and KubeAPIServerEnableCustomAuditPolicy settings.

[MKE-9275] Support for scheduling of etcd cluster cleanup and defragmentation

Customers can now schedule etcd cluster cleanup by way of a cron job. In addition, defragmentation can be configured to start following a successful cleanup operation. This new functionality is initiated through the /api/ucp/config-toml endpoint. For more information, refer to MKE Configuration File: etcd_cleanup_schedule_config.
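
As a sketch of the workflow, the current configuration can be downloaded from the endpoint named above, edited to add the cleanup schedule, and uploaded again; the login call follows the standard MKE API pattern, and the host, credentials, and use of jq are illustrative:

# Obtain an API token (host and credentials are placeholders; requires jq)
AUTHTOKEN=$(curl -sk -d '{"username":"admin","password":"<password>"}' \
  https://<mke-host>/auth/login | jq -r .auth_token)

# Download the current configuration, edit the etcd cleanup schedule, then upload it
curl -sk -H "Authorization: Bearer $AUTHTOKEN" \
  https://<mke-host>/api/ucp/config-toml -o mke-config.toml
curl -sk -H "Authorization: Bearer $AUTHTOKEN" \
  --upload-file mke-config.toml https://<mke-host>/api/ucp/config-toml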

[FIELD-6901] Inclusion of Docker events in MKE support bundle

The MKE support bundle now includes Docker event information from the nodes.

Addressed issues

Issues addressed in the MKE 3.7.8 release include:

  • [MKE-11329] Fixed an issue wherein an incorrect version displayed for CoreDNS.

  • [MKE-11301] Fixed an issue wherein rollbacks in long running upgrade processes intermittently failed to add etcd members to the cluster.

  • [MKE-11281] Fixed an issue wherein MKE made attempts to run cAdvisor instances on unsupported Windows nodes.

  • [FIELD-6903] Fixed an issue wherein MKE upgrades from MKE 3.6.x to MKE 3.7.x failed when node ready state could not be saved.

Known issues

MKE 3.7.8 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in Swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports

Upgrades of Swarm-only clusters that were originally installed using the --swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port Requirements] step.

Workaround:

Include the --force-port-check upgrade option when upgrading a Swarm-only cluster.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.8 release.

Security information

The MKE 3.7.8 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-11477] cri-dockerd 0.3.13

  • [MKE-11428] Interlock 3.3.13

  • [MKE-11482] Blackbox Exporter 0.25.0

  • [MKE-11482] Alert Manager 0.27.0

The following entries detail the specific CVEs addressed, including the resolution status, the images mitigated, and the problem details from upstream.

CVE-2024-3651 (Resolved)

Image mitigated: ucp-sf-notifier

Problem details from upstream:

Internationalized Domain Names in Applications (IDNA) vulnerable to denial of service from specially crafted inputs to idna.encode

CVE-2023-45288 (Resolved)

Images mitigated:

  • ucp-interlock

  • ucp-interlock-extension

Problem details from upstream:

An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request’s headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must defer upgrading until a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider becomes available.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.7

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.7, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2024-APR-15

Name: MKE 3.7.7

Highlights: Patch release for MKE 3.7 that focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.7 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.7 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.7 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in Swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports

Upgrades of Swarm-only clusters that were originally installed using the --swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port Requirements] step.

Workaround:

Include the --force-port-check upgrade option when upgrading a Swarm-only cluster.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.7 release.

Security information

The MKE 3.7.7 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • DOCKER_EE_CLI 23.0.10

  • Powershell

  • docker/docker vendor

The following entries detail the specific CVEs addressed, including the resolution status, the images mitigated, and the problem details from upstream.

CVE-2024-21626 (Resolved)

Images mitigated:

  • ucp-dsinfo

  • ucp-compose

Problem details from upstream:

runc is a CLI tool for spawning and running containers on Linux according to the OCI specification. In runc 1.1.11 and earlier, due to an internal file descriptor leak, an attacker could cause a newly-spawned container process (from runc exec) to have a working directory in the host filesystem namespace, allowing for a container escape by giving access to the host filesystem (“attack 2”). The same attack could be used by a malicious image to allow a container process to gain access to the host filesystem through runc run (“attack 1”). Variants of attacks 1 and 2 could be also be used to overwrite semi-arbitrary host binaries, allowing for complete container escapes (“attack 3a” and “attack 3b”). runc 1.1.12 includes patches for this issue.

CVE-2024-0056 (Resolved)

Images mitigated:

  • ucp-dsinfo-win

  • ucp-containerd-shim-process-win

  • ucp-kube-binaries-win

  • ucp-pause-win

  • ucp-hardware-info-win

Problem details from upstream:

Microsoft.Data.SqlClient and System.Data.SqlClient SQL Data Provider Security Feature Bypass Vulnerability.

CVE-2024-24557 (Resolved)

Images mitigated:

  • ucp-agent

  • ucp-auth-store

  • ucp-controller

  • ucp-hardware-info

  • ucp

  • ucp-cfssl

Problem details from upstream:

Moby is an open-source project created by Docker to enable software containerization. The classic builder cache system is prone to cache poisoning if the image is built FROM scratch. Also, changes to some instructions (most important being HEALTHCHECK and ONBUILD) would not cause a cache miss. An attacker with the knowledge of the Dockerfile someone is using could poison their cache by making them pull a specially crafted image that would be considered as a valid cache candidate for some build steps. 23.0+ users are only affected if they explicitly opted out of Buildkit (DOCKER_BUILDKIT=0 environment variable) or are using the /build API endpoint. All users on versions older than 23.0 could be impacted. Image build API endpoint (/build) and ImageBuild function from github.com/docker/docker/client is also affected as it the uses classic builder by default. Patches are included in 24.0.9 and 25.0.2 releases.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must defer upgrading until a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider becomes available.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.6

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.6, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2024-MAR-20

Name: MKE 3.7.6

Highlights:

  • Kubernetes for GMSA now supported

  • Addition of ucp-cadvisor container level metrics component

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.6 includes:

[MKE-11046] Kubernetes for GMSA now supported

Mirantis now supports GMSA on Kubernetes in MKE. Consequently, users can generate GMSA credentials on the Kubernetes cluster and use these credentials in Pod specifications. This allows MCR to use the specified GMSA credentials while launching the Pods.

Kubernetes for GMSA functionality is off by default. To activate the function, set windows_gmsa to true in the MKE configuration file.

The implementation supports the latest specification of GMSA credentials, windows.k8s.io/v1. Before enabling this feature, ensure that there are no existing GMSA credential specs or resources using such specs.
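
A minimal sketch of enabling the feature, assuming the windows_gmsa setting sits in the cluster_config section of the MKE configuration file:

[cluster_config]
  # Enable GMSA support for Kubernetes workloads (disabled by default)
  windows_gmsa = true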

[MKE-11022] Addition of ucp-cadvisor container level metrics component

The new optional ucp-cadvisor component runs a standalone cadvisor instance on each node, which provides additional container level metrics.

To enable the ucp-cadvisor component feature, set cadvisor_enabled to true in the MKE configuration file.

Note

Currently, the ucp-cadvisor component is supported for Linux nodes only. It is not supported for Windows nodes.
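
A minimal sketch of enabling the component, assuming the cadvisor_enabled setting sits in the cluster_config section of the MKE configuration file:

[cluster_config]
  # Run a standalone cAdvisor instance on each Linux node (disabled by default)
  cadvisor_enabled = true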

Addressed issues

Issues addressed in the MKE 3.7.6 release include:

  • [MKE-11091] Fixed an issue wherein running the image list command in Swarm-only mode failed to return the ucp-hardware-info-win image.

  • [MKE-10575] Fixed an issue wherein the results of the ca --rotate --help command failed to describe etcd CA rotation.

  • [FIELD-6889] Fixed an issue wherein cri-dockerd crashed on Windows server.

  • [FIELD-6810] Fixed an issue wherein MKE could not be installed on a FIPS-enabled Swarm cluster.

  • [FIELD-6785] Fixed an issue wherein the reinstallation of MKE failed following rotation of cluster ROOT CA.

Known issues

MKE 3.7.6 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in Swarm-only mode, manager nodes start off labeled as mixed mode. However, because the Kubernetes installation is skipped altogether, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[FIELD-6785] Reinstallation can fail following cluster CA rotation

If MKE 3.7.x is uninstalled soon after rotating the cluster CA, reinstalling MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following error message:

unable to sign cert: {\"code\":1000,\"message\":\"x509: provided PrivateKey doesn't match parent's PublicKey\"}"

Workaround:

  1. Forcefully trigger swarm snapshot:

    old_val=$(docker info --format '{{.Swarm.Cluster.Spec.Raft.SnapshotInterval}}')
    docker swarm update --snapshot-interval 1
    docker swarm update --snapshot-interval ${old_val}
    
  2. Reattempt to install MKE.

[FIELD-6402] Default metric collection memory settings may be insufficient

In MKE 3.7, ucp-metrics collects more metrics than in previous versions of MKE. As such, for large clusters with many nodes, the following ucp-metrics component default settings may be insufficient:

  • memory request: 1Gi

  • memory limit: 2Gi

Workaround:

Administrators can modify the MKE configuration file to increase the default memory request and memory limit setting values for the ucp-metrics component. The settings to configure are both under the cluster section, as shown in the sketch following this list:

  • For memory request, modify the prometheus_memory_request setting

  • For memory limit, modify the prometheus_memory_limit setting
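
A sketch with illustrative values, assuming the keys reside in the cluster_config section of the MKE configuration file:

[cluster_config]
  # Illustrative values for a large cluster; size these to your node inventory.
  prometheus_memory_request = "2Gi"
  prometheus_memory_limit = "4Gi"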

[MKE-11281] cAdvisor Pods on Windows nodes cannot enter ‘Running’ state

When you enable cAdvisor, Pods are deployed to every node in the cluster. These cAdvisor Pods only work on Linux nodes, however, so the Pods that are inadvertently targeted to Windows nodes remain perpetually suspended and never actually run.


Workaround:

Update the DaemonSet so that only Linux nodes are targeted by patching the ucp-cadvisor DaemonSet to include a node selector for Linux:

kubectl patch daemonset ucp-cadvisor -n kube-system --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/nodeSelector", "value": {"kubernetes.io/os": "linux"}}]'

[MKE-11282] --swarm-only upgrade fails due to ‘unavailable’ manager ports

Upgrades of Swarm-only clusters that were originally installed using the --swarm-only option fail pre-upgrade checks at the Check 7 of 8: [Port Requirements] step.

Workaround:

Include the --force-port-check upgrade option when upgrading a Swarm-only cluster.

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.6 release.

Security information

Updated the following middleware component versions to resolve vulnerabilities in MKE:

  • [MKE-11210] cri-dockerd 0.3.11

  • [MKE-11023] Calico 3.27.0/Calico for Windows 3.27.0

  • [MKE-11004] NGINX Ingress Controller 1.10.0

The following entries detail the specific CVEs addressed, including the resolution status, the images mitigated, and the problem details from upstream.

CVE-2023-44487 (Resolved)

Image mitigated: ucp-hyperkube

Problem details from upstream:

The HTTP/2 protocol allows a denial of service (server resource consumption) because request cancellation can reset many streams quickly, as exploited in the wild in August through October 2023.

CVE-2023-21626 (Resolved)

Images mitigated:

  • ucp-nvidia-gpu-feature-discovery

  • ucp-node-feature-discovery

Problem details from upstream:

Cryptographic issue in HLOS due to improper authentication while performing key velocity checks using more than one key.

CVE-2024-0727 (Resolved)

Image mitigated: ucp-kube-ingress-controller

Problem details from upstream:

Processing a maliciously formatted PKCS12 file may lead OpenSSL to crash leading to a potential Denial of Service attack Impact summary: Applications loading files in the PKCS12 format from untrusted sources might terminate abruptly. A file in PKCS12 format can contain certificates and keys and may come from an untrusted source. The PKCS12 specification allows certain fields to be NULL, but OpenSSL does not correctly check for this case. This can lead to a NULL pointer dereference that results in OpenSSL crashing. If an application processes PKCS12 files from an untrusted source using the OpenSSL APIs then that application will be vulnerable to this issue. OpenSSL APIs that are vulnerable to this are: PKCS12_parse(), PKCS12_unpack_p7data(), PKCS12_unpack_p7encdata(), PKCS12_unpack_authsafes() and PKCS12_newpass(). We have also fixed a similar issue in SMIME_write_PKCS7(). However since this function is related to writing data we do not consider it security significant. The FIPS modules in 3.2, 3.1 and 3.0 are not affected by this issue.

CVE-2023-5528 (Resolved)

Image mitigated: ucp-calico-node

Problem details from upstream:

A security issue was discovered in Kubernetes where a user that can create pods and persistent volumes on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they are using an in-tree storage plugin for Windows nodes.

CVE-2023-45142 (Resolved)

Image mitigated: ucp-hyperkube

Problem details from upstream:

OpenTelemetry-Go Contrib is a collection of third-party packages for OpenTelemetry-Go. A handler wrapper out of the box adds labels http.user_agent and http.method that have unbound cardinality. It leads to the server’s potential memory exhaustion when many malicious requests are sent to it. HTTP header User-Agent or HTTP method for requests can be easily set by an attacker to be random and long. The library internally uses httpconv.ServerRequest that records every value for HTTP method and User-Agent. In order to be affected, a program has to use the otelhttp.NewHandler wrapper and not filter any unknown HTTP methods or User agents on the level of CDN, LB, previous middleware, etc. Version 0.44.0 fixed this issue when the values collected for attribute http.request.method were changed to be restricted to a set of well-known values and other high cardinality attributes were removed. As a workaround to stop being affected, otelhttp.WithFilter() can be used, but it requires manual careful configuration to not log certain requests entirely. For convenience and safe usage of this library, it should by default mark with the label unknown non-standard HTTP methods and User agents to show that such requests were made but do not increase cardinality. In case someone wants to stay with the current behavior, library API should allow to enable it.

CVE-2023-47108 (Resolved)

Image mitigated: ucp-hyperkube

Problem details from upstream:

OpenTelemetry-Go Contrib is a collection of third-party packages for OpenTelemetry-Go. Prior to version 0.46.0, the grpc Unary Server Interceptor out of the box adds labels net.peer.sock.addr and net.peer.sock.port that have unbound cardinality. It leads to the server’s potential memory exhaustion when many malicious requests are sent. An attacker can easily flood the peer address and port for requests. Version 0.46.0 contains a fix for this issue. As a workaround to stop being affected, a view removing the attributes can be used. The other possibility is to disable grpc metrics instrumentation by passing otelgrpc.WithMeterProvider option with noop.NewMeterProvider.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must defer upgrading until a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider becomes available.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.5

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster uses the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses the AWS in-tree cloud provider to MKE 3.7.5, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date

Name

Highlights

2024-MAR-05

MKE 3.7.5

  • etcd alarms are exposed through Prometheus metrics

  • Augmented validation for etcd storage quota

  • Improved handling of larger sized etcd instances

  • All errors now returned from pre-upgrade checks

  • Minimum Docker storage requirement now part of pre-upgrade checks

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.5 includes:

[MKE-10834] etcd alarms are exposed through Prometheus metrics

The NOSPACE and CORRUPT alarms generated by etcd are now exposed through Prometheus metrics. In addition, the alertmanager now sends an alert in the event of a NOSPACE alarm.

[MKE-10833] Augmented validation for etcd storage quota

Validation for etcd storage quota has been extended in terms of minimum quota, maximum quota, and current database size.

  • Minimum quota validation: The system now enforces a minimum etcd storage quota of 2GB.

  • Maximum quota validation: etcd storage quotas can no longer exceed 8GB.

  • Current dbSize validation: Validation checks are now in place to verify whether the current database size exceeds the specified etcd storage quota across all etcd cluster members.
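
As a point of reference for these checks, the current database size of an etcd member can be inspected with etcdctl. The following is a minimal sketch only: it assumes etcdctl v3 is available inside an etcd member (for example, the ucp-kv container on a manager node), and the certificate paths shown are placeholders that must be replaced with the paths in use in your deployment.

    ETCDCTL_API=3 etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=<path-to-ca.pem> --cert=<path-to-cert.pem> --key=<path-to-key.pem> \
      endpoint status --write-out=table

The DB SIZE column in the resulting table reflects the physically allocated database size that is evaluated against the configured storage quota.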

[MKE-10684] SAML CA certificate can now be reset with the DELETE eNZi endpoint

The SAML configuration can now be removed by issuing a DELETE request to https://{cluster}/enzi/v0/config/auth/saml.
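
For example, the request can be issued with a standard HTTPS client. The following curl invocation is a minimal sketch only: it assumes an MKE admin session token is available in the AUTHTOKEN environment variable and that <mke-host> is replaced with the address of an MKE controller.

    curl -sk -X DELETE \
      -H "Authorization: Bearer $AUTHTOKEN" \
      https://<mke-host>/enzi/v0/config/auth/saml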

[MKE-10070] All errors now returned from pre-upgrade checks

All pre-upgrade checks are now run to completion, after which a comprehensive list of failures is returned. Previously, a failure in any sub-step would result in the exit of the pre-upgrade check routine and the return of a single error. As such, any issues in the environment can be triaged in a single run.

[MKE-9946] Minimum Docker storage requirement now part of pre-upgrade checks

The pre-upgrade checks now verify that the minimum Docker storage requirement of 25GB is met.
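
To check the requirement ahead of an upgrade, the available space on the Docker data root of a node can be inspected directly. The following sketch assumes the default data root of /var/lib/docker; adjust the path if your deployment uses a custom data-root setting.

    df -h /var/lib/docker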

[FIELD-6695] Improved handling of larger sized etcd instances

Now, when etcd storage usage exceeds the quota, there are steps that allow for the easy fixing and recovery of the MKE cluster. In addition, MKE web UI banners have been added to indicate etcd alarms and to inform the user when the storage quota setting exceeds 40% of the total node memory.

Addressed issues

Issues addressed in the MKE 3.7.5 release include:

  • [MKE-10903] Fixed an issue wherein ucp-worker-agent flooded the logs with messages.

  • [MKE-10835] Fixed an issue wherein Gatekeeper Pods were frequently entering CrashLoopBackOff due to the absence of an expected CRD definition.

  • [MKE-10644] Fixed an issue wherein a vulnerability in etcd library go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc allowed a memory exhaustion attack on the grpc server. The patch is applied on top of etcd version 3.5.10 and includes the fix for CVE-2023-47108. No other changes from etcd 3.5.10 are included.

  • [FIELD-6835] Fixed an issue wherein ucp-cluster-agent continually restarted during upgrade to MKE 3.7.4.

  • [FIELD-6695] Fixed multiple issues caused by large etcd instances:

    • The addition of a second manager sometimes caused etcd cluster failure when the etcd storage size was of substantial size.

    • Cold starting of MKE clusters would fail whenever the etcd storage size exceeded 2GB.

    • Simultaneous restarting of manager nodes would fail whenever the etcd storage size exceeded 2GB.

    • The etcd storage usage indicator in the MKE web UI banner was not accurate.

  • [FIELD-6670] Fixed an issue wherein CNI plugin log level was not consistent with the MKE log level setting.

  • [FIELD-6602] Resolved the couldn't get dbus connection: dial unix /var/run/dbus/system_bus_socket error message by removing the systemd collector, and repaired udev mount errors in Node Exporter.

  • [FIELD-6598] Fixed an issue wherein users without access to the /system collection could promote worker nodes to manager nodes.

  • [FIELD-6573] Fixed an issue wherein kubelet failed to rotate Pod container logs.

Known issues

MKE 3.7.5 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.
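
The option is passed to the MKE upgrade command. The following invocation is a sketch only: it assumes the usual pattern of running the MKE bootstrapper image on a manager node, and any additional upgrade flags required by your environment (for example, credentials or interactive mode) must be supplied as appropriate.

    docker container run --rm -it --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.7.5 upgrade --manual-worker-upgrade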

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm only mode, manager nodes start off in mixed mode. As Kubernetes installation is skipped altogether, however, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.
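
For the second option, the MTU of an existing VPC can typically be changed with the gcloud CLI. The following is a sketch only: my-vpc is a placeholder network name, and the available flags should be verified against your gcloud version.

    gcloud compute networks update my-vpc --mtu=1500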

[FIELD-6785] Reinstallation can fail following cluster CA rotation

If MKE 3.7.x is uninstalled soon after rotating the cluster CA, re-installing MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following error message:

unable to sign cert: {\"code\":1000,\"message\":\"x509: provided PrivateKey doesn't match parent's PublicKey\"}"

Workaround:

  1. Forcefully trigger swarm snapshot:

    old_val=$(docker info --format '{{.Swarm.Cluster.Spec.Raft.SnapshotInterval}}')
    docker swarm update --snapshot-interval 1
    docker swarm update --snapshot-interval ${old_val}
    
  2. Reattempt to install MKE.

[FIELD-6402] Default metric collection memory settings may be insufficient

In MKE 3.7, ucp-metrics collects more metrics than in previous versions of MKE. As such, for large clusters with many nodes, the following ucp-metrics component default settings may be insufficient:

  • memory request: 1Gi

  • memory limit: 2Gi

Workaround:

Administrators can modify the MKE configuration file to increase the default memory request and memory limit setting values for the ucp-metrics component. The settings to configure are both under the cluster section:

  • For memory request, modify the prometheus_memory_request setting

  • For memory limit, modify the prometheus_memory_limit setting
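
A minimal sketch of the corresponding MKE configuration file fragment follows. The [cluster_config] placement mirrors the other TOML examples in these release notes, and the values shown are illustrative only; confirm the exact section and value format against the MKE configuration reference.

    [cluster_config]
        prometheus_memory_request = "2Gi"
        prometheus_memory_limit = "4Gi"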

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.5 release.

Security information

Updated the following middleware component versions to resolve vulnerabilities in MKE:

  • [MKE-10828] cri-dockerd 0.3.9

  • [MKE-10715] Azure VNET CNI 1.5.15

  • [FIELD-6602] Node Exporter 1.7.0

  • [MKE-10663] Docker EE CLI 23.0.8

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2023-5528

Resolved

  • ucp-dsinfo

  • ucp-compose

A security issue was discovered in Kubernetes where a user that can create pods and persistent volumes on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they are using an in-tree storage plugin for Windows nodes.

CVE-2023-3676

Resolved

  • ucp-dsinfo

  • ucp-compose

A security issue was discovered in Kubernetes where a user that can create pods on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they include Windows nodes.

CVE-2023-3955

Resolved

  • ucp-dsinfo

  • ucp-compose

A security issue was discovered in Kubernetes where a user that can create pods on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they include Windows nodes.

CVE-2023-47108

Resolved

  • ucp-dsinfo

  • ucp-compose

OpenTelemetry-Go Contrib is a collection of third-party packages for OpenTelemetry-Go. Prior to version 0.46.0, the grpc Unary Server Interceptor out of the box adds labels net.peer.sock.addr and net.peer.sock.port that have unbound cardinality. It leads to the server’s potential memory exhaustion when many malicious requests are sent. An attacker can easily flood the peer address and port for requests. Version 0.46.0 contains a fix for this issue. As a workaround to stop being affected, a view removing the attributes can be used. The other possibility is to disable grpc metrics instrumentation by passing otelgrpc.WithMeterProvider option with noop.NewMeterProvider.

CVE-2023-45142

Resolved

  • ucp-dsinfo

  • ucp-compose

OpenTelemetry-Go Contrib is a collection of third-party packages for OpenTelemetry-Go. A handler wrapper out of the box adds labels http.user_agent and http.method that have unbound cardinality. It leads to the server’s potential memory exhaustion when many malicious requests are sent to it. HTTP header User-Agent or HTTP method for requests can be easily set by an attacker to be random and long. The library internally uses httpconv.ServerRequest that records every value for HTTP method and User-Agent. In order to be affected, a program has to use the otelhttp.NewHandler wrapper and not filter any unknown HTTP methods or User agents on the level of CDN, LB, previous middleware, etc. Version 0.44.0 fixed this issue when the values collected for attribute http.request.method were changed to be restricted to a set of well-known values and other high cardinality attributes were removed. As a workaround to stop being affected, otelhttp.WithFilter() can be used, but it requires manual careful configuration to not log certain requests entirely. For convenience and safe usage of this library, it should by default mark with the label unknown non-standard HTTP methods and User agents to show that such requests were made but do not increase cardinality. In case someone wants to stay with the current behavior, library API should allow to enable it.

CVE-2023-44487

Resolved

  • ucp-dsinfo

  • ucp-compose

The HTTP/2 protocol allows a denial of service (server resource consumption) because request cancellation can reset many streams quickly, as exploited in the wild in August through October 2023.

CVE-2023-6779

Resolved

  • ucp-multus-cni

An off-by-one heap-based buffer overflow was found in the __vsyslog_internal function of the glibc library. This function is called by the syslog and vsyslog functions. This issue occurs when these functions are called with a message bigger than INT_MAX bytes, leading to an incorrect calculation of the buffer size to store the message, resulting in an application crash. This issue affects glibc 2.37 and newer.

CVE-2023-6246

Resolved

  • ucp-multus-cni

A heap-based buffer overflow was found in the __vsyslog_internal function of the glibc library. This function is called by the syslog and vsyslog functions. This issue occurs when the openlog function was not called, or called with the ident argument set to NULL, and the program name (the basename of argv[0]) is bigger than 1024 bytes, resulting in an application crash or local privilege escalation. This issue affects glibc 2.36 and newer.

CVE-2023-6780

Resolved

  • ucp-multus-cni

An integer overflow was found in the __vsyslog_internal function of the glibc library. This function is called by the syslog and vsyslog functions. This issue occurs when these functions are called with a very long message, leading to an incorrect calculation of the buffer size to store the message, resulting in undefined behavior. This issue affects glibc 2.37 and newer.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must defer upgrading until a version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider is available.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.4 (discontinued)

(2024-JAN-31)

Warning

MKE 3.7.4 was discontinued shortly after release due to issues encountered when upgrading to it from previous versions of the product.

3.7.3

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.3, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date

Name

Highlights

2023-DEC-04

MKE 3.7.3

Patch release for MKE 3.7 that focuses exclusively on CVE resolution. For detail on the specific CVEs addressed, refer to Security information.

Enhancements

The MKE 3.7.3 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Addressed issues

The MKE 3.7.3 patch release focuses exclusively on CVE mitigation. For detail on the specific CVEs addressed, refer to Security information.

Known issues

MKE 3.7.3 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm only mode, manager nodes start off in mixed mode. As Kubernetes installation is skipped altogether, however, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[FIELD-6785] Reinstallation can fail following cluster CA rotation

If MKE 3.7.x is uninstalled soon after rotating the cluster CA, re-installing MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following error message:

unable to sign cert: {\"code\":1000,\"message\":\"x509: provided PrivateKey doesn't match parent's PublicKey\"}"

Workaround:

  1. Forcefully trigger swarm snapshot:

    old_val=$(docker info --format '{{.Swarm.Cluster.Spec.Raft.SnapshotInterval}}')
    docker swarm update --snapshot-interval 1
    docker swarm update --snapshot-interval ${old_val}
    
  2. Reattempt to install MKE.

[FIELD-6402] Default metric collection memory settings may be insufficient

In MKE 3.7, ucp-metrics collects more metrics than in previous versions of MKE. As such, for large clusters with many nodes, the following ucp-metrics component default settings may be insufficient:

  • memory request: 1Gi

  • memory limit: 2Gi

Workaround:

Administrators can modify the MKE configuration file to increase the default memory request and memory limit setting values for the ucp-metrics component. The settings to configure are both under the cluster section:

  • For memory request, modify the prometheus_memory_request setting

  • For memory limit, modify the prometheus_memory_limit setting

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.3 release.

Security information

The MKE 3.7.3 patch release focuses exclusively on CVE mitigation. To this end, the following middleware component versions have been upgraded to resolve vulnerabilities in MKE:

  • [MKE-10346] Interlock 3.3.12

  • [MKE-10682] Calico 3.26.4/Calico for Windows 3.26.4

  • [SECMKE-113] cri-dockerd 0.3.7

  • [FIELD-6558] NGINX Ingress Controller 1.9.4

  • [MKE-10340] CoreDNS 1.11.1

  • [MKE-10309] Prometheus 2.48.0

  • [SECMKE-122] NVIDIA GPU Feature Discovery 0.8.2

  • [MKE-10586] Gatekeeper 3.13.4

The following table details the specific CVEs addressed, including which images are affected per CVE.

CVE

Status

Image mitigated

Problem details from upstream

CVE-2022-4886

Resolved

  • ucp-kube-ingress-controller

Ingress-nginx path sanitization can be bypassed with log_format directive.

Mitigation in MKE ingress controller was achieved by setting strict-validate-path-type and enable-annotation-validation. In addition, you can use OPA Gatekeeper in MKE 3.7.x to enforce stricter validation.

CVE-2023-3676

Partially resolved

  • ucp-nvidia-device-plugin

  • ucp-nvidia-gpu-feature-discovery

A security issue was discovered in Kubernetes where a user that can create pods on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they include Windows nodes.

CVE-2023-3955

Partially resolved

  • ucp-nvidia-device-plugin

  • ucp-nvidia-gpu-feature-discovery

A security issue was discovered in Kubernetes where a user that can create pods on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they include Windows nodes.

CVE-2023-5043

Resolved

  • ucp-kube-ingress-controller

Ingress nginx annotation injection causes arbitrary command execution.

Mitigation in MKE ingress controller was achieved by setting strict-validate-path-type and enable-annotation-validation. In addition, you can use OPA Gatekeeper in MKE 3.7.x to enforce stricter validation.

CVE-2023-5044

Resolved

  • ucp-kube-ingress-controller

Code injection via nginx.ingress.kubernetes.io/permanent-redirect annotation.

Mitigation in MKE ingress controller was achieved by setting strict-validate-path-type and enable-annotation-validation. In addition, you can use OPA Gatekeeper in MKE 3.7.x to enforce stricter validation.

CVE-2023-5528

Resolved

  • ucp-nvidia-gpu-feature-discovery

A security issue was discovered in Kubernetes where a user that can create pods and persistent volumes on Windows nodes may be able to escalate to admin privileges on those nodes. Kubernetes clusters are only affected if they are using an in-tree storage plugin for Windows nodes.

CVE-2023-39325

Partially resolved

  • ucp-alertmanager

  • ucp-azure-ip-allocator

  • ucp-blackbox-exporter

  • ucp-calico-cni

  • ucp-calico-kube-controllers

  • ucp-calico-node

  • ucp-coredns

  • ucp-gatekeeper

  • ucp-hyperkube

  • ucp-interlock

  • ucp-interlock-extension

  • ucp-kube-ingress-controller

  • ucp-kube-state-metrics

  • ucp-metallb-controller

  • ucp-metallb-speaker

  • ucp-metrics

  • ucp-metrics-swarm-only

  • ucp-node-exporter

A malicious HTTP/2 client which rapidly creates requests and immediately resets them can cause excessive server resource consumption. While the total number of requests is bounded by the http2.Server.MaxConcurrentStreams setting, resetting an in-progress request allows the attacker to create a new request while the existing one is still executing. With the fix applied, HTTP/2 servers now bound the number of simultaneously executing handler goroutines to the stream concurrency limit (MaxConcurrentStreams). New requests arriving when at the limit (which can only happen after the client has reset an existing, in-flight request) will be queued until a handler exits. If the request queue grows too large, the server will terminate the connection. This issue is also fixed in golang.org/x/net/http2 for users manually configuring HTTP/2. The default stream concurrency limit is 250 streams (requests) per HTTP/2 connection. This value may be adjusted using the golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams setting and the ConfigureServer function.

CVE-2023-44487

Partially resolved

  • ucp

  • ucp-agent

  • ucp-alertmanager

  • ucp-azure-ip-allocator

  • ucp-blackbox-exporter

  • ucp-calico-cni

  • ucp-calico-kube-controllers

  • ucp-calico-node

  • ucp-containerd-shim-process

  • ucp-controller

  • ucp-coredns

  • ucp-gatekeeper

  • ucp-hyperkube

  • ucp-interlock

  • ucp-interlock-extension

  • ucp-kube-ingress-controller

  • ucp-kube-state-metrics

  • ucp-metallb-controller

  • ucp-metallb-speaker

  • ucp-metrics

  • ucp-metrics-swarm-only

  • ucp-multus-cni

  • ucp-node-exporter

  • ucp-nvidia-device-plugin

  • ucp-nvidia-gpu-feature-discovery

The HTTP/2 protocol allows a denial of service (server resource consumption) because request cancellation can reset many streams quickly, as exploited in the wild in August through October 2023.

CVE-2023-45142

Partially resolved

  • ucp-hyperkube

  • ucp-metrics

  • ucp-metrics-swarm-only

OpenTelemetry-Go Contrib is a collection of third-party packages for OpenTelemetry-Go. A handler wrapper out of the box adds labels http.user_agent and http.method that have unbound cardinality. It leads to the server’s potential memory exhaustion when many malicious requests are sent to it. HTTP header User-Agent or HTTP method for requests can be easily set by an attacker to be random and long. The library internally uses httpconv.ServerRequest that records every value for HTTP method and User-Agent. In order to be affected, a program has to use the otelhttp.NewHandler wrapper and not filter any unknown HTTP methods or User agents on the level of CDN, LB, previous middleware, etc. Version 0.44.0 fixed this issue when the values collected for attribute http.request.method were changed to be restricted to a set of well-known values and other high cardinality attributes were removed. As a workaround to stop being affected, otelhttp.WithFilter() can be used, but it requires manual careful configuration to not log certain requests entirely. For convenience and safe usage of this library, it should by default mark with the label unknown non-standard HTTP methods and User agents to show that such requests were made but do not increase cardinality. In case someone wants to stay with the current behavior, library API should allow to enable it.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must defer upgrading until a version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider is available.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.2

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.2, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date

Name

Highlights

2023-NOV-20

MKE 3.7.2

Patch release for MKE 3.7 introducing the following enhancements:

  • Prometheus metrics scraped from Linux workers

  • Performance improvement to MKE image tagging API

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.2 includes:

[MKE-10039] Prometheus metrics scraped from Linux workers

Prometheus node exporter metrics are now scraped from Linux workers, whereas previously they were only scraped from controllers.

[FIELD-6492] Performance improvement to MKE image tagging API

Improvements have been made to the performance of the MKE image tagging API in large clusters with many nodes.

Addressed issues

Issues addressed in the MKE 3.7.2 release include:

  • [FIELD-6453] Fixed an issue wherein users assigned as admins lost their admin privileges after conducting an LDAP sync with JIT enabled.

  • [FIELD-6446] Fixed an issue wherein the /ucp/etcd/info API endpoint incorrectly displayed the size of objects stored in the database, rather than the actual size of the database on disk. This fix introduces the DbSizeInUse field alongside the existing DbSize field, which ensures the accurate reporting of both the logically used size and the physically allocated size of the backend database.

  • [FIELD-6437] Fixed an issue with the MKE configuration setting cluster_config.metallb_config.metallb_ip_addr_pool.name wherein the name was not verified against RFC-1123 label names.

  • [FIELD-6353] Fixed an issue wherein the accounts/<org>/members API would provide incomplete results when requesting non-admins.

  • [MKE-10267] Fixed three instances of CVE-2023-4911, rated High, which were detected on glibc-related components in the ucp-calico-node image.

  • [MKE-10231] Fixed an issue wherein clusters were left in an inoperable state following either:

    • Upgrade of MKE 3.7.0 clusters installed with the --multus-cni argument to MKE 3.7.1

    • Installation of a fresh MKE 3.7.1 cluster with the --multus-cni argument

  • [MKE-10204] Fixed an issue whereby the ucp images --list command returned all images, including those that are swarm-only. Now the swarm-only images are only returned when the --swarm-only flag is included.

  • [MKE-10202] Fixed an issue whereby in swarm-only mode workers were attempting to run a Kubernetes component following an MKE upgrade.

  • [MKE-10032] Fixed an issue wherein MKE debug levels were not applied to cri-dockerd logs.

  • [MKE-10031] Fixed an issue wherein Calico for Windows was continuously writing to the cri-dockerd logs.

Known issues

MKE 3.7.2 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8662] Swarm only manager nodes are labeled as mixed mode

When MKE is installed in swarm only mode, manager nodes start off in mixed mode. As Kubernetes installation is skipped altogether, however, they should be labeled as swarm mode.

Workaround: Change the labels following installation.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[FIELD-6785] Reinstallation can fail following cluster CA rotation

If MKE 3.7.x is uninstalled soon after rotating the cluster CA, re-installing MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following error message:

unable to sign cert: {\"code\":1000,\"message\":\"x509: provided PrivateKey doesn't match parent's PublicKey\"}"

Workaround:

  1. Forcefully trigger swarm snapshot:

    old_val=$(docker info --format '{{.Swarm.Cluster.Spec.Raft.SnapshotInterval}}')
    docker swarm update --snapshot-interval 1
    docker swarm update --snapshot-interval ${old_val}
    
  2. Reattempt to install MKE.

[FIELD-6402] Default metric collection memory settings may be insufficient

In MKE 3.7, ucp-metrics collects more metrics than in previous versions of MKE. As such, for large clusters with many nodes, the following ucp-metrics component default settings may be insufficient:

  • memory request: 1Gi

  • memory limit: 2Gi

Workaround:

Administrators can modify the MKE configuration file to increase the default memory request and memory limit setting values for the ucp-metrics component. The settings to configure are both under the cluster section:

  • For memory request, modify the prometheus_memory_request setting

  • For memory limit, modify the prometheus_memory_limit setting

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.2 release.

Security information

  • Updated the following middleware component versions to resolve vulnerabilities in MKE:

    • [MKE-10268] cri-dockerd 0.3.5

    • [MKE-10205] NGINX Ingress Controller 1.9.3

  • Resolved CVEs, as detailed:

    CVE

    Status

    Problem details from upstream

    CVE-2023-39325

    Resolved

    A malicious HTTP/2 client which rapidly creates requests and immediately resets them can cause excessive server resource consumption. While the total number of requests is bounded by the http2.Server.MaxConcurrentStreams setting, resetting an in-progress request allows the attacker to create a new request while the existing one is still executing. With the fix applied, HTTP/2 servers now bound the number of simultaneously executing handler goroutines to the stream concurrency limit (MaxConcurrentStreams). New requests arriving when at the limit (which can only happen after the client has reset an existing, in-flight request) will be queued until a handler exits. If the request queue grows too large, the server will terminate the connection. This issue is also fixed in golang.org/x/net/http2 for users manually configuring HTTP/2. The default stream concurrency limit is 250 streams (requests) per HTTP/2 connection. This value may be adjusted using the golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams setting and the ConfigureServer function.

    CVE-2023-44487

    Resolved

    The HTTP/2 protocol allows a denial of service (server resource consumption) because request cancellation can reset many streams quickly, as exploited in the wild in August through October 2023.

    MKE

    MKE provides the disable_http2 TOML configuration setting to disable http2 on the ucp-controller and other components that may be exposed to untrusted clients, regardless of authentication status. Setting this to true results in MKE's HTTP servers excluding http2 from the HTTP versions offered for negotiation, and in clients being unable to negotiate the http2 protocol even if they explicitly specify it.

    The setting is disabled by default. It is provided as a defense-in-depth mechanism in the event that an unforeseen scenario requires untrusted clients to hit ports used by any of MKE's HTTP servers. The best defense continues to be disallowing untrusted clients from hitting any of the ports in use by MKE.

    Kubernetes

    Upstream Kubernetes has introduced the feature gate UnauthenticatedHTTP2DOSMitigation to mitigate against http2 rapid reset attacks on the kube-apiserver. The mitigation applies strictly to scenarios where the kube-apiserver communicates with an unauthenticated or anonymous client. It specifically does not apply to authenticated clients of the kube-apiserver.

    From upstream Kubernetes documentation: “Since this change has the potential to cause issues, the UnauthenticatedHTTP2DOSMitigation feature gate can be disabled to remove this protection (which is enabled by default). For example, when the API server is fronted by an L7 load balancer that is set up to mitigate http2 attacks, unauthenticated clients could force disable connection reuse between the load balancer and the API server (many incoming connections could share the same backend connection). An API server that is on a private network may opt to disable this protection to prevent performance regressions for unauthenticated clients.”

    To enable the configuration, MKE has introduced the unauthenticated_http2_dos_mitigation configuration setting. Set it to true to enable this feature gate.
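
    A minimal configuration sketch that exercises both settings follows. The [cluster_config] placement is an assumption based on the other TOML examples in these release notes; verify the exact section names against the MKE configuration reference before applying.

    [cluster_config]
        # Assumed placement; excludes http2 from protocol negotiation on MKE HTTP servers
        disable_http2 = true
        # Enables the upstream UnauthenticatedHTTP2DOSMitigation feature gate for kube-apiserver
        unauthenticated_http2_dos_mitigation = true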

    CVE-2023-38545

    Resolved

    This flaw makes curl overflow a heap based buffer in the SOCKS5 proxy handshake. When curl is asked to pass along the host name to the SOCKS5 proxy to allow that to resolve the address instead of it getting done by curl itself, the maximum length that host name can be is 255 bytes. If the host name is detected to be longer, curl switches to local name resolving and instead passes on the resolved address only. Due to this bug, the local variable that means “let the host resolve the name” could get the wrong value during a slow SOCKS5 handshake, and contrary to the intention, copy the too long host name to the target buffer instead of copying just the resolved address there. The target buffer being a heap based buffer, and the host name coming from the URL that curl has been told to operate with.

    CVE-2023-38546

    Resolved

    This flaw allows an attacker to insert cookies at will into a running program using libcurl, if the specific series of conditions are met. libcurl performs transfers. In its API, an application creates “easy handles” that are the individual handles for single transfers. libcurl provides a function call that duplicates an easy handle, called curl_easy_duphandle (https://curl.se/libcurl/c/curl_easy_duphandle.html). If a transfer has cookies enabled when the handle is duplicated, the cookie-enable state is also cloned - but without cloning the actual cookies. If the source handle did not read any cookies from a specific file on disk, the cloned version of the handle would instead store the file name as none (using the four ASCII letters, no quotes). Subsequent use of the cloned handle that does not explicitly set a source to load cookies from would then inadvertently load cookies from a file named none - if such a file exists and is readable in the current directory of the program using libcurl. And if using the correct file format of course.

    CVE-2023-38039

    Resolved

    When curl retrieves an HTTP response, it stores the incoming headers so that they can be accessed later via the libcurl headers API. However, curl did not have a limit in how many or how large headers it would accept in a response, allowing a malicious server to stream an endless series of headers and eventually cause curl to run out of heap memory.

    CVE-2022-26788

    Resolved

    PowerShell Elevation of Privilege Vulnerability.

    CVE-2023-4911

    Resolved

    A buffer overflow was discovered in the GNU C Library’s dynamic loader ld.so while processing the GLIBC_TUNABLES environment variable. This issue could allow a local attacker to use maliciously crafted GLIBC_TUNABLES environment variables when launching binaries with SUID permission to execute code with elevated privileges.

    CVE-2023-45142

    Partially resolved

    OpenTelemetry-Go Contrib is a collection of third-party packages for OpenTelemetry-Go. A handler wrapper out of the box adds labels http.user_agent and http.method that have unbound cardinality. It leads to the server’s potential memory exhaustion when many malicious requests are sent to it. HTTP header User-Agent or HTTP method for requests can be easily set by an attacker to be random and long. The library internally uses httpconv.ServerRequest that records every value for HTTP method and User-Agent. In order to be affected, a program has to use the otelhttp.NewHandler wrapper and not filter any unknown HTTP methods or User agents on the level of CDN, LB, previous middleware, etc. Version 0.44.0 fixed this issue when the values collected for attribute http.request.method were changed to be restricted to a set of well-known values and other high cardinality attributes were removed. As a workaround to stop being affected, otelhttp.WithFilter() can be used, but it requires manual careful configuration to not log certain requests entirely. For convenience and safe usage of this library, it should by default mark with the label unknown non-standard HTTP methods and User agents to show that such requests were made but do not increase cardinality. In case someone wants to stay with the current behavior, library API should allow to enable it.

    MKE is unaffected by CVE-2023-45142; however, some code scanners may still detect the following CVE/Image combinations:

    • CVE-2023-45142/ucp-hyperkube/kubelet

    • CVE-2023-45142/ucp-hyperkube//usr/local/bin/kube-apiserver

    • CVE-2023-45142/ucp-hyperkube//usr/local/bin/kube-scheduler

    • CVE-2023-45142/ucp-hyperkube//usr/local/bin/kube-proxy

    • CVE-2023-45142/ucp-hyperkube//usr/local/bin/kube-controller-manager

  • Mirantis has begun an initiative to align MKE with CIS Benchmarks, where pertinent. The following table details the CIS Benchmark resolutions and improvements that are introduced in MKE 3.7.2:

    CIS Benchmark type/version

    Recommendation designation

    Ticket

    Resolution/Improvement

    Kubernetes 1.7

    1.1.9

    MKE-9909

    File permissions for the CNI config file are set to 600.

    Kubernetes 1.7

    1.1.12

    MKE-10150

    The AlwaysPullImages admission control plugin, disabled by default, can now be enabled. To do so, edit the k8s_always_pull_images_ac_enabled parameter in the cluster_config section of the MKE configuration file.
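
    A sketch of enabling the plugin follows; the parameter name and its placement in the cluster_config section are as stated above, while the formatting mirrors the other TOML examples in these release notes.

    [cluster_config]
        k8s_always_pull_images_ac_enabled = true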

    Kubernetes 1.7

    1.1.15

    MKE-9907

    File permissions for the kube-scheduler configuration file are restricted to 600.

    Kubernetes 1.7

    1.2.22

    MKE-9902

    The kubernetes apiserver --request-timeout argument can be set.

    Kubernetes 1.7

    1.2.23

    MKE-9992

    The kubernetes apiserver --service-account-lookup argument is set explicitly to true.

    Kubernetes 1.7

    1.2.31, 4.2.12

    MKE-9978

    The hardening setting use_strong_tls_ciphers allows for limiting the list of accepted ciphers for cipher_suites_for_kube_api_server, cipher_suites_for_kubelet, and cipher_suites_for_etcd_server to the ciphers considered to be strong.

    Kubernetes 1.7

    1.3.1

    MKE-9990

    MKE now supports the kube_manager_terminated_pod_gc_threshold configuration parameter. Using this parameter, users can set the threshold for the terminated Pod garbage collector in Kube Controller Manager according to their cluster-specific requirement.
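
    A configuration sketch follows. The [cluster_config] placement is assumed from the other TOML examples in these release notes, and the value shown, which matches the upstream Kubernetes default of 12500, is illustrative only.

    [cluster_config]
        kube_manager_terminated_pod_gc_threshold = 12500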

    Kubernetes 1.7

    2.7

    MKE-10012

    A separate unique certificate authority is now in place for etcd, with MKE components using certificates issued by it to connect to etcd. In line with this new CA:

    • A new internal 12392 TCP port requirement is necessary for manager nodes.

    • Admin client bundles now include etcd_cert.pem and etcd_key.pem to connect directly to etcd. The ca.pem file includes etcd CA in addition to the Cluster and Client CAs.

    • The MKE CLI ca command now supports the rotation of the etcd root CA.

      Note

      Users upgrading to MKE 3.7.2 must rotate the etcd CA after doing so, to ensure the uniqueness of the etcd CA. For more information, refer to MKE etcd Root CA.

    Kubernetes 1.7

    4.1.3

    MKE-9911

    File permissions for the kubeconfig file are set to 600.

    Kubernetes 1.7

    4.1.5

    MKE-9910

    File permissions for the kubelet.conf file are set to 600 or fewer.

    Kubernetes 1.7

    4.1.9

    MKE-9912

    File permissions for the kubelet_daemon.conf file are restricted to 600 or fewer.

    Kubernetes 1.7

    5.1.1

    MKE-10138

    To ensure that the cluster-admin role is only used when necessary, a special ucp-metrics cluster role that has only the necessary permissions is now used by Prometheus.

    Kubernetes 1.7

    5.1.2

    MKE-10197, MKE-10114

    Multus and Calico service accounts no longer use the cluster-admin role, and as such no longer require access to secrets.

    Kubernetes 1.7

    5.1.3

    MKE-10117

    Replaced the Multus CNI template wildcards with the exact resources needed for the CNI Roles and ClusterRoles.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must defer upgrading until a version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider is available.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.1

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, the version configured for MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.1, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date

Name

Highlights

2023-SEPT-26

MKE 3.7.1

Patch release for MKE 3.7 introducing the following enhancements:

  • Support bundle metrics additions for new MKE 3.7 features

  • Added ability to filter organizations by name in MKE web UI

  • Increased Docker and Kubernetes CIS benchmark compliance

  • MetalLB supports MKE-specific loglevel

  • Improved Kubernetes role creation error handling in MKE web UI

  • Increased SAML proxy feedback detail

  • Upgrade verifies that cluster nodes have minimum required MCR

  • kube-proxy now binds only to localhost

  • Enablement of read-only rootfs for specific containers

  • Support for cgroup v2

  • Added MKE web UI capability to add OS constraints to swarm services

  • Added ability to set support bundle collection windows

  • Added ability to set line limit of log files in support bundles

  • Addition of search function to Grants > Swarm in MKE web UI

Enhancements

Detail on the new features and enhancements introduced in MKE 3.7.1 includes:

[MKE-10118] Support bundle metrics additions for new MKE 3.7 features

Metrics are now available in the support bundle for the following features, introduced in MKE 3.7.0:

[MKE-10106] Added ability to filter organizations by name in MKE web UI

A new admin function in the MKE web UI at Access Control > Orgs & Teams provides the ability to filter organizations by name.

[MKE-10069] Increased Docker and Kubernetes CIS benchmark compliance

Various containers have been moved to use readonly rootfs for the purpose of increasing compliance to Docker and Kubernetes CIS benchmarks.

[MKE-10045] MetalLB supports MKE-specific loglevel

MetalLB now supports MKE-specific loglevel.

[MKE-10041] Improved Kubernetes role creation error handling in MKE web UI

Attempts made in the MKE web UI to create Kubernetes roles with an invalid namespace now return a more accurate error message.

[MKE-10020] Increased SAML proxy feedback detail

SAML proxy configuration now provides more detailed feedback.

[MKE-9994] Upgrade verifies that cluster nodes have minimum required MCR

Before proceeding with upgrade, MKE now verifies that all cluster nodes are running the minimum required MCR version.

[MKE-9991] kube-proxy now binds only to localhost

To improve MKE security posture, the kube-proxy container health check now binds to localhost only, rather than to all interfaces.

[MKE-9653] Enablement of read-only rootfs for specific containers

To improve the MKE security posture, Calico kube-controller and several other ancillary containers now run with read-only rootfs enabled.

[MKE-9362] Support for cgroup v2

MKE now supports cgroup v2.

[MKE-9105] Added MKE web UI capability to add OS constraints to swarm services

A new helper in the MKE web UI allows users to add OS constraints to swarm services by selecting the OS from a dropdown. Thereafter, the necessary constraints are automatically applied.

[FIELD-6026] Added ability to set support bundle collection windows

Added an MKE web UI function that allows users to specify the time period in which MKE support bundle data collection is to take place.

[FIELD-6024] Added ability to set line limit of log files in support bundles

In the MKE web UI, the Download support bundle dialog that opens when you navigate to <user name> and click Support Bundle now has a control that you can use to set a line limit for log files. The valid line limit range for the Log lines limit control is from 1 to 999999.

[FIELD-5936] Addition of search function to Grants > Swarm in MKE web UI

MKE web UI users can now use a Search function in Grants > Swarm to filter the list of Swarm grants.

Addressed issues

Issues addressed in the MKE 3.7.1 release include:

  • [FIELD-6318] Fixed an issue wherein MKE authorization decisions for OIDC tokens were at times inconsistent across manager nodes.

  • [MKE-10028] Fixed an issue wherein the CoreDNS Pod became stuck following a restore from an MKE backup.

  • [MKE-10022] Fixed an issue wherein clientSecret was not returned in GET TOML requests. Now it returns as <redacted>, indicating that it is set. In addition, the redacted clientSecret value returned by GET TOML requests now functions correctly when reused in PUT TOML requests.

  • [MKE-10021] Fixed an issue wherein whenever a user fixed a malformed SAML proxy setting, error messages would continue to propagate.

  • [MKE-9508] Fixed an issue wherein the PUT request to /api/ucp/config-toml was missing the text field needed to provide the MKE configuration file in the live API.

  • [FIELD-6335] Fixed an issue wherein MKE nodes at times showed Pending status, which is a false positive.

Known issues

MKE 3.7.1 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can rollback on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

The use of Windows ServerCore with Containers images will prevent kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[FIELD-6785] Reinstallation can fail following cluster CA rotation

If MKE 3.7.x is uninstalled soon after rotating the cluster CA, re-installing MKE 3.7.x or 3.6.x on an existing Docker swarm can fail with the following error message:

unable to sign cert: {\"code\":1000,\"message\":\"x509: provided PrivateKey doesn't match parent's PublicKey\"}"

Workaround:

  1. Forcefully trigger swarm snapshot:

    old_val=$(docker info --format '{{.Swarm.Cluster.Spec.Raft.SnapshotInterval}}')
    docker swarm update --snapshot-interval 1
    docker swarm update --snapshot-interval ${old_val}
    
  2. Reattempt to install MKE.

[FIELD-6402] Default metric collection memory settings may be insufficient

In MKE 3.7, ucp-metrics collects more metrics than in previous versions of MKE. As such, for large clusters with many nodes, the following ucp-metrics component default settings may be insufficient:

  • memory request: 1Gi

  • memory limit: 2Gi

Workaround:

Administrators can modify the MKE configuration file to increase the default memory request and memory limit setting values for the ucp-metrics component, as illustrated in the example following this list. The settings to configure are both under the cluster section:

  • For memory request, modify the prometheus_memory_request setting

  • For memory limit, modify the prometheus_memory_limit setting
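
The following sketch illustrates the change, assuming the section is named cluster_config, consistent with the other configuration excerpts in these release notes, and using placeholder values only; size the request and limit according to your cluster:

    [cluster_config]
        # Placeholder values only; adjust for your cluster size
        prometheus_memory_request = "2Gi"
        prometheus_memory_limit = "4Gi"

The updated file can then be applied through the MKE configuration API, for example with a PUT request to the /api/ucp/config-toml endpoint referenced earlier in these release notes.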

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.1 release.

Security information

  • Updated the following middleware component versions to resolve vulnerabilities in MKE:

    • [MKE-10159] NGINX Ingress Controller 1.8.2

    • [FIELD-6356] AlertManager 0.26.0

    • [MKE-10050] CoreDNS 1.11.0

  • Mirantis has begun an initiative to align MKE with CIS Benchmarks, where pertinent. The following entries detail the CIS Benchmark resolutions and improvements introduced in MKE 3.7.1, listing for each the CIS Benchmark type and version, the recommendation designation, the associated ticket, and the resolution or improvement:

    Docker 1.6, recommendation 4.9 (MKE-9960)

    The MKE Dockerfiles were improved so that they no longer contain ADD instructions; only COPY is used.

    Kubernetes 1.7, recommendation 1.1.17 (MKE-9906)

    The permission for /ucp-volume-mounts/ucp-node-certs/controller-manager.conf is now set to 600.

    Kubernetes 1.7, recommendation 1.2.9 (MKE-10149)

    Support for the EventRateLimit admission controller has been added to MKE. By default, the admission controller remains disabled; however, it can be enabled through the TOML configuration, as exemplified below:

    [cluster_config.k8s_event_rate_limit]
        event_rate_limit_ac_enabled = true
    
        [[cluster_config.k8s_event_rate_limit.limits]]
          limit = "Namespace"
          limit_qps = 1
          limit_burst = 1
          limit_cache_size = 16
    
        [[cluster_config.k8s_event_rate_limit.limits]]
          limit = "User"
          limit_qps = 1
          limit_burst = 1
          limit_cache_size = 16
    

    MKE does not validate the individual values specified for each limit, except to apply a default value of 4096 for limit_cache_size when no value is provided.

    Refer to the Kubernetes documentation Admission Controllers Reference: EventRateLimit. Note that limit types are adhered to strictly, including case matching.

    Important

    Ensure that you validate your configuration on a test cluster before applying it in production, as a misconfigured admission controller can make kube-apiserver unavailable for the cluster.

    Kubernetes 1.7, recommendation 1.3.7 (MKE-9904)

    The --bind-address argument is set to 127.0.0.1 in ucp-kube-controller-manager.

    Kubernetes 1.7, recommendation 4.1.8 (MKE-10011, MKE-9917)

    The kubelet Client Certificate Authority file ownership is now root:root, changed from its previous nobody:nogroup setting.

    Kubernetes 1.7, recommendation 4.2.5 (MKE-9913)

    The kubelet streamingConnectionIdleTimeout argument is set explicitly to 4h.

    Kubernetes 1.7, recommendation 4.2.6 (MKE-9914)

    The kubelet make-iptables-util-chains argument is set explicitly to true.

    Kubernetes 1.7, recommendation 4.2.8 (MKE-10006)

    The kubelet_event_record_qps parameter can now be configured in the MKE configuration file, as exemplified below:

    [cluster_config]
        kubelet_event_record_qps = 50
    

    Kubernetes 1.7, recommendation 5.1.5 (MKE-10005)

    The MKE install process now sets default service accounts in control plane namespaces to specifically not automount service account tokens.

    Kubernetes 1.7, recommendation 5.1.6 (MKE-9921)

    The use of service account tokens is restricted, allowing for mounting only where necessary in MKE system namespaces.

    Kubernetes 1.7, recommendation 5.2.2 (MKE-9923)

    Work was done to minimize the admission of privileged containers.

    Kubernetes 1.7, recommendation 5.2.8 (MKE-9924)

    NET_RAW capability has been removed from all unprivileged system containers.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, which is the version that ships with MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must defer upgrade to a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

3.7.0

Caution

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, which is the version that ships with MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must upgrade to MKE 3.7.12 or later, as these versions support a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Release date: 2023-AUG-31

Name: MKE 3.7.0

Highlights: Initial MKE 3.7.0 release, introducing the following key features and enhancements:

  • ZeroOps: certificate management

  • ZeroOps: upgrade rollback

  • ZeroOps: metrics

  • Prometheus memory resources

  • etcd event cleanup

  • Ingress startup options: TLS, TCP/UDP, HTTP/HTTPS

  • Additional NGINX Ingress Controller options

  • Setting for NGINX Ingress Controller default ports

  • MetalLB

  • Lameduck configuration options

  • Multus CNI

  • SAML proxy

  • Addition of referral chasing LDAP parameter

  • Kubernetes update to version 1.27.4

  • Go update to version 1.20.5.

  • RethinkDB update to version 2.4.3

New features

Detail on the new features and enhancements introduced in MKE 3.7.0 includes:

ZeroOps: certificate management

MKE offers the ability to manage the two root certificate authorities: MKE Cluster Root CA and MKE Client Root CA.

ZeroOps: upgrade rollback

If your MKE upgrade fails, you can roll back to the previously running MKE version without rebuilding your cluster from a backup.

Learn more

Perform the upgrade

ZeroOps: metrics

MKE exposes Prometheus metrics associated with the following core components and functionality:

  • Kube State Metrics

  • Kubernetes Workqueue

  • Kubelet

  • Kube Proxy

  • Kube Controller Manager

  • Kube API Server

  • Calico

  • RethinkDB

Prometheus memory resources

Added MKE configuration file options for the minimum and maximum amount of memory that can be used by the Prometheus container.

etcd event cleanup

Manually clean up Kubernetes event objects in etcd using the MKE API.

TLS passthrough

Use TLS passthrough to pass un-decrypted data through the NGINX Ingress Controller to your web server.

TCP and UDP services

Expose TCP and UDP services using NGINX Ingress Controller.

Additional NGINX Ingress Controller options

Added the following NGINX Ingress Controller options to the MKE configuration file (an illustrative excerpt follows this list):

  • ingress_extra_args.http_port: Sets the container port for servicing HTTP traffic.

  • ingress_extra_args.https_port: Sets the container port for servicing HTTPS traffic.

  • ingress_extra_args.enable_ssl_passthrough: Enables SSL passthrough.

  • ingress_extra_args.default_ssl_certificate: Sets the Secret that contains an SSL certificate to be used as the default HTTPS server.
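
The excerpt below is an illustrative sketch only: it uses the dotted option names exactly as listed above with placeholder values, and it omits the enclosing ingress controller section, so place the keys under the section prescribed by the MKE configuration file reference:

    # Placeholder values; place these keys under the ingress controller
    # section defined in the MKE configuration file reference
    ingress_extra_args.http_port = 8080
    ingress_extra_args.https_port = 8443
    ingress_extra_args.enable_ssl_passthrough = true
    ingress_extra_args.default_ssl_certificate = "ingress-nginx/default-ssl-cert"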

Setting for NGINX Ingress Controller default ports

The NGINX Ingress Controller default ports can be changed in the MKE web UI Admin Settings.

MetalLB

Bare metal Kubernetes clusters can leverage MetalLB to create Load Balancer services, offering features such as address allocation and external announcement.

Learn more

Deploy MetalLB

Lameduck configuration options

The MKE configuration file includes options to enable and disable lameduck in CoreDNS.

Multus CNI

MKE provides the option to use Multus CNI, a Kubernetes plugin that enables the attachment of multiple network interfaces to multi-homed Pods.

SAML proxy

Use a SAML proxy to secure your MKE deployment while benefiting from the use of SAML authentication.

Learn more

Set up SAML proxy

Addition of referral chasing LDAP parameter

Added the option to toggle Enable referral chasing in the LDAP configuration settings using the MKE web UI.

Enhancements

Detail on the enhancements introduced in MKE 3.7.0 includes:

Kubernetes 1.27.4

Updated Kubernetes to version 1.27.4.

Go 1.20.5

Updated Go to version 1.20.5.

RethinkDB 2.4.3

Updated RethinkDB to version 2.4.3.

Addressed issues

MKE 3.7.0 is the initial release of the 3.7 series, and as such it has no legacy issues that required resolution.

Known issues

MKE 3.7.0 known issues with available workaround solutions include:

[MKE-10152] Upgrading large Windows clusters can initiate a rollback

Upgrades can roll back on a cluster with a large number of Windows worker nodes.

Workaround:

Invoke the --manual-worker-upgrade option and then manually upgrade the workers.

[MKE-9699] Ingress Controller with external load balancer can enter crashloop

Due to the upstream Kubernetes issue 73140, rapid toggling of the Ingress Controller with an external load balancer in use can cause the resource to become stuck in a crashloop.

Workaround:

  1. Log in to the MKE web UI as an administrator.

  2. In the left-side navigation panel, navigate to <user name> > Admin Settings > Ingress.

  3. Click the Kubernetes tab to display the HTTP Ingress Controller for Kubernetes pane.

  4. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the left to disable the Ingress Controller.

  5. Use the CLI to delete the Ingress Controller resources:

    kubectl delete service ingress-nginx-controller-admission --namespace ingress-nginx
    kubectl delete deployment ingress-nginx-controller --namespace ingress-nginx
    
  6. Verify the successful deletion of the resources:

    kubectl get all --namespace ingress-nginx
    

    Example output:

    No resources found in ingress-nginx namespace.
    
  7. Return to the HTTP Ingress Controller for Kubernetes pane in the MKE web UI and change the nodeport numbers for HTTP Port, HTTPS Port and TCP Port.

  8. Toggle the HTTP Ingress Controller for Kubernetes enabled control to the right to re-enable the Ingress Controller.

[MKE-9358] cgroup v2 (unsupported) is enabled in RHEL 9.0 by default

As MKE does not support cgroup v2 on Linux platforms, RHEL 9.0 users cannot use the software because cgroup v2 is enabled by default.

As a workaround, RHEL 9.0 users must disable cgroup v2.

[MKE-8914] Windows Server Core with Containers images incompatible with GCP

Using Windows Server Core with Containers images prevents kubelet from starting up, as these images are not compatible with GCP.

As a workaround, use Windows Server or Windows Server Core images.

[MKE-8814] Mismatched MTU values cause Swarm overlay network issues on GCP

Communication between GCP VPCs and Docker networks that use Swarm overlay networks will fail if their MTU values are not manually aligned. By default, the MTU value for GCP VPCs is 1460, while the default MTU value for Docker networks is 1500.

Workaround:

Select from the following options:

  • Create a new VPC and set the MTU value to 1500.

  • Set the MTU value of the existing VPC to 1500.

For more information, refer to the Google Cloud Platform documentation, Change the MTU setting of a VPC network.

[FIELD-6785] Reinstallation can fail following cluster CA rotation

If MKE 3.7.x is uninstalled soon after rotating the cluster CA, reinstalling MKE 3.7.x or 3.6.x on an existing Docker Swarm can fail with the following error message:

unable to sign cert: {"code":1000,"message":"x509: provided PrivateKey doesn't match parent's PublicKey"}

Workaround:

  1. Forcefully trigger a Swarm snapshot:

    # Record the current snapshot interval, force an immediate snapshot by
    # setting the interval to 1, then restore the original value
    old_val=$(docker info --format '{{.Swarm.Cluster.Spec.Raft.SnapshotInterval}}')
    docker swarm update --snapshot-interval 1
    docker swarm update --snapshot-interval ${old_val}
    
  2. Reattempt to install MKE.

[FIELD-6402] Default metric collection memory settings may be insufficient

In MKE 3.7, ucp-metrics collects more metrics than in previous versions of MKE. As such, for large clusters with many nodes, the following ucp-metrics component default settings may be insufficient:

  • memory request: 1Gi

  • memory limit: 2Gi

Workaround:

Administrators can modify the MKE configuration file to increase the default memory request and memory limit setting values for the ucp-metrics component. The settings to configure are both under the cluster section:

  • For memory request, modify the prometheus_memory_request setting

  • For memory limit, modify the prometheus_memory_limit setting

Major component versions

The following table presents the versioning information for the major middleware components included in the MKE 3.7.0 release.

Security information

  • Updated the following middleware component versions to resolve vulnerabilities in MKE:

    • [MKE-10053] NGINX Ingress Controller 1.8.1

    • [MKE-10052, MKE-10051] Go 1.20.5

    • [FIELD-6263] Prometheus 2.45.0, Alert Manager 0.25.0-f2f4110

  • [MKE-10057] Updated the MKE build process to resolve vulnerabilities reported in ucp-interlock, ucp-interlock-extension, and ucp-swarm.

  • [MKE-10056] Updated the MKE build process to resolve vulnerabilities reported in ucp-nvidia-device-plugin.

  • [MKE-10049] Resolved vulnerabilities reported in the ucp-containerd-shim-process image.

  • [MKE-10089] Updated the Alpine 3.17 image to resolve a vulnerability.

Deprecations

Upstream Kubernetes has removed the in-tree AWS cloud provider. Kubernetes 1.27.4, which is the version that ships with MKE 3.7.0, does not support the AWS in-tree cloud provider. As such, if your MKE cluster is using the AWS in-tree cloud provider, you must defer upgrade to a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider.

If you attempt to upgrade a cluster that uses AWS in-tree cloud provider to MKE 3.7.0, the upgrade will fail and you will receive the following error message:

Your MKE cluster is currently using the AWS in-tree cloud provider, which
Kubernetes no longer supports. Please defer upgrading to MKE 3.7 until a
version that supports migration to an alternative external AWS cloud
provider is released.

Deprecation notes

Taking into account continuous reorganization and enhancement of Mirantis Kubernetes Engine (MKE), certain components are deprecated and eventually removed from the product. This section provides the following details about the deprecated and removed functionality that may potentially impact existing MKE deployments:

  • The MKE release version in which deprecation is announced

  • The final MKE release version in which a deprecated component is present

  • The MKE release version in which a deprecated component is removed

MKE deprecated and removed functionality

  • In-tree AWS cloud provider: deprecated in 3.7.0; final release 3.6.x; removed in 3.7.0. Kubernetes 1.27.4, which is the version that ships with MKE 3.7.0, does not support the AWS in-tree cloud provider. Thus, defer upgrade to a later version of MKE 3.7 that supports a transition pathway to an alternative external AWS cloud provider.

  • Custom log drivers: deprecated in 3.6.0; final release 3.6.0; removed in 3.6.0. Removed due to the Dockershim deprecation from Kubernetes.

  • Pod Security Policy (PSP) support: deprecated in 3.5.0; final release and removal versions under review 1. PSP functionality is deprecated in Kubernetes 1.21.

  • FlexVolume drivers (iSCSI, SMB): deprecated in 3.6.0; final release and removal versions under review 1. The CSI plugins that remain available for use are detailed in Use CSI drivers.

1 The target MKE release version is under review and will be announced separately.

Release Compatibility Matrix

MKE 3.7 Compatibility Matrix

Mirantis Kubernetes Engine (MKE, formerly Docker Enterprise/UCP) provides enterprises with the easiest and fastest way to deploy cloud native applications at scale in any environment.

Support for MKE is defined in the Mirantis Cloud Native Platform Subscription Services agreement.

Operating system compatibility

Because MKE functionality depends on MCR, MKE operating system compatibility is contingent on the operating system compatibility of the MCR versions with which a particular MKE version is compatible.

For each MKE version, the required MCR versions, the supported client API version, and the Kubernetes version provided are as follows:

  • MKE 3.7.16: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.16

  • MKE 3.7.15: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.16

  • MKE 3.7.14: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.16

  • MKE 3.7.13: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.14

  • MKE 3.7.12: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.14

  • MKE 3.7.11: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.14

  • MKE 3.7.10: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.10

  • MKE 3.7.9: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.10

  • MKE 3.7.8: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.10

  • MKE 3.7.7: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.10

  • MKE 3.7.6: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.10

  • MKE 3.7.5: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.7

  • MKE 3.7.4 1: Not applicable

  • MKE 3.7.3: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.7

  • MKE 3.7.2: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.7

  • MKE 3.7.1: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.4

  • MKE 3.7.0: MCR 23.0.1, 23.0.3, and 23.0.5 through 23.0.15; Client API 1.41; Kubernetes 1.27.4

1 MKE 3.7.4 was discontinued shortly after release due to issues encountered when upgrading to it from previous versions of the product.

Important

RHEL 9, Rocky Linux 9, Oracle Linux 9, and Ubuntu 22.04 all default to cgroup v2. MKE 3.7.0 only supports cgroup v1, whereas all later versions support cgroup v2. Thus, if you are running any of the aforementioned OS versions, you must either upgrade to MKE 3.7.1 or later or downgrade to cgroup v1.

Kubernetes Volume Drivers

  • NFS v4 via Kubernetes e2e suite

  • AWS EFS

  • AWS EBS

  • Azure Disk

  • Azure File

MKE and MSR Browser compatibility

The Mirantis Kubernetes Engine (MKE) and Mirantis Secure Registry (MSR) web user interfaces (UIs) both run in the browser, separate from any backend software. As such, Mirantis aims to support browsers separately from the backend software in use.

Mirantis currently supports the following web browsers:

  • Google Chrome 96.0.4664 or newer (released 15 November 2021): MacOS, Windows

  • Microsoft Edge 95.0.1020 or newer (released 21 October 2021): Windows only

  • Firefox 94.0 or newer (released 2 November 2021): MacOS, Windows

To ensure the best user experience, Mirantis recommends that you use the latest version of any of the supported browsers. The use of other browsers or older versions of the browsers we support can result in rendering issues, and can even lead to glitches and crashes in the event that some JavaScript language features or browser web APIs are not supported.

Important

Mirantis does not tie browser support to any particular MKE or MSR software release.

Mirantis strives to leverage the latest in browser technology to build more performant client software, as well as ensuring that our customers benefit from the latest browser security updates. To this end, our strategy is to regularly move our supported browser versions forward, while also lagging behind the latest releases by approximately one year to give our customers a sufficient upgrade buffer.

MKE, MSR, and MCR Maintenance Lifecycle

The MKE, MSR, and MCR platform subscription provides software, support, and certification to enterprise development and IT teams that build and manage critical apps in production at scale. It provides a trusted platform for all apps, supplying integrated management and security across the app lifecycle, and is composed primarily of Mirantis Kubernetes Engine (MKE), Mirantis Secure Registry (MSR), and Mirantis Container Runtime (MCR).

Mirantis validates the MKE, MSR, and MCR platform for the operating system environments specified in the mcr-23.0-compatibility-matrix, adhering to the Maintenance Lifecycle detailed here. Support for the MKE, MSR, and MCR platform is defined in the Mirantis Cloud Native Platform Subscription Services agreement.

Detailed here are all currently supported product versions, as well as the product versions most recently deprecated. It can be assumed that all earlier product versions are at End of Life (EOL).

Important Definitions

  • “Major Releases” (X.y.z): Vehicles for delivering major and minor feature development and enhancements to existing features. They incorporate all applicable Error corrections made in prior Major Releases, Minor Releases, and Maintenance Releases.

  • “Minor Releases” (x.Y.z): Vehicles for delivering minor feature developments, enhancements to existing features, and defect corrections. They incorporate all applicable Error corrections made in prior Minor Releases, and Maintenance Releases.

  • “Maintenance Releases” (x.y.Z): Vehicles for delivering Error corrections that are severely affecting a number of customers and cannot wait for the next major or minor release. They incorporate all applicable defect corrections made in prior Maintenance Releases.

  • “End of Life” (EOL): Versions that are no longer supported by Mirantis; updating to a later version is recommended.

Support lifecycle

  • GA to 12 months: Full support

  • 12 to 18 months: Full support 1

  • 18 to 24 months: Limited Support for existing installations 2

1 Software patches for critical bugs and security issues only; no feature enablement.

2 Software patches for critical security issues only.

Mirantis Kubernetes Engine (MKE)

3.7.z

  • General Availability (GA): 2023-AUG-30 (3.7.0)

  • End of Life (EOL): 2025-AUG-29

  • Release frequency: x.y.Z every 6 weeks

  • Patch release content (as needed): maintenance releases, security patches, custom hotfixes

  • Supported lifespan: 2 years 1

1 Refer to the Support lifecycle table for details.

EOL MKE Versions

  • 2.0.z: 2017-AUG-16

  • 2.1.z: 2018-FEB-07

  • 2.2.z: 2019-NOV-01

  • 3.0.z: 2020-APR-16

  • 3.1.z: 2020-NOV-06

  • 3.2.z: 2021-JUL-21

  • 3.3.z: 2022-MAY-27

  • 3.4.z: 2023-APR-11

  • 3.5.z: 2023-NOV-22

  • 3.6.z: 2024-OCT-13

Mirantis Secure Registry (MSR)

2.9.z

  • General Availability (GA): 2021-APR-12 (2.9.0)

  • End of Life (EOL): 2025-SEP-27

  • Release frequency: x.y.Z every 6 weeks

  • Patch release content (as needed): maintenance releases, security patches, custom hotfixes

  • Supported lifespan: 2 years 1

3.1.z

  • General Availability (GA): 2023-SEP-28 (3.1.0)

  • End of Life (EOL): 2025-SEP-27

  • Release frequency: x.y.Z every 6 weeks

  • Patch release content (as needed): maintenance releases, security patches, custom hotfixes

  • Supported lifespan: 2 years 1

1 Refer to the Support lifecycle table for details.

EOL MSR Versions

  • 2.1.z: 2017-AUG-16

  • 2.2.z: 2018-FEB-07

  • 2.3.z: 2019-FEB-15

  • 2.4.z: 2019-NOV-01

  • 2.5.z: 2020-APR-16

  • 2.6.z: 2020-NOV-06

  • 2.7.z: 2021-JUL-21

  • 2.8.z: 2022-MAY-27

  • 3.0.z: 2024-APR-20

Mirantis Container Runtime (MCR)

Enterprise 23.0

  • General Availability (GA): 2023-FEB-23 (23.0.1)

  • End of Life (EOL): 2025-MAY-19

  • Release frequency: x.y.Z every 6 weeks

  • Patch release content (as needed): maintenance releases, security patches, custom hotfixes

  • Supported lifespan: 2 years 1

1 Refer to the Support lifecycle table for details.

EOL MCR Versions

  • CSE 1.11.z: 2017-MAR-02

  • CSE 1.12.z: 2017-NOV-14

  • CSE 1.13.z: 2018-FEB-07

  • EE 17.03.z: 2018-MAR-01

  • Docker Engine - Enterprise v17.06: 2020-APR-16

  • Docker Engine - Enterprise 18.03: 2020-JUN-16

  • Docker Engine - Enterprise 18.09: 2020-NOV-06

  • Docker Engine - Enterprise 19.03: 2021-JUL-21

  • MCR 19.03.8+: 2022-MAY-27

  • MCR 20.10.0+: 2023-DEC-10

Release Cadence and Support Lifecycle

With the intent of improving the customer experience, Mirantis strives to offer maintenance releases for the Mirantis Kubernetes Engine (MKE) software every six to eight weeks. Primarily, these maintenance releases will aim to resolve known issues and issues reported by customers, quash CVEs, and reduce technical debt. The version of each MKE maintenance release is reflected in the third digit position of the version number (as an example, for MKE 3.7 the most current maintenance release is MKE 3.7.16).

In parallel with our MKE maintenance release work, each year Mirantis will develop and release a new major version of MKE, the Mirantis support lifespan of which will adhere to our standard two-year lifecycle.

End of Life Date

The End of Life (EOL) date for MKE 3.6 is 2024-OCT-13.

For more information on MKE version lifecycles, refer to the MKE, MSR, and MCR Maintenance Lifecycle.

The MKE team will make every effort to hold to the release cadence stated here. Customers should be aware, though, that development and release cycles can change without advance notice.

Technology Preview features

A Technology Preview feature provides early access to upcoming product innovations, allowing customers to experiment with the functionality and provide feedback.

Technology Preview features may be privately or publicly available; in neither case are they intended for production use. While Mirantis will provide assistance with such features through official channels, normal Service Level Agreements do not apply.

As Mirantis considers making future iterations of Technology Preview features generally available, we will do our best to resolve any issues that customers experience when using these features.

During the development of a Technology Preview feature, additional components may become available to the public for evaluation. Mirantis cannot guarantee the stability of such features. As a result, if you are using Technology Preview features, you may not be able to seamlessly upgrade to subsequent product releases.

Mirantis makes no guarantees that Technology Preview features will graduate to generally available features.

Open Source Components and Licenses

Click any product component license below to download a text file of that license to your local system.