3.3.15

(2022-02-10)

Components

Component	Version
MKE	3.3.15
Kubernetes	1.18.19
Calico	3.18.4
Calico for Windows	3.18.4
Interlock	3.3.2
Interlock NGINX proxy	1.21.1
Istio Ingress	1.4.10
CoreDNS	1.7.0
RethinkDB	2.3.6
etcd	3.4.3
CSI Attacher	2.1.1
CSI Provisioner	1.4.0
CSI Snapshotter	1.2.2
CSI Resizer	0.4.0
CSI Node Driver Registrar	1.2.0
CSI Liveness Probe	1.1.0
OpenStack Cinder CSI plugin	1.20.3

Bug fixes

[MKE-8678] Fixed an issue with the MKE web UI wherein having one or more nodes in a list without metric values caused sorting to fail.
[FIELD-4482] Fixed an issue wherein MKE did not fetch the SAML identity provider certificates following their rotation. Disabling and re-enabling SAML now causes MKE to fetch the new metadata, which includes the identity provider certificates.
[FIELD-4437] Fixed an issue with Kubernetes on Windows Server environments wherein the network failed following a reboot of the manager node.

Known issues

[FIELD-4200] The calico-node firewalld-policy init container can disable the docker ingress routing mesh when reloading firewalld.

Workaround:
1. Prevent the issue from recurring by disabling firewalld:
```
sudo systemctl disable --now firewalld
```
2. Restore missing iptables chains by restarting dockerd:
```
sudo systemctl restart docker
```
   Note: Restarting dockerd stops all containers on the corresponding node. The node capacity will not be available to the cluster until the node returns to a healthy state in MKE. You must restart dockerd on manager nodes one node at a time, confirming the health of each one in MKE before moving on to the next.
3. Confirm issue resolution by checking for the presence of the DOCKER-INGRESS iptables chain:
```
sudo iptables --list DOCKER-INGRESS
```
  Expected output:
```
Chain DOCKER-INGRESS (2 references)
target     prot opt source               destination
[...]
```
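
Step 3 can also be scripted, for example when checking several nodes in turn. The following is a minimal sketch and not part of the official workaround: `ingress_chain_present` is a hypothetical helper name, and the sketch assumes that `iptables --list` prints the `Chain DOCKER-INGRESS (...)` header from the expected output as its first line.

```shell
#!/bin/sh
# Hypothetical helper (not an MKE or Docker tool): reads `iptables --list`
# output on stdin and succeeds only if the first line is the
# DOCKER-INGRESS chain header.
ingress_chain_present() {
    head -n 1 | grep -q '^Chain DOCKER-INGRESS '
}

# Example use on a node (needs root); left as a comment so that sourcing
# this file has no side effects:
#   sudo iptables --list DOCKER-INGRESS 2>/dev/null | ingress_chain_present \
#       && echo "DOCKER-INGRESS present" \
#       || echo "DOCKER-INGRESS missing, restart dockerd as in step 2"
```

The helper only inspects text, so it returns a non-zero status whenever the piped output is empty or starts with a different chain name.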

[MKE-8538] CLI-based support dumps are unavailable on Windows worker nodes.

Workaround:

For Swarm-orchestrated Windows nodes, use the MKE web UI to obtain a support bundle. For Kubernetes-orchestrated Windows nodes, you must manually collect the logs.

[MKE-8738] Windows Kubernetes worker nodes can fail on long-haul runs, with a DiskPressure error that is similar to the following:

time="2022-02-08T17:20:30Z" level=warning msg="error while removing container: failed to unprepare layer C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\3707: hcsshim::UnprepareLayer - failed failed in Win32: The system could not find the instance specified. (0x801f0015): unknown"
time="2022-02-08T17:20:30Z" level=fatal msg="failed to cleanup old containers: failed to unprepare layer C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\3707: hcsshim::UnprepareLayer - failed failed in Win32: The system could not find the instance specified. (0x801f0015): unknown"

Workaround:

Identify the stopped task:

C:\Users\Docker>ctr -n com.docker.ucp task ls

Example output:

TASK                  PID      STATUS
ucp-tigera-felix      12012    RUNNING
ucp-kube-proxy        7912     RUNNING
ucp-kubelet-health    26616    STOPPED
ucp-tigera-node       3236     RUNNING

Identify the containerd-shim process that is associated with the stopped task:

Get-CimInstance -ClassName Win32_Process -Filter "Name like 'containerd-shim%'" |
    Select-Object ProcessId, CommandLine | Format-List

Stop the containerd-shim process that is associated with the stopped task:
```
Stop-Process -Id <containerd-shim-pid> -Confirm -PassThru -Force
```
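
On a node with many tasks, the stopped task from the first step can be picked out of the listing mechanically. The following is a rough sketch under stated assumptions, not part of the documented workaround: `stopped_tasks` is a hypothetical helper name, the listing is assumed to have the three-column TASK/PID/STATUS layout shown in the example output, and a POSIX shell with awk is assumed to be available (for example Git Bash on the node, or a workstation to which the listing was copied).

```shell
#!/bin/sh
# Hypothetical helper: reads `ctr -n com.docker.ucp task ls` output on stdin
# and prints the name of every task whose STATUS column is STOPPED.
stopped_tasks() {
    # Strip a trailing carriage return (Windows line endings), skip the
    # header row, then match on the third whitespace-separated column.
    awk '{ sub(/\r$/, "") } NR > 1 && $3 == "STOPPED" { print $1 }'
}

# Example use, with the listing saved to a file named tasks.txt (assumed name):
#   stopped_tasks < tasks.txt
```

Fed the example output shown earlier, the helper prints `ucp-kubelet-health`.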